Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build: use new Docker images from design document #8453

Merged
merged 42 commits into from
Sep 27, 2021

Conversation

humitos
Copy link
Member

@humitos humitos commented Aug 31, 2021

Minimal implementation for POC of #8447

It uses the new config file (#8478) to select the new build.os image and install Python versions via asdf (readthedocs/readthedocs-docker-images#166) using build.tools config option.

MinIO requires a new bucket called build-tools with a pre-compiled versions to work (compiled with ./scripts/compile_version_upload_s3.sh). If a version that's not pre-compiled is selected it will be downloaded from official mirrors, installed and used.

Build times on latest version for test-build:

  • using the new image + cached Python version: 112s
  • using the new image + non cached Python version: 288s
  • using old image (current production): 87s

Note that all the parsing of the Config File to support build.os and build.tools is not included in this PR on purpose. That work can be split as a separate work and done in parallel with the rest of work required here.

Testing this PR

  1. build the image from New image structure based on design document readthedocs-docker-images#166
  2. spin up your local development instance
  3. go to your MinIO instance (http://localhost:9000/minio/) and create a new bucket called build-tools
  4. run ./scripts/compile_version_upload_s3.sh python 3.8.12
  5. merge this other PR New config for new docker build images #8478 into this one
  6. build test-builds project with humitos/new-config-file branch/version

TODO

Note that not all these points are going to be done on this particular PR, but it's a good way to track what else is required

Closes #7239

@humitos humitos requested a review from a team August 31, 2021 12:34
@humitos humitos force-pushed the humitos/build-new-docker-image branch from da1e227 to c3cec8a Compare September 1, 2021 09:12
Minimal implementation for POC of
#8447

It uses a Feature flag for now as a way to select the new
`readthedocs/build:ubuntu20` image and install Python versions via `asdf`
(readthedocs/readthedocs-docker-images#166)

MinIO requires a new bucket called `languages` with a pre-compiled Python 3.9.6
version to work (*) (this version is hardcoded for now). However, if a different
version is selected it will be downloaded from official mirrors, installed and
used.

Build times on `latest` version for `test-build`:

* using the new image + cached Python version: 112s
* using the new image + non cached Python version: 288s
* using old image (current production): 87s

> Note that all the parsing of the Config File to support `build.os` and
> `build.languages` is not included in this PR on purpose. That work can be
> split as a separate work and done in parallel with the rest of work required
> here.

(*) to pre-compile a Python version:

```bash
docker run -it readthedocs/build:ubuntu20 /bin/bash

asdf install python 3.9.6
asdf global python 3.9.6
python -m pip install -U pip setuptools virtualenv
cd /home/docs/.asdf/installs/python
tar -cfvz ubuntu20-python-3.9.6.tar.gz 3.9.6

docker cp <container id>:/home/docs/.asdf/installs/python/ubuntu20-python-3.9.6.tar.gz .
```

and upload the .tar.gz file to MinIO `languages` bucket using the web interface
Copy link
Contributor

@agjohnson agjohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things look good so far! I haven't pulled this down to test however.

I did note it might be worth using a different name for languages so that it doesn't conflict with languages as in "locale languages" or "RTD project language". A more specific term wouldn't be as overloaded already.

Makes the name more specific and reduces potential confusion with other type of
languages.
@humitos humitos requested a review from a team September 20, 2021 15:59
@humitos
Copy link
Member Author

humitos commented Sep 20, 2021

Note for the future: if downloading .tar.gz on each build is a problem, we can pre-download them on builder startup instead and make them available for the Docker containers to just mv them.

@ericholscher
Copy link
Member

Note for the future: if downloading .tar.gz on each build is a problem, we can pre-download them on builder startup instead and make them available for the Docker containers to just mv them.

This is a good idea. We could also do it at image build time, to make sure it's consistent. We have a few options, but I do think having them cached on disk is going to be a win in the future, but let's not block on that for now.

stsewd added a commit that referenced this pull request Sep 21, 2021
Docs need to be updated, but I'm not doing that here, since this isn't implemented yet.

Required by #8453

Co-authored-by: Manuel Kaufmann <[email protected]>
When installing miniconda3, `pyenv-build` calls:

  conda install --yes pip

which automatically updates conda to its latest version and making us lossing
our pinned version.

This commit adds a patch to keep installing pip after conda install, but pinning
conda to the current version to avoid upgrading it automatically.

See pyenv/pyenv#2070
Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks close to ready -- I had a lot of nitpicks, but nothing super important that should block merging.

I do worry that we don't have an obvious idea of what versions will be cached at the start. Should we have a script that just contains the commands that we're planning to run on production to create cached versions of all the tools that we're planning to support?

@humitos
Copy link
Member Author

humitos commented Sep 27, 2021

I do worry that we don't have an obvious idea of what versions will be cached at the start. Should we have a script that just contains the commands that we're planning to run on production to create cached versions of all the tools that we're planning to support?

My plan is to cache all the current supported versions. That would be at most 1Gb on S3. We do have a command ./scripts/compile_version_upload_s3.sh that has to run with all the tools and versions once from production --I've already added noted this in the release.

The proposed patch was merged upstream and it's not required anymore.
See pyenv/pyenv#2074
@humitos
Copy link
Member Author

humitos commented Sep 27, 2021

Thanks @ericholscher for the review on this. I'll merge now so we can deploy tomorrow and start doing some QA. I'll catch the suggestions in another PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mamba usage to improve performance (memory & speed)
4 participants