Optimize CI/PROD image waiting and verification in CI workflow #35856
Merged
5e93483
to
fb0a155
Compare
The diagrams in https://github.com/apache/airflow/blob/optimize-image-wait-verify/CI_DIAGRAMS.md and the description in https://github.com/apache/airflow/blob/optimize-image-wait-verify/CI.rst are up to date and describe the proposed changes to the CI workflow accurately, so I recommend them to anyone who would like to dive into the details.
102e106
to
9f206c2
Compare
4b63d7c
to
70d94de
Compare
Currently both the "wait-for-ci-images" and "preview-constraints" jobs wait for images to be built, which means that each of them takes a running worker slot (public runner) just to wait while the image is being built. Also, the "verify-image" job is run as part of "wait-for-image", which adds delay between the image being downloaded and dependent jobs starting.

This PR optimizes it quite a bit:

* The preview-constraints job now depends on "wait-for-ci-images". This means that only one slot will be busy while waiting for images.
* Both the CI and PROD `verify-image` commands in breeze got the --run-in-parallel set of flags, which allow the verification to happen for all images in parallel.
* Image verification is added as a separate step in jobs that already need to pull the images to do other work. For the CI image it is "Preview constraints" and for the PROD image it is "Test Docker compose job". Because verification no longer runs as part of the "wait for image" jobs, the other jobs can start faster, and a failure in image verification does not block other tests from running.
* In the case of the "in-workflow-build", "wait-for-ci-images" does not have to be run at all, because there wait-for-ci-images depends on the in-workflow build-ci-images job. If that job completes, we know the image is already built and we do not have to wait for it separately. So far we had to run it in order to add the `--verify` flag to verify the images. With a separate job we can run it in parallel to all the other waiting jobs.
* Names and dependencies between jobs are also updated, including the CI documentation describing diagrams of how the CI workflows work. The diagrams are cleaned up, verified and updated. The separate diagram for the scheduled build has been removed, as it was essentially the same as the "canary build". A paragraph describing every type of workflow was added to give more context to the diagrams.
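The idea of verifying all images in parallel, without one failure hiding the others, can be sketched in a few lines of Python. This is only an illustration and not the breeze implementation: `verify_image`, `verify_all_in_parallel`, the list of Python versions and the simulated failure are all hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical set of images to verify, one per Python version.
PYTHON_VERSIONS = ["3.8", "3.9", "3.10", "3.11"]


def verify_image(python_version: str) -> bool:
    """Stand-in for a real image check (assumption: not the breeze code).

    A real check would pull the image and run tests inside it.
    Here one version fails on purpose to show how results are collected.
    """
    return python_version != "3.10"


def verify_all_in_parallel(versions: list[str], parallelism: int = 4) -> dict[str, bool]:
    """Run verify_image for every version concurrently and map version -> result."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # pool.map preserves input order, so results line up with versions.
        results = list(pool.map(verify_image, versions))
    return dict(zip(versions, results))


results = verify_all_in_parallel(PYTHON_VERSIONS)
failed = [version for version, ok in results.items() if not ok]
print(f"failed: {failed}")
```

Every check runs to completion and the failures are reported together at the end, which mirrors the goal of not letting one verification failure block unrelated work.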
70d94de
to
7b85629
Compare
Taragolis
approved these changes
Nov 26, 2023
hussein-awala
approved these changes
Nov 26, 2023
potiuk
added a commit
to potiuk/airflow
that referenced
this pull request
Nov 26, 2023
Small follow-up after apache#35856 - cache building uses constraints so rather than describing it, we add arrow-dependency
potiuk
added a commit
to potiuk/airflow
that referenced
this pull request
Dec 7, 2023
The change apache#35856 optimized the waiting time before PROD image builds start: rather than waiting for full constraints generation, the PROD image build used the source constraints generated right after the CI image was built. This is fine for main, because there we install airflow and packages using constraints from sources, but for release branches we use the provider constraints in order to be able to install providers from PyPI rather than from sources. This means that on release branches we have to wait for constraints generation to complete before we start building PROD images, because we need to download the constraints generated there to use them. Unfortunately, GitHub Actions does not have conditional dependencies depending on where the workflow is run, so instead we have to effectively duplicate the PROD build steps and skip steps in them.
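The "duplicate and skip" workaround can be modelled as a small decision function. This is a rough sketch under assumptions: the variant names and the example release branch name are invented for illustration and are not the job names used in the Airflow workflow files.

```python
def select_prod_build_variant(branch: str) -> dict[str, bool]:
    """Return which of the two duplicated PROD build variants runs for a branch.

    Both variants always exist in the workflow because dependencies are static.
    Only one of them does real work, and the other one is skipped.
    """
    on_main = branch == "main"
    return {
        # Hypothetical name: starts right after the CI image build,
        # using constraints generated from sources.
        "prod-build-with-source-constraints": on_main,
        # Hypothetical name: waits for full constraints generation,
        # then uses the provider constraints.
        "prod-build-with-provider-constraints": not on_main,
    }


# "v2-8-test" is only an example of a release branch name.
for branch in ("main", "v2-8-test"):
    print(branch, select_prod_build_variant(branch))
```

The cost of this design is the duplication itself: both variants have to be kept in sync, which is the trade-off the commit message describes.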
ephraimbuddy
pushed a commit
that referenced
this pull request
Dec 7, 2023
The change #35856 optimized the waiting time before PROD image builds start: rather than waiting for full constraints generation, the PROD image build used the source constraints generated right after the CI image was built. This is fine for main, because there we install airflow and packages using constraints from sources, but for release branches we use the provider constraints in order to be able to install providers from PyPI rather than from sources. This means that on release branches we have to wait for constraints generation to complete before we start building PROD images, because we need to download the constraints generated there to use them. Unfortunately, GitHub Actions does not have conditional dependencies depending on where the workflow is run, so instead we have to effectively duplicate the PROD build steps and skip steps in them.