Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix performance regression by parallelizing alias checkout #1101

Merged
merged 4 commits into from
Feb 14, 2025

Conversation

simu
Copy link
Member

@simu simu commented Feb 12, 2025

This is necessary now that setting up aliases requires Git checkout and potentially even Git clone operations. This reduces the "Fetching components..." stage for a cluster with 60 component instances from ~12s to ~4s on my system (clones unchanged at ~3s, alias setup from ~9s to ~1s).

We change fetch_components to collect aliases and their backing components and use the same ThreadPoolExecutor approach that we use for the component checkouts.

We aggressively parallelize the alias checkouts for aliases which use a dependency URL that we've already cloned during the component checkout phase, and group aliases with the same not-yet-cloned dependency URL to avoid race conditions.

To make this work, we need to refactor the Component alias handling a bit, so that we have enough information to do the actual alias_checkout() in a parallelized loop.

Checklist

  • Keep pull requests small so they can be easily reviewed.
  • Update the documentation.
  • Update tests.
  • Categorize the PR by setting a good title and adding one of the labels:
    bug, enhancement, documentation, change, breaking, dependency, internal
    as they show up in the changelog

@simu simu force-pushed the feat/parallel-alias-setup branch from 3ebe0c5 to 3f0e27a Compare February 14, 2025 09:53
@simu simu changed the base branch from master to fix/component-dependency-nonoptional February 14, 2025 09:53
@simu simu added the bug Something isn't working label Feb 14, 2025
@simu simu changed the title Parallelize alias setup Fix performance regression by parallelizing alias checkout Feb 14, 2025
Base automatically changed from fix/component-dependency-nonoptional to master February 14, 2025 11:50
@simu simu force-pushed the feat/parallel-alias-setup branch from 3f0e27a to 0f9b03b Compare February 14, 2025 11:52
This change will allow us to parallelize `alias_checkout()`.
@simu simu force-pushed the feat/parallel-alias-setup branch 2 times, most recently from 08bfdcb to 46f96a4 Compare February 14, 2025 12:16
@simu simu marked this pull request as ready for review February 14, 2025 12:28
@simu simu requested a review from a team as a code owner February 14, 2025 12:28
@simu simu requested a review from a team February 14, 2025 12:28
simu added 3 commits February 14, 2025 13:42
We collect aliases and their backing components and use the same
ThreadPoolExecutor approach that we use for the component checkouts.

We aggressively parallelize the alias checkouts for aliases which use a
dependency URL that we've already cloned during the component checkout
phase, and group aliases with the same not-yet-cloned dependency URL to
avoid race conditions.

We rename the wrapper function which does the parallel work to
`do_parallel()` since it doesn't necessarily fetch anything anymore.

Additionally we add type annotations to `do_parallel()` to make it more
obvious how the arguments are supposed to look like.
We simply assert that `apath` is not None to signal to mypy that there's
no possible way that the alias isn't registered on the alias
multi-dependency (and therefore `adep.get_component(alias) will never
return None) when the alias is present in `_aliases`.
@simu simu force-pushed the feat/parallel-alias-setup branch from 46f96a4 to 4d5f0fc Compare February 14, 2025 12:43
@simu simu enabled auto-merge February 14, 2025 12:47
@simu simu merged commit 6282bd3 into master Feb 14, 2025
18 checks passed
@simu simu deleted the feat/parallel-alias-setup branch February 14, 2025 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants