Optimise YAML config parsing #754

Merged
merged 5 commits into from
Sep 25, 2024

Conversation

mauch
Contributor

@mauch mauch commented Sep 25, 2024

Optimise reading of the YAML configuration at the start of the pipeline by copying the configuration into a dictionary before reading or updating it. This stops strictyaml from validating the YAML on every individual read or write, so validation only happens on bulk reads or writes.

This closes #751
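To illustrate the idea for anyone reading along, here is a minimal sketch of the copy-then-write-back pattern. It does not use strictyaml: `ValidatedConfig` is a hypothetical stand-in that just counts how often the whole document gets validated, and the `image` key and file names are made up for the example.

```python
from copy import deepcopy


class ValidatedConfig:
    """Toy stand-in for a schema-validated YAML object (hypothetical, not strictyaml)."""

    def __init__(self, data):
        self.validations = 0
        self._data = {}
        self.replace(data)

    def _validate(self, data):
        # Pretend this walks the whole document, so its cost grows with size.
        self.validations += 1
        for key, files in data.items():
            if not all(isinstance(f, str) for f in files):
                raise TypeError(f"{key}: every entry must be a string")

    @property
    def data(self):
        # Hand out a plain copy that can be edited without any validation.
        return deepcopy(self._data)

    def append(self, key, value):
        # Per-item write: revalidates the whole document every time.
        candidate = deepcopy(self._data)
        candidate.setdefault(key, []).append(value)
        self._validate(candidate)
        self._data = candidate

    def replace(self, data):
        # Bulk write: validates once, however large the document is.
        candidate = deepcopy(data)
        self._validate(candidate)
        self._data = candidate


files = [f"image_{i:04d}.fits" for i in range(200)]

# Slow path: every append triggers a full validation.
slow = ValidatedConfig({"image": []})
for name in files:
    slow.append("image", name)

# Fast path: copy to a plain dict, edit freely, write back once.
fast = ValidatedConfig({"image": []})
inputs = fast.data
inputs["image"].extend(files)
fast.replace(inputs)
```

Both objects end up holding the same file list, but the slow path validates once per appended file while the fast path validates only at construction and at the single write-back.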

…dictionary to YAML

This greatly speeds up the initial configuration step since the expansion of
glob expressions for the inputs into thousands of files creates very large
YAML objects that strictyaml is slow to validate. The changes here ensure the
validation step is only done once when writing the updated (expanded glob)
file lists back to _yaml["inputs"].
Each creation of the 'inputs' dictionary in a PipelineConfig object
can take a very long time since strictyaml is extremely slow to validate
long file lists when converting to a Python list. The PipelineConfig.validation
method recreates the ['inputs'] dictionary many times. Just do it once at the start,
then use that dictionary throughout.
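As a rough sketch of "do it once, then reuse", the snippet below expands glob patterns on a plain dictionary and keeps the result for all later checks. The `expand_inputs` helper, the `image` key and the file names are assumptions for illustration, not the pipeline's actual code.

```python
import glob
import os
import tempfile


def expand_inputs(inputs):
    """Return a new dict with every glob pattern replaced by its sorted matches."""
    expanded = {}
    for key, patterns in inputs.items():
        matches = []
        for pattern in patterns:
            matches.extend(sorted(glob.glob(pattern)))
        expanded[key] = matches
    return expanded


with tempfile.TemporaryDirectory() as tmp:
    # Create a few empty placeholder files to match against.
    for i in range(3):
        open(os.path.join(tmp, f"epoch{i}.fits"), "w").close()

    raw_inputs = {"image": [os.path.join(tmp, "epoch*.fits")]}

    # Expand once at the start; later checks read this plain dict
    # instead of rebuilding it from the validated YAML object.
    inputs = expand_inputs(raw_inputs)
    n_images = len(inputs["image"])
    basenames = [os.path.basename(p) for p in inputs["image"]]
```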
Just because I was changing things in this method anyway.
@mauch mauch requested a review from ddobie September 25, 2024 00:20
@ddobie
Contributor

ddobie commented Sep 25, 2024

Approved - just FYI we usually use a "Squash and merge" when merging PRs rather than a merge commit or a rebase

@mauch mauch merged commit ab05418 into dev Sep 25, 2024
5 checks passed
@mauch mauch deleted the fix_751_yaml_config branch September 25, 2024 02:30
Development

Successfully merging this pull request may close these issues.

Fix slow validation of YAML config
2 participants