[Bug] Huge slowdown of dbt initialization for projects with many tests since v0.21 #4135
Comments
@dvalchev Thanks for the detailed report! I'm glad to see that the improvement in #4022 cut down the graph compilation time significantly, though I'm unpleasantly surprised to see this line is still chewing up 54.1s: `dbt-core/core/dbt/graph/graph.py`, line 85 (at 73af9a5)
When I run a sample project with 2k models and 6k tests on my machine locally, using

In order to achieve the test-blocking behavior desirable in

I think there are a few paths we could pursue:
I don't think we'd be able to do both (2) and (3); if the selection criteria impact the graph we build, then we can't reuse it in subsequent invocations with different selection criteria. From my point of view, all options are on the table, including ones not listed here. We've been working hard on improving performance all year, and I'd like to see us carry that through into v1.0. |
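To make the cost under discussion concrete, here is a simplified sketch (not dbt's actual implementation) of what a subset-graph build has to do: when nodes are removed, transitive dependencies between the kept nodes must be preserved, which means walking reachability from each kept node. On a dense graph with thousands of test nodes, that repeated traversal is the expensive part. All names below are illustrative.

```python
from collections import defaultdict, deque

def reachable(succs, start):
    """BFS over successor edges; returns every node reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        for nxt in succs.get(queue.popleft(), ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def subset_graph(succs, keep):
    """Subset that preserves transitive dependencies between kept nodes.

    For every kept node we traverse the full graph and add a direct edge
    to each kept node it can reach -- roughly O(kept * edges), which is
    the flavour of work that balloons on test-heavy projects.
    """
    new_succs = defaultdict(set)
    for node in keep:
        for desc in reachable(succs, node):
            if desc in keep:
                new_succs[node].add(desc)
    return new_succs

# Toy DAG: model_a -> {test_a, model_b}, model_b -> test_b
succs = {"model_a": {"test_a", "model_b"}, "model_b": {"test_b"}}
sub = subset_graph(succs, keep={"model_a", "model_b"})
# model_a -> model_b survives even though the test nodes were dropped
```

The quadratic-ish shape of this loop is why graph size and density, not just node count, drive the timings reported in this thread.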
Thanks @jtcohen6 for the quick response! |
@dvalchev We think we've come up with a performance boost that doesn't require one of the trade-offs I mention above. If you get a moment, could you take it for a spin? #4155 I see a 5-10x speedup on my machine. Granted, that reduction is from 5.5s to 0.5-1s, so I'm interested to know what it looks like when starting from 54s. |
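Comparisons like "5.5s down to 0.5-1s" are easiest to trust as best-of-N wall-clock measurements, which also dampens resource-contention noise. A minimal sketch; the workloads below are placeholders, not actual dbt invocations:

```python
import time

def best_of(fn, repeat=5):
    """Best-of-N wall time for fn(); the minimum filters out
    one-off slowdowns from other processes on the machine."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

# Placeholder workloads standing in for pre- and post-patch runs
baseline = best_of(lambda: sum(i for i in range(200_000)))
patched = best_of(lambda: sum(range(200_000)))
speedup = baseline / patched
```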
Thanks for giving the new code a spin @dvalchev, although I confess I'm a little surprised at your results. I have a couple of thoughts/questions that may help us proceed here:

1. Can you try applying that patch to 1.0-b2? That's the version I've been developing against, and it would ensure we're working from the same baseline.
2. Can you share a little more about the structure of your project? Differently shaped graphs can impact the algos we use in different ways.
3. It strikes me that the % of the overall runtime that get_subset_graph consumes has gotten smaller, but the overall runtime got larger. Is there any chance that there was some resource contention going on during that second run?
4. I sincerely hope this isn't the case, but the other obvious difference in what we're doing and what you're doing is the platform (we're all macos at dbtlabs). |
@iknox-fa, there you go:
I have tested directly using your
The specifics of our project which I think cause the issue are as follows:
I deliberately ran the tests multiple times and captured the best possible result, to minimize any impact from resource contention during the test runs. As I mentioned earlier, the increase in overall duration is caused by the additional re-parsing of more than 50% of the models. It seems that the changes made in #4155 are, on their own, causing invalidation of the parsing cache. Every run (with no changes to the models whatsoever) does this unnecessary re-parsing:
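As background for this symptom: partial parsing works by comparing checksums of the current project files (and of inputs such as vars and env vars) against a saved state, and re-parsing only what differs. A simplified sketch of that idea follows; the names are illustrative, not dbt's internals. If anything feeding the saved state changes between runs, or is hashed inconsistently, unchanged files look "changed" and get re-parsed:

```python
import hashlib

def checksum(text: str) -> str:
    """Content hash used to detect changed files between runs."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def files_to_reparse(saved_state: dict, current_files: dict) -> set:
    """Return paths whose content hash differs from the saved state.

    A spurious mismatch here -- e.g. state written with a different
    key than it is read back with -- re-parses files that did not
    actually change, which is the behavior reported above.
    """
    return {
        path
        for path, content in current_files.items()
        if saved_state.get(path) != checksum(content)
    }

saved = {"models/a.sql": checksum("select 1")}
current = {"models/a.sql": "select 1", "models/b.sql": "select 2"}
changed = files_to_reparse(saved, current)  # only the genuinely new file
```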
I have tested this on both Windows 10 and RedHat Enterprise Linux 8.2 machines and the behavior is consistent. I think we can rule out any environmental dependency. |
@jtcohen6, while testing, I noticed another potential bug which seems to be a side effect of the graph processing changes. The lineage graph in the generated docs works only when focused on models with few dependencies. Anything beyond that results in something like: Dragging dots around results in: Everything works fine up to 0.21 and is broken in 1.0.0-b2 (I haven't tested b1) |
@dvalchev That's very weird! The dbt-docs site doesn't use the networkx graph at all. The DAG visualization is powered completely by the

I also don't know why the change in #4155 is causing dbt to detect

You're welcome to open a new issue. I just worry I'm going to have a hard time reproducing. |
We don't directly use env vars in the models and macros; we only use them to pass initial values to the internal dbt vars in dbt_project.yml. We do use the internal vars extensively, however, both for setting the table aliases and for incremental logic. Still, the re-parsing behavior makes little sense, as none of the env vars change between the runs. |
Is there an existing issue for this?
Current Behavior
This is a continuation of the closed #4012.
Since the upgrade from v0.20 to v0.21, when the project has a lot of schema tests, dbt startup time is severely increased. In one of the projects (343 models, 3882 tests) the startup time increased from 7 seconds to over 3 minutes! Using the timing recording, I managed to track down the root cause to #4022 (`compilation.py:114(<listcomp>)`). In addition, it seems that `digraph.py:638(add_edges_from)` also performs significantly worse. With #4012, the first issue seems to be resolved and startup gets faster, but it is still impacted by `digraph.py:638(add_edges_from)`. Disabling all tests makes the performance on par with 0.20.

Here are the details from the tests of each version. I used a pre-parsed, simple `dbt compile` to showcase the issue:

0.20.2
0.21
1.0.0-b2
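For anyone reproducing these numbers: entries like `compilation.py:114(<listcomp>)` and `digraph.py:638(add_edges_from)` are the format of Python's cProfile/pstats reports, which can be generated and inspected with the standard library alone. A minimal, self-contained sketch (the profiled function is a placeholder, not dbt code):

```python
import cProfile
import io
import pstats

def slow_listcomp(n):
    # Stand-in for a hot spot such as compilation.py:114(<listcomp>)
    return [i * i for i in range(n)]

profiler = cProfile.Profile()
profiler.enable()
slow_listcomp(100_000)
profiler.disable()

# Print the top entries sorted by cumulative time -- the same
# report shape as the per-version traces quoted above
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
```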
Expected Behavior
dbt startup performance is on par with 0.20
Steps To Reproduce
Execute
dbt compile
ordbt run
on a project with a large number of testsRelevant log output
Environment
What database are you using dbt with?
snowflake
Additional Context
No response