
[CT-2141] [Spike] Sanity check on full parse performance #7005

Closed
jtcohen6 opened this issue Feb 17, 2023 · 3 comments
Labels: performance, spike, tech_debt (Behind-the-scenes changes, with little direct impact on end-user functionality)

Comments

@jtcohen6
Contributor

Anecdotally, we've noticed (or can imagine) some potential for slowdown in parsing (namely "full" parses) since we last had parsing performance as a top-level priority.

As a first go, we can do some basic benchmarking on a large (ideally real-world) project, comparing the main branch versus v1.0.
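
A minimal timing harness for that comparison might look like the sketch below. The project path and per-version dbt install locations are placeholders, and it simply times repeated full-parse invocations using the same CLI flags quoted later in this thread (flag availability may need adjusting for very old versions):

    # Sketch of a timing harness, assuming one virtualenv per dbt version
    # under test; paths below are placeholders, not real locations.
    import statistics
    import subprocess
    import time

    PROJECT_DIR = "/path/to/large-project"          # placeholder
    DBT_BINARIES = {                                 # placeholder venvs
        "v1.0": "/venvs/dbt-1.0/bin/dbt",
        "main": "/venvs/dbt-main/bin/dbt",
    }
    RUNS = 5

    for version, dbt_bin in DBT_BINARIES.items():
        timings = []
        for _ in range(RUNS):
            start = time.perf_counter()
            # Flags as used in the benchmarking comment below; adjust per version.
            subprocess.run(
                [dbt_bin, "--no-partial-parse", "--no-version-check", "parse"],
                cwd=PROJECT_DIR,
                check=True,
                capture_output=True,
            )
            timings.append(time.perf_counter() - start)
        print(f"{version}: median {statistics.median(timings):.2f}s over {RUNS} runs")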

@jtcohen6 added the performance, tech_debt, and Team:Language labels on Feb 17, 2023
@github-actions changed the title from Sanity check on full parse performance to [CT-2141] Sanity check on full parse performance on Feb 17, 2023
@jtcohen6 added the spike label on Feb 17, 2023
@jtcohen6 changed the title from [CT-2141] Sanity check on full parse performance to [CT-2141] [Spike] Sanity check on full parse performance on Feb 17, 2023
@peterallenwebb
Contributor

@jtcohen6 Good news, I think.

Below are excerpts of the perf_info report when running dbt --no-partial-parse --no-version-check parse on the internal_analytics project for the latest 1.3, 1.4, and 1.5 releases. I have speedscope profiles for each of the runs listed here, and I'll take a closer look at them, but I don't see any reason at this level of analysis to suspect a performance degradation from 1.3 -> 1.5.

dbt 1.3.3:

    "is_partial_parse_enabled": false,
    "is_static_analysis_enabled": true,
    "read_files_elapsed": 0.7166596669703722,
    "load_macros_elapsed": 1.905235208105296,
    "parse_project_elapsed": 14.996926792198792,
    "patch_sources_elapsed": 0.1171803749166429,
    "process_manifest_elapsed": 0.7198715419508517,
    "load_all_elapsed": 19.084472500020638,

dbt 1.4.5:

    "is_partial_parse_enabled": false,
    "is_static_analysis_enabled": true,
    "read_files_elapsed": 0.7186416250187904,
    "load_macros_elapsed": 1.8487964579835534,
    "parse_project_elapsed": 15.356004958041012,
    "patch_sources_elapsed": 0.12841375009156764,
    "process_manifest_elapsed": 0.6919153342023492,
    "load_all_elapsed": 19.337341208010912,

dbt 1.5.0rc1:

    "is_partial_parse_enabled": false,
    "is_static_analysis_enabled": true,
    "read_files_elapsed": 0.7262364169582725,
    "load_macros_elapsed": 1.8892155408393592,
    "parse_project_elapsed": 15.864262333139777,
    "patch_sources_elapsed": 0.1376477919984609,
    "process_manifest_elapsed": 0.7611132920719683,
    "load_all_elapsed": 20.021727832965553,

@peterallenwebb
Contributor

peterallenwebb commented Apr 17, 2023

Looking at the profiles for the above runs, I didn't see much difference between them, and for the most part we are spending time in the Jinja library code.

There is one interesting quirk that may be worth mentioning. We are spending almost 10% of the execution time in MacroGenerator.__init__(), which I presume is because it is called so many times. The function itself does almost nothing.
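
(For anyone who wants to reproduce that kind of measurement without speedscope, here is a rough cProfile sketch; it assumes dbt-core 1.5's programmatic dbtRunner entry point and that it runs from inside the project directory, and it is not the exact profiling setup used above.)

    # Rough sketch: profile a full parse with cProfile and surface
    # MacroGenerator entries. Assumes dbt-core 1.5+ (dbtRunner) and the
    # current working directory is the dbt project.
    import cProfile
    import pstats

    from dbt.cli.main import dbtRunner

    profiler = cProfile.Profile()
    profiler.enable()
    # Same flags as the CLI invocation above, forwarded through dbtRunner.
    dbtRunner().invoke(["--no-partial-parse", "--no-version-check", "parse"])
    profiler.disable()

    stats = pstats.Stats(profiler).sort_stats("cumulative")
    stats.print_stats(25)                # hottest entries overall
    stats.print_stats("MacroGenerator")  # anything mentioning MacroGenerator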

In summary, there is no evidence of recent regressions in parse performance, but a review of the higher-level algorithms could allow us to do less rework during parsing, and rewriting some of the lower-level processing in native-compiled code could still be considered if this is a pain point.

@jtcohen6
Contributor Author

@peterallenwebb That is good news!! Thank you for looking into this in detail.

My suspicion about MacroGenerator.__init__() is that we're passing in the rendering context (a dictionary), which can be big. And we do it over and over again, because we have to create a slightly different context for every node (even if only a handful of context attributes actually differ). If I recall correctly, @gshank saw this and had some thoughts back when performance speedup was our top priority in 2021.
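
(Purely as an illustration of that pattern, not dbt's actual internals: the hypothetical sketch below shows how wrapping every macro with a freshly built per-node context turns a near-trivial __init__ into a visible slice of the profile.)

    # Hypothetical illustration only; the names below are invented and do
    # not mirror dbt's real classes or call sites.
    class MacroWrapper:
        """Stands in for a per-macro callable that captures a rendering context."""
        def __init__(self, macro, context):
            self.macro = macro        # trivially cheap on its own...
            self.context = context    # ...but repeated for every macro * node

    def build_context(shared, node):
        # A slightly different dict per node, even though most keys are shared.
        ctx = dict(shared)            # re-copying a large dict each time
        ctx["this"] = node
        return ctx

    def wrap_all(macros, nodes, shared):
        # len(macros) * len(nodes) tiny __init__ calls: individually cheap,
        # collectively a noticeable share of parse time.
        return [
            MacroWrapper(macro, build_context(shared, node))
            for node in nodes
            for macro in macros
        ]

In a structure like this sketch, keeping a shared base context and layering only the genuinely per-node attributes on top of it (rather than rebuilding and re-capturing the full dictionary each time) would be one way to cut down on the rework.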
