Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize render_stacktrace() #1571

Merged
merged 1 commit into from
Jan 12, 2022

Conversation

adamchainz
Copy link
Contributor

This function is called by the SQL panel to render each frame of each stack trace.

Previously it used a Django template, which is slow. By generating the HTML with pure Python code, we can avoid much of this overhead.

I tried this change on a demo app I built that has 96 total queries, due to N+1 queries. The queries execute in 13ms due to use of SQLite, so the whole view is a negligible concern. Nearly all the view runtime is the toolbar itself. Without this change, the runtime is ~1300ms; with the change it’s ~1100ms. That's a saving of 15%.

I also checked the appearance of the generated HTML hasn’t changed.

@adamchainz adamchainz force-pushed the optimize_sql_panel_1 branch from 4289268 to 6e4830e Compare January 12, 2022 09:45
This function is called by the SQL panel to render each frame of each stack trace.

Previously it used a Django template, which is slow. By generating the HTML with pure Python code, we can avoid much of this overhead.

I tried this change on a demo app I built that has 96 total queries, due to N+1 queries. The queries execute in 13ms due to use of SQLite, so the whole view is a negligible concern. Nearly all the view runtime is the toolbar itself. Without this change, the runtime is ~1300ms; with the change it’s ~1100ms. That's a saving of **15%**.

I also checked the appearance of the generated HTML hasn’t changed.
@adamchainz adamchainz force-pushed the optimize_sql_panel_1 branch from 6e4830e to 3d77806 Compare January 12, 2022 09:50
Copy link
Member

@matthiask matthiask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I like it.

I think we had reports about pformat being slow in the past, maybe we should skip pformat-ing when PRETTIFY_SQL is False but that would be an additional change/improvement.

Comment on lines -78 to -81
except KeyError:
# This frame doesn't have the expected format, so skip it and move
# on to the next one
continue
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this didn't make sense - list.append cannot raise a KeyError

adamchainz added a commit to adamchainz/django-debug-toolbar that referenced this pull request Jan 12, 2022
Add a few bits of caching:

1. Add sub-function to `parse_sql` with `lru_cache`, so that repeat calls with the same query are fast.
   This saves a lot of processing in N+1 situations.
2. Cache constructed filter stacks in `get_filter_stack()`.
   This avoids recreating all the various sqlparse objects for each query.
3. Pre-compile the simplification regex.
   The `re` module already uses an internal LRU cache of regexes, but this avoids recompiling if the regex ever drops out of that cache.

Building on top of django-commons#1571, this takes run time for the same tested view from ~1100ms to ~950ms, another ~15% saving.
@matthiask matthiask merged commit f7004fe into django-commons:main Jan 12, 2022
@matthiask
Copy link
Member

Thanks!

@adamchainz adamchainz deleted the optimize_sql_panel_1 branch January 12, 2022 12:45
adamchainz added a commit to adamchainz/django-debug-toolbar that referenced this pull request Jan 12, 2022
Add a few bits of caching:

1. Add sub-function to `parse_sql` with `lru_cache`, so that repeat calls with the same query are fast.
   This saves a lot of processing in N+1 situations.
2. Cache constructed filter stacks in `get_filter_stack()`.
   This avoids recreating all the various sqlparse objects for each query.
3. Pre-compile the simplification regex.
   The `re` module already uses an internal LRU cache of regexes, but this avoids recompiling if the regex ever drops out of that cache.

Building on top of django-commons#1571, this takes run time for the same tested view from ~1100ms to ~950ms, another ~15% saving.
matthiask pushed a commit that referenced this pull request Jan 12, 2022
Add a few bits of caching:

1. Add sub-function to `parse_sql` with `lru_cache`, so that repeat calls with the same query are fast.
   This saves a lot of processing in N+1 situations.
2. Cache constructed filter stacks in `get_filter_stack()`.
   This avoids recreating all the various sqlparse objects for each query.
3. Pre-compile the simplification regex.
   The `re` module already uses an internal LRU cache of regexes, but this avoids recompiling if the regex ever drops out of that cache.

Building on top of #1571, this takes run time for the same tested view from ~1100ms to ~950ms, another ~15% saving.
gone pushed a commit to gone/django-debug-toolbar that referenced this pull request Jan 16, 2022
This function is called by the SQL panel to render each frame of each stack trace.

Previously it used a Django template, which is slow. By generating the HTML with pure Python code, we can avoid much of this overhead.

I tried this change on a demo app I built that has 96 total queries, due to N+1 queries. The queries execute in 13ms due to use of SQLite, so the whole view is a negligible concern. Nearly all the view runtime is the toolbar itself. Without this change, the runtime is ~1300ms; with the change it’s ~1100ms. That's a saving of **15%**.

I also checked the appearance of the generated HTML hasn’t changed.
gone pushed a commit to gone/django-debug-toolbar that referenced this pull request Jan 16, 2022
Add a few bits of caching:

1. Add sub-function to `parse_sql` with `lru_cache`, so that repeat calls with the same query are fast.
   This saves a lot of processing in N+1 situations.
2. Cache constructed filter stacks in `get_filter_stack()`.
   This avoids recreating all the various sqlparse objects for each query.
3. Pre-compile the simplification regex.
   The `re` module already uses an internal LRU cache of regexes, but this avoids recompiling if the regex ever drops out of that cache.

Building on top of django-commons#1571, this takes run time for the same tested view from ~1100ms to ~950ms, another ~15% saving.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants