-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize render_stacktrace() #1571
Optimize render_stacktrace() #1571
Conversation
4289268
to
6e4830e
Compare
This function is called by the SQL panel to render each frame of each stack trace. Previously it used a Django template, which is slow. By generating the HTML with pure Python code, we can avoid much of this overhead. I tried this change on a demo app I built that has 96 total queries, due to N+1 queries. The queries execute in 13ms due to use of SQLite, so the whole view is a negligible concern. Nearly all the view runtime is the toolbar itself. Without this change, the runtime is ~1300ms; with the change it’s ~1100ms. That's a saving of **15%**. I also checked the appearance of the generated HTML hasn’t changed.
6e4830e
to
3d77806
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I like it.
I think we had reports about pformat
being slow in the past, maybe we should skip pformat
-ing when PRETTIFY_SQL
is False
but that would be an additional change/improvement.
except KeyError: | ||
# This frame doesn't have the expected format, so skip it and move | ||
# on to the next one | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this didn't make sense - list.append
cannot raise a KeyError
Add a few bits of caching: 1. Add sub-function to `parse_sql` with `lru_cache`, so that repeat calls with the same query are fast. This saves a lot of processing in N+1 situations. 2. Cache constructed filter stacks in `get_filter_stack()`. This avoids recreating all the various sqlparse objects for each query. 3. Pre-compile the simplification regex. The `re` module already uses an internal LRU cache of regexes, but this avoids recompiling if the regex ever drops out of that cache. Building on top of django-commons#1571, this takes run time for the same tested view from ~1100ms to ~950ms, another ~15% saving.
Thanks! |
Add a few bits of caching: 1. Add sub-function to `parse_sql` with `lru_cache`, so that repeat calls with the same query are fast. This saves a lot of processing in N+1 situations. 2. Cache constructed filter stacks in `get_filter_stack()`. This avoids recreating all the various sqlparse objects for each query. 3. Pre-compile the simplification regex. The `re` module already uses an internal LRU cache of regexes, but this avoids recompiling if the regex ever drops out of that cache. Building on top of django-commons#1571, this takes run time for the same tested view from ~1100ms to ~950ms, another ~15% saving.
Add a few bits of caching: 1. Add sub-function to `parse_sql` with `lru_cache`, so that repeat calls with the same query are fast. This saves a lot of processing in N+1 situations. 2. Cache constructed filter stacks in `get_filter_stack()`. This avoids recreating all the various sqlparse objects for each query. 3. Pre-compile the simplification regex. The `re` module already uses an internal LRU cache of regexes, but this avoids recompiling if the regex ever drops out of that cache. Building on top of #1571, this takes run time for the same tested view from ~1100ms to ~950ms, another ~15% saving.
This function is called by the SQL panel to render each frame of each stack trace. Previously it used a Django template, which is slow. By generating the HTML with pure Python code, we can avoid much of this overhead. I tried this change on a demo app I built that has 96 total queries, due to N+1 queries. The queries execute in 13ms due to use of SQLite, so the whole view is a negligible concern. Nearly all the view runtime is the toolbar itself. Without this change, the runtime is ~1300ms; with the change it’s ~1100ms. That's a saving of **15%**. I also checked the appearance of the generated HTML hasn’t changed.
Add a few bits of caching: 1. Add sub-function to `parse_sql` with `lru_cache`, so that repeat calls with the same query are fast. This saves a lot of processing in N+1 situations. 2. Cache constructed filter stacks in `get_filter_stack()`. This avoids recreating all the various sqlparse objects for each query. 3. Pre-compile the simplification regex. The `re` module already uses an internal LRU cache of regexes, but this avoids recompiling if the regex ever drops out of that cache. Building on top of django-commons#1571, this takes run time for the same tested view from ~1100ms to ~950ms, another ~15% saving.
This function is called by the SQL panel to render each frame of each stack trace.
Previously it used a Django template, which is slow. By generating the HTML with pure Python code, we can avoid much of this overhead.
I tried this change on a demo app I built that has 96 total queries, due to N+1 queries. The queries execute in 13ms due to use of SQLite, so the whole view is a negligible concern. Nearly all the view runtime is the toolbar itself. Without this change, the runtime is ~1300ms; with the change it’s ~1100ms. That's a saving of 15%.
I also checked the appearance of the generated HTML hasn’t changed.