Optimize long alternate lists #463
Conversation
Further testing: I downloaded a dataset of 713287 real queries from the cell where this issue was discovered (Cell A), and 504496 queries from a shared cell where this issue is not noticeable (Cell B). I wrote a small program to parse the queries and throw away the result. Much of this time is spent reading the queries from disk or running GC, but nonetheless the results are promising:
I'm cautiously optimistic that the small increases in cost don't manifest in real queries. Furthermore, I believe there are still significant improvements to be made by supporting case insensitive matches via the prefix trie. Cell A has 79% of queries making use of `|`.
I would defer it to a follow-up PR.
I just pushed a commit to fix the fuzzy test with the fuzzy regexp: 0590a86
I was wondering if we really need a trie, or whether a map is just good enough. So I picked this PR and implemented your proposed optimization using just a map (specifically using …). This is the …
Maybe it's something you want to explore more. Also be aware of …
About this ☝️ : in the meanwhile I was exploring another optimization, but didn't get much benefit so I'm not going to open a PR. However, you may want to look at the changes I made to …
Based on @pracucci's feedback, I profiled an implementation with a map and also with a slice / map depending on input size. The results favor using a slice / map, so I've replaced the trie. Full Benchmark Results
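The slice-vs-map tradeoff mentioned above can be explored with a standalone micro-benchmark. This harness is purely illustrative (it is not the PR's benchmark); `containsSlice`, `containsMap`, and the set sizes are my own choices for sketching where the crossover between a linear scan and a hash lookup lies on a given machine.

```go
package main

import (
	"fmt"
	"testing"
)

// containsSlice does a linear scan over the candidate values.
func containsSlice(vals []string, s string) bool {
	for _, v := range vals {
		if v == s {
			return true
		}
	}
	return false
}

// containsMap does a single hash lookup.
func containsMap(set map[string]struct{}, s string) bool {
	_, ok := set[s]
	return ok
}

func main() {
	for _, n := range []int{4, 512} {
		vals := make([]string, n)
		set := make(map[string]struct{}, n)
		for i := range vals {
			vals[i] = fmt.Sprintf("value-%d", i)
			set[vals[i]] = struct{}{}
		}
		needle := vals[n-1] // worst case for the linear scan

		sliceRes := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				containsSlice(vals, needle)
			}
		})
		mapRes := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				containsMap(set, needle)
			}
		})
		fmt.Printf("n=%4d slice=%v map=%v\n", n, sliceRes, mapRes)
	}
}
```

For tiny sets the slice typically wins (no hashing, cache-friendly scan), while the map wins once the set grows, which motivates switching representations based on input size.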
Force-pushed from 828342e to 297c90f: "…e fuzzy regex is invalid" (Signed-off-by: Marco Pracucci <[email protected]>, Co-authored-by: Marco Pracucci <[email protected]>)
Now with this PR and #465 together: I re-ran the same test against the 713287 real queries from Cell A and the 504496 queries from Cell B, parsing each query and throwing away the result. Much of this time is spent reading the queries from disk.
Amazing job, I like it. Let's go! 🚀
I was surprised to find out that postings lookups for `foo=~"(bar|bar)"` are faster than `foo=~"bar"`. It turns out we introduced a performance regression in #463. When we added the `optimizeAlternatingLiterals` function, we subtly broke one edge case: a regexp matcher which matches a single literal, like `foo=~"bar"`, used to return `bar` from `SetMatches()`, but currently does not. The implication is that the TSDB will first do a `LabelValues` call to get all values for `foo`, then match them against the regexp `bar`. This PR restores the previous behavior, which is able to directly look up postings for `foo="bar"` instead.
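A minimal sketch of the edge case, using a hypothetical simplified `findSetMatches` helper (the real Prometheus code differs): splitting a literal-only pattern on `|` naturally covers the single-literal case, since splitting `"bar"` yields `["bar"]`. The regression was that the single-literal case fell off this path and returned no set matches.

```go
package main

import (
	"fmt"
	"strings"
)

// findSetMatches is a simplified, illustrative stand-in for the optimization:
// for a pattern that is a plain alternation of literals, it returns the
// literal values so the caller can look up postings directly. A single
// literal such as "bar" is just an alternation with one branch, so it must
// return ["bar"] too; returning nil here is the regressed behavior.
func findSetMatches(pattern string) []string {
	if strings.ContainsAny(pattern, `.+*?()[]{}^$\`) {
		return nil // not a plain alternation of literals
	}
	// strings.Split on "bar" yields ["bar"], so the single-literal
	// case is handled by the same code path as "bar|baz".
	return strings.Split(pattern, "|")
}

func main() {
	fmt.Println(findSetMatches("bar"))     // prints: [bar]
	fmt.Println(findSetMatches("bar|baz")) // prints: [bar baz]
}
```

With `SetMatches` populated, the TSDB can resolve `foo=~"bar"` as a direct `foo="bar"` postings lookup instead of scanning every value of `foo`.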
We've seen that very long regex matchers of the form `abc|def|...` consume a large amount of CPU, both in `NewFastRegexMatcher` and while matching label values. This PR uses a slice or map to optimize both operations as a special case when the regex is one long alternation of literal values. In that special case, the regex parsing is skipped entirely. As an example, this can occur when using Grafana dashboard template variables with the `All` option selected.

Unfortunately, this introduces a constant performance cost in `NewFastRegexMatcher` for other regex queries, but it's not clear that this is actually significant in practice. For example, CPU for a simple case: this is a 10% increase, but in concrete terms less than 0.3µs on my laptop.
Here are the results for the specific edge case this PR addresses:
Full Benchmark Results