-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache optimized regexp matchers #465
Conversation
Signed-off-by: Marco Pracucci <[email protected]>
d2fda09
to
2f84109
Compare
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like good idea to me.
Signed-off-by: Marco Pracucci <[email protected]>
@pstibrany Thanks for your review. I've replaced the LRU with the v2, and updated the benchmark in the PR description (same results from the benchmark). |
Signed-off-by: Marco Pracucci <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you.
NewFastRegexMatcher()
is typically very fast but there are some edge cases where parsing the regex may be very expensive. In this PR I propose to add an LRU cache on top ofNewFastRegexMatcher()
to avoid reparsing regex matchers that are frequently used. The size of the LRU is hardcoded to 10k entries.To measure the overhead of the cache, I've created a new benchmark
BenchmarkNewFastRegexMatcher_CacheMisses
which measuresNewFastRegexMatcher()
on cache misses. I personally think the overhead is acceptable, considering the huge impact this cache could have on recurring complex regex matchers.BenchmarkNewFastRegexMatcher
I've modified this benchmark to measure
NewFastRegexMatcher()
both with and without cache. Obviously the comparison withmain
shows am huge CPU reduction, but in practice this benefit will exist only if the regex is frequently used:BenchmarkNewFastRegexMatcher_CacheMisses
To measure the overhead of the cache, I've created a new benchmark which measures
NewFastRegexMatcher()
on cache misses: