Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
In #465 I've introduced an in-memory cache for
FastRegexMatcher
. Unfortunately we learned the hard way that a cache size limit based on the number of entries may practically cause unbounded memory utilization, based on the actual complexity of the cached regexps. Moreover, since there's no TTL, if a tenant runs many queries with complex regexps and then doesn't use regexps anymore (or they're used at a low rate), these cached entries will be never removed from the cache.To solve these problems, in this PR I propose to:
To do it, I've introduced two more dependencies:
github.com/DmitriyVTitov/size
to compute the actual size ofFastRegexMatcher
github.com/dgraph-io/ristretto
for the cache (instead ofgithub.jparrowsec.cn/hashicorp/golang-lru/v2
). I've picked ristretto because we already use such cache in GEM, so we have some experience with itBenchmark
Ristretto is more complex (and so expensive) than the lightweight LRU I was using before, so the % CPU increase showed in the benchmark looks huge, but if you look at the actual time spent (ns) I think it's still acceptable.