Make use of a per-call memory allocator for loading cached chunks #4074
Conversation
Initial testing looks promising (the change is being tested in zone b in the following screenshots). Profiling results:
Force-pushed from 4602368 to f14724e
I haven't included any tests for this change because the mock cache backend in combination with a mock allocator would have been more code than the change itself. I can add some if people feel strongly about it.
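For what it's worth, a test along those lines could look roughly like the sketch below. The Allocator interface shape, the mock cache, and all helper names here are assumptions for illustration, not the actual Mimir APIs.

```go
// Illustrative only: the Allocator interface and mock cache below are assumed
// shapes, not the real Mimir/dskit types.
package cache_test

import "testing"

// Assumed shape of a per-call allocator that a cache client can use for
// result buffers.
type Allocator interface {
	Get(size int) *[]byte
	Put(b *[]byte)
}

// countingAllocator records how many buffers it handed out.
type countingAllocator struct {
	gets int
}

func (a *countingAllocator) Get(size int) *[]byte {
	a.gets++
	b := make([]byte, size)
	return &b
}

func (a *countingAllocator) Put(b *[]byte) {}

// mockCache serves canned values, copying each hit into a buffer taken from
// the supplied allocator, mimicking the cache-hit path.
type mockCache struct {
	data map[string][]byte
}

func (c *mockCache) Fetch(keys []string, alloc Allocator) map[string][]byte {
	hits := make(map[string][]byte, len(keys))
	for _, k := range keys {
		v, ok := c.data[k]
		if !ok {
			continue
		}
		buf := alloc.Get(len(v))
		copy(*buf, v)
		hits[k] = *buf
	}
	return hits
}

func TestFetchUsesAllocator(t *testing.T) {
	alloc := &countingAllocator{}
	c := &mockCache{data: map[string][]byte{"chunk-1": []byte("abc")}}

	hits := c.Fetch([]string{"chunk-1", "chunk-2"}, alloc)

	if len(hits) != 1 || alloc.gets != 1 {
		t.Fatalf("expected one hit served from the allocator, got %d hits and %d allocations", len(hits), alloc.gets)
	}
}
```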
Great work, I love it!
@@ -221,7 +244,13 @@ func (cb *CachingBucket) Get(ctx context.Context, name string) (io.ReadCloser, error) {
	contentKey := cachingKeyContent(name)
	existsKey := cachingKeyExists(name)

	hits := cfg.cache.Fetch(ctx, []string{contentKey, existsKey})
	var opts []cache.Option
[nit] This code is duplicated below. You may consider moving it to a function getCacheOptions(ctx).
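Roughly, the suggested helper could look like the sketch below. The slab-pool accessor and the cache.WithAllocator option are assumed names for illustration rather than the exact API in the tree.

```go
// Sketch of the suggested de-duplication; getSlabPoolFromContext and
// cache.WithAllocator are assumed names, not necessarily the real ones.
func getCacheOptions(ctx context.Context) []cache.Option {
	var opts []cache.Option

	// If the caller injected a per-call slab pool into the context, ask the
	// cache client to allocate result buffers from it.
	if slabs := getSlabPoolFromContext(ctx); slabs != nil {
		opts = append(opts, cache.WithAllocator(slabs))
	}

	return opts
}
```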
pkg/storegateway/bucket.go
Outdated
@@ -1717,12 +1719,13 @@ func (b *bucketBlock) readIndexRange(ctx context.Context, off, length int64) ([]byte, error) {
	return buf.Bytes(), nil
}

-func (b *bucketBlock) readChunkRange(ctx context.Context, seq int, off, length int64, chunkRanges byteRanges) (*[]byte, error) {
+func (b *bucketBlock) readChunkRange(ctx context.Context, seq int, off, length int64, chunkRanges byteRanges, chunkSlabs *pool.SafeSlabPool[byte]) (*[]byte, error) {
[nit] I would name the new param chunksPool instead of chunkSlabs to keep consistency with the rest of the code. Really a nit!
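For readers not familiar with the pool, the sketch below is a simplified, self-contained illustration of the slab-pool idea (it assumes import "sync" and is not Mimir's actual pool.SafeSlabPool implementation): small allocations are carved out of a few large slabs, and everything is released at once when the call finishes.

```go
// Simplified slab pool for illustration only.
type slabPool struct {
	mtx      sync.Mutex
	slabSize int
	slabs    [][]byte
}

func newSlabPool(slabSize int) *slabPool {
	return &slabPool{slabSize: slabSize}
}

// Get returns a zeroed slice of the requested size, sub-sliced from the
// current slab when it fits, so many small gets share one big allocation.
func (p *slabPool) Get(size int) []byte {
	p.mtx.Lock()
	defer p.mtx.Unlock()

	// Oversized requests get their own allocation instead of a slab slot.
	if size > p.slabSize {
		return make([]byte, size)
	}

	// Start a new slab when there is none yet or the current one is full.
	if len(p.slabs) == 0 || cap(p.slabs[len(p.slabs)-1])-len(p.slabs[len(p.slabs)-1]) < size {
		p.slabs = append(p.slabs, make([]byte, 0, p.slabSize))
	}

	last := &p.slabs[len(p.slabs)-1]
	start := len(*last)
	*last = (*last)[:start+size]

	// The three-index slice caps the result so callers cannot append into a
	// neighbour's bytes.
	return (*last)[start : start+size : start+size]
}

// Release drops all slabs at once; a production pool would typically return
// them to an underlying buffer pool for reuse instead.
func (p *slabPool) Release() {
	p.mtx.Lock()
	defer p.mtx.Unlock()
	p.slabs = nil
}
```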
Force-pushed from 5ee98f3 to f212028
Code looks good 👍 I was testing some queries with sparse series (queries that touch series that aren't one-after-the-other in the index or the chunk files, e.g. selecting every third series) in the same cluster as your tests. During that time zone-b had noticeably higher heap than zone-a and zone-c. Do you have an idea what might be causing this, and should it be a concern?
Based on further testing I'm having doubts about this change. I'm going to add some additional instrumentation and see if I can tell what's causing the unexpected heap usage.
Good job! I left a minor comment.
return &getReader{
	c:     cfg.cache,
	ctx:   ctx,
	r:     reader,
	buf:   new(bytes.Buffer),
	slabs: slabs,
I don't think we need to pass the slabs at all here. The reason is that getReader is used on a cache miss, so there will be no memory from the pool. I think in this case we can just release the pool once we exit Get().
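In other words, something along the lines of the sketch below; the helper names are assumptions and it isn't compilable as-is, it only illustrates where the pool's lifetime ends on each path.

```go
// Sketch only: release the per-call pool when Get() returns on the miss path,
// and tie it to the returned reader only on the hit path where pooled memory
// actually escapes. Helper names are assumed.
func (cb *CachingBucket) Get(ctx context.Context, name string) (io.ReadCloser, error) {
	slabs := getSlabPoolFromContext(ctx) // may be nil if no pool was injected

	if content, ok := cb.fetchContentFromCache(ctx, name); ok {
		// Cache hit: content lives in pooled memory, so the reader has to
		// release the pool when it is closed.
		return newPooledContentReader(content, slabs), nil
	}

	// Cache miss: getReader only streams from object storage and buffers a
	// fresh copy for the cache, so no pooled memory leaves this function and
	// the pool can be released as soon as we exit.
	if slabs != nil {
		defer slabs.Release()
	}
	return cb.getReaderFromBucket(ctx, name)
}
```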
Ah, good point.
Inject a slab pool into the context used by caching BucketReader implementations that allows cache clients to reuse memory for results. Signed-off-by: Nick Pillitteri <[email protected]>
Force-pushed from 3ba9b0d to 0e05715
For posterity, here are the results after the change to create and free the slab pool per call. In the following screenshot: zone-a = another experiment, zone-b = this change, zone-c = control. This change results in a higher working set, which is odd, but it seems to be explained by caching the kernel is doing.
Deprecate `blocks-storage.bucket-store.chunks-cache.subrange-size` flag (#4135)

* Deprecate `blocks-storage.bucket-store.chunks-cache.subrange-size` flag

  There's no value to changing the default and it's possible to introduce subtle performance problems by not using the default. See #4074 (comment)

  Signed-off-by: Nick Pillitteri <[email protected]>

* Update CHANGELOG.md

  Co-authored-by: Mauro Stettler <[email protected]>

---------

Signed-off-by: Nick Pillitteri <[email protected]>
Co-authored-by: Marco Pracucci <[email protected]>
Co-authored-by: Mauro Stettler <[email protected]>
What this PR does
Inject a slab pool into the context used by caching BucketReader implementations that allows cache clients to reuse memory for results.
Signed-off-by: Nick Pillitteri [email protected]
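The context-injection part follows the usual Go context-value pattern. A minimal sketch, with assumed key and helper names (the real ones in the code may differ):

```go
// slabPoolContextKey is an assumed, unexported key type for the context value.
type slabPoolContextKey struct{}

// ContextWithSlabPool returns a context carrying the per-call pool so the
// caching BucketReader layers further down the call chain can find it.
func ContextWithSlabPool(ctx context.Context, p *pool.SafeSlabPool[byte]) context.Context {
	return context.WithValue(ctx, slabPoolContextKey{}, p)
}

// getSlabPoolFromContext returns the injected pool, or nil when the caller
// did not provide one (callers then fall back to plain allocations).
func getSlabPoolFromContext(ctx context.Context) *pool.SafeSlabPool[byte] {
	p, _ := ctx.Value(slabPoolContextKey{}).(*pool.SafeSlabPool[byte])
	return p
}
```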
Which issue(s) this PR fixes or relates to
See #3772
See #3968
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]