-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compact: native histogram support for downsampling #6350
base: main
Are you sure you want to change the base?
compact: native histogram support for downsampling #6350
Conversation
161bfb0
to
669417a
Compare
dc62091
to
f049b1c
Compare
ed2aa80
to
bdf8e80
Compare
bdf8e80
to
fd82b1a
Compare
pkg/compact/downsample/downsample.go
Outdated
} | ||
|
||
if h.total > 0 { | ||
if fh.DetectReset(h.previous) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The increase of schema is treated like a reset:
A decrease of the ZeroThreshold or an increase of the Schema (i.e. an increase of resolution) can only happen together with a reset. Thus, the method returns true in either case.
If we do fh.CopyToSchema(h.schema)
previously then this case is not caught AFAICT. Shouldn't we do the same reset logic in the preceding if
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to check that, but not sure if this will be still relevant. I proposed a change to how we can do downsampling: #6554
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added downsampling support for mixed chunk series and this PR is still open for review since #6554 will be a longer term task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do
fh.CopyToSchema(h.schema)
previously then this case is not caught AFAICT. Shouldn't we do the same reset logic in the precedingif
?
I think you are right and will look into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do
fh.CopyToSchema(h.schema)
previously then this case is not caught AFAICT. Shouldn't we do the same reset logic in the precedingif
?
h.schema
is the precalculated minimal schema from all samples of the series, which we need to know, otherwise we might run in the situation where we can't add a sample to aggregates. For the reset we need to compare current and previous histogram, so I suggest to keep the original histogram and use that for reset detection: d7d2224
From what I understand the head appender sets reset hints properly, so even without the fix it should work.
619d806
to
0415cc4
Compare
3b58955
to
97a8dd6
Compare
55ae8d5
to
c0cfcc6
Compare
Signed-off-by: Sebastian Rabenhorst <[email protected]> Fixed imports Signed-off-by: Sebastian Rabenhorst <[email protected]> Added native histogram support for downsampling Signed-off-by: Sebastian Rabenhorst <[email protected]> Improved comment Signed-off-by: Sebastian Rabenhorst <[email protected]> Removed unnecessary if Signed-off-by: Sebastian Rabenhorst <[email protected]> Fixed native histogram proto conversion in remote engine Signed-off-by: Sebastian Rabenhorst <[email protected]> Added downsampling support for native histograms Reverted files from base PR Added changelog Signed-off-by: Sebastian Rabenhorst <[email protected]> Reverted error in bucket.go Signed-off-by: Sebastian Rabenhorst <[email protected]> trigger tests Signed-off-by: Sebastian Rabenhorst <[email protected]>
Signed-off-by: Sebastian Rabenhorst <[email protected]> Fixed tests, removed unused code Signed-off-by: Sebastian Rabenhorst <[email protected]> Fixed tests, removed unused code Signed-off-by: Sebastian Rabenhorst <[email protected]>
Signed-off-by: Sebastian Rabenhorst <[email protected]>
Signed-off-by: Sebastian Rabenhorst <[email protected]>
Signed-off-by: Sebastian Rabenhorst <[email protected]> Improved mixed type chunks and native histogram downsampling tests Signed-off-by: Sebastian Rabenhorst <[email protected]>
Signed-off-by: Sebastian Rabenhorst <[email protected]>
Signed-off-by: Sebastian Rabenhorst <[email protected]>
c0cfcc6
to
c402aaa
Compare
Signed-off-by: Sebastian Rabenhorst <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We get this feedback consistently that Thanos is hard to understand and we are adding quite new functionality here thus could you please add a paragraph to docs/components/compact.md
about how native histogram downsampling works?
@@ -85,6 +85,7 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re | |||
- [#6163](https://github.com/thanos-io/thanos/pull/6163) Receiver: changed default max backoff from 30s to 5s for forwarding requests. Can be configured with `--receive-forward-max-backoff`. | |||
- [#6327](https://github.com/thanos-io/thanos/pull/6327) *: *breaking :warning:* Use histograms instead of summaries for instrumented handlers. | |||
- [#6322](https://github.com/thanos-io/thanos/pull/6322) Logging: Avoid expensive log.Valuer evaluation for disallowed levels. | |||
- [#6350] https://github.com/thanos-io/thanos/pull/6350 Compact: Add native histogram support for downsampling. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- [#6350] https://github.com/thanos-io/thanos/pull/6350 Compact: Add native histogram support for downsampling. | |
- [#6350](https://github.com/thanos-io/thanos/pull/6350) Compact: Add native histogram support for downsampling. |
@@ -53,6 +53,7 @@ type HeadGenOptions struct { | |||
ScrapeInterval time.Duration | |||
|
|||
WithWAL bool | |||
AppendLabels labels.Labels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used anywhere?
// Native histograms only have aggregates for COUNT, COUNTER and SUM. | ||
aggrsTypes := []storepb.Aggr{storepb.Aggr_COUNT, storepb.Aggr_COUNTER, storepb.Aggr_SUM} | ||
for _, concurrency := range []int{1, 2, 4, 8, 16, 32} { | ||
b.Run(fmt.Sprintf("aggregates: %v, concurrency: %d", aggrs, concurrency), func(b *testing.B) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably meant to use aggrsTypes
here? aggrs
is empty 😄
@@ -179,23 +192,44 @@ func Downsample( | |||
} | |||
} | |||
} | |||
|
|||
} | |||
if i > 0 && previousIsHistogram != isHistogram(ac) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe downsampleAggr could handle this interleaving of different chunk types? For example, we could loop through the given data and perform downsampling individually for different sets of chunk types. It's getting hard to understand this loop. The same could probably be said about DownsampleRaw but there are fewer if
branches there so it's understandable.
} | ||
|
||
if h.sum == nil { | ||
if h.resetDetected { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need to reset this to false
?
} else { | ||
h.sum = fh.Copy() | ||
} | ||
// This needs to be h gauge histogram, otherwise reset detection will be triggered |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This needs to be h gauge histogram, otherwise reset detection will be triggered | |
// This needs to be a gauge histogram, otherwise reset detection will be triggered |
} | ||
// This needs to be h gauge histogram, otherwise reset detection will be triggered | ||
// when appending the aggregated chunk and histogram.count < appender.count. | ||
h.sum.CounterResetHint = histogram.GaugeType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the result that this is always set when adding multiple histograms? Maybe we can move this out of the if?
b.chunks[AggrCount] = chunkenc.NewXORChunk() | ||
|
||
// Sum aggregate always needs to be gauge type, otherwise | ||
// append can fail when sum_window_n > sum_window_n+1 (.i.e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// append can fail when sum_window_n > sum_window_n+1 (.i.e | |
// append can fail when sum_window_n > sum_window_n+1 (i.e. |
switch fh.CounterResetHint { | ||
case histogram.GaugeType: | ||
if app != nil { | ||
var () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var () |
Agreed. I will ad documentation about it. |
Any update on this? Would be cool to merge it. |
This seems to be the last piece missing to fully support native histograms in Thanos. |
You can open up a separate PR using commits from this branch and then add extra fixes on top |
I don't had/have the time continue working on this. As @GiedriusS wrote, feel free to jut copy my branch. If you have question you can put them here or find me in CNCF slack. |
This PR adds native histogram support for downsampling in compact and is part of #5907.
Changes
Verification