[PLAT-102919] auto delete overlapped blocks + support filter based on compaction level #16
Signed-off-by: Yi Jin <[email protected]>
pkg/compact/compact.go
@@ -418,6 +425,7 @@ func NewGroup(
groupGarbageCollectedBlocks prometheus.Counter,
blocksMarkedForDeletion prometheus.Counter,
blocksMarkedForNoCompact prometheus.Counter,
blocksOverlapped prometheus.Counter,
It looks odd that this counter name needs to be repeated 5 times in different places.
pkg/compact/compact.go
level.Warn(cg.logger).Log("msg", "found overlapping block in plan that are not the first",
"first", kept.String(), "block", m.String())
kept = m
} else if m.MinTime < kept.MinTime || m.MaxTime > kept.MaxTime {
Two disjoint blocks (e.g., [0, 1) and [2, 3)) will trigger this branch as well. Should it be:
if kept contains m {
} else if m contains kept {
} else {
halt;
}
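The three-way check suggested above can be sketched in Go as follows. This is only an illustration, assuming half-open [MinTime, MaxTime) ranges as in the example; the span type, the relation names, and the classify and originalCondition helpers are hypothetical and are not part of pkg/compact. Which relations should halt is left to the caller.

```go
package main

import "fmt"

// span is a hypothetical stand-in for a block's time range,
// treated as half-open: [MinTime, MaxTime).
type span struct {
	MinTime, MaxTime int64
}

// relation describes how a candidate block m relates to the kept block.
type relation string

const (
	keptContainsM relation = "kept contains m"
	mContainsKept relation = "m contains kept"
	disjoint      relation = "disjoint"
	partial       relation = "partial overlap"
)

// classify checks containment in both directions first, then separates
// disjoint ranges from partial overlaps.
func classify(kept, m span) relation {
	switch {
	case kept.MinTime <= m.MinTime && m.MaxTime <= kept.MaxTime:
		return keptContainsM
	case m.MinTime <= kept.MinTime && kept.MaxTime <= m.MaxTime:
		return mContainsKept
	case m.MaxTime <= kept.MinTime || kept.MaxTime <= m.MinTime:
		return disjoint
	default:
		return partial
	}
}

// originalCondition mirrors the else-if condition quoted from the diff.
func originalCondition(kept, m span) bool {
	return m.MinTime < kept.MinTime || m.MaxTime > kept.MaxTime
}

func main() {
	kept, m := span{0, 1}, span{2, 3}
	fmt.Println(classify(kept, m))               // disjoint
	fmt.Println(originalCondition(kept, m))      // true
	fmt.Println(classify(span{0, 10}, span{2, 3})) // kept contains m
	fmt.Println(classify(span{0, 5}, span{3, 8}))  // partial overlap
}
```

For [0, 1) and [2, 3), the quoted condition evaluates to true even though the ranges share no time, while classify reports them as disjoint.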
Actually, this triggered a halt due to partially overlapping blocks:
ts=2024-03-12T05:59:15.143363282Z caller=compact.go:533 level=error name=pantheon-compactor
msg="critical error detected; halting" err="compaction: group 0@9802502949926001568: found partially
overlapped block:
01HQHWVPCPH519JQN3RC2HRM6D (min time: 1708747200000, max time: 1708819200000) vs
01HQF7C1GKTJZSW7B0BWTRS6K0 (min time: 1708732800000, max time: 1708761600000)"
It comes from this group:
01HQF7C1GKTJZSW7B0BWTRS6K0 (min time: 1708732800000, max time: 1708761600000)
01HQD12MFPJ9EYMQGXRFWFHNE3 (min time: 1708740000000, max time: 1708761600000)
01HQHWVPCPH519JQN3RC2HRM6D (min time: 1708747200000, max time: 1708819200000)
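To see how the blocks in this group relate to each other, here is a small Go sketch that compares every pair. The ULIDs and timestamps are copied from the group above; the blockMeta type and the contains, overlaps, and describe helpers are hypothetical and only assume half-open [MinTime, MaxTime) ranges.

```go
package main

import (
	"fmt"
	"sort"
)

// blockMeta is a hypothetical, trimmed-down view of a block's metadata.
type blockMeta struct {
	ULID             string
	MinTime, MaxTime int64
}

// contains reports whether a's time range fully covers b's.
func contains(a, b blockMeta) bool {
	return a.MinTime <= b.MinTime && b.MaxTime <= a.MaxTime
}

// overlaps reports whether two half-open ranges share any time.
func overlaps(a, b blockMeta) bool {
	return a.MinTime < b.MaxTime && b.MinTime < a.MaxTime
}

// describe names the relation between two blocks.
func describe(a, b blockMeta) string {
	switch {
	case contains(a, b):
		return a.ULID + " contains " + b.ULID
	case contains(b, a):
		return b.ULID + " contains " + a.ULID
	case overlaps(a, b):
		return a.ULID + " partially overlaps " + b.ULID
	default:
		return a.ULID + " is disjoint from " + b.ULID
	}
}

func main() {
	group := []blockMeta{
		{"01HQHWVPCPH519JQN3RC2HRM6D", 1708747200000, 1708819200000},
		{"01HQF7C1GKTJZSW7B0BWTRS6K0", 1708732800000, 1708761600000},
		{"01HQD12MFPJ9EYMQGXRFWFHNE3", 1708740000000, 1708761600000},
	}
	// Order by MinTime so earlier blocks are compared against later ones.
	sort.Slice(group, func(i, j int) bool { return group[i].MinTime < group[j].MinTime })
	for i := range group {
		for j := i + 1; j < len(group); j++ {
			fmt.Println(describe(group[i], group[j]))
		}
	}
}
```

With these values, 01HQF7C1GKTJZSW7B0BWTRS6K0 contains 01HQD12MFPJ9EYMQGXRFWFHNE3, and both of them partially overlap 01HQHWVPCPH519JQN3RC2HRM6D, which matches the pair named in the halt message.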
I meant that the condition if m.MinTime < kept.MinTime || m.MaxTime > kept.MaxTime
includes the disjoint case in addition to the overlapping case.
pkg/compact/compact.go
cg.blocksOverlapped.Inc()
if err := os.RemoveAll(filepath.Join(dir, m.ULID.String())); err != nil {
return errors.Wrapf(err, "remove old block dir %s", m.String())
}
if blockDeletableChecker.CanDelete(cg, m.ULID) {
level.Warn(cg.logger).Log("msg", "deleting overlapping block", "block", m.String(),
"level", m.Compaction.Level, "source", m.Thanos.Source, "labels", m.Thanos.Labels)
return block.Delete(ctx, cg.logger, cg.bkt, m.ULID)
}
This is not future-proof. If upstream adds another condition for block deletion (one not included in blockDeletableChecker), this deletion logic would break. We need to be careful when pulling in upstream changes.
Makes sense. I am thinking of removing the block checker to force deletion here.
Currently Thanos halts if it detects overlapping blocks. We had an incorrect setup that caused blocks to overlap with each other, and we need to clean that up automatically. This might also address thanos-io/thanos issue #5755, "allow overlapping source block for the compaction plan".
See more in investigation doc: https://docs.google.com/document/d/1VGQUoXI8k1QntMPxiEL4z_Fj9lx5STrLyzof3GqZ298/edit
Adding support for compaction level using __block_level in relabel configs.
Changes
Verification