-
Notifications
You must be signed in to change notification settings - Fork 544
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow dynamic replication set of TSDB blocks on store-gateways #10382
base: main
Are you sure you want to change the base?
Conversation
💻 Deploy preview available: https://deploy-preview-mimir-10382-zb444pucvq-vp.a.run.app/docs/mimir/latest/ |
872d7ca
to
a3b09a6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what would happen when a block just crosses the 25h boundary. What if store-gateways drop it before the querier realizes that the block is that old (or the querier made the decision for that 1m ago for example)? You pointed out we retry a fixed 3 times. It's not impossible we retry the block on other store-gateways that have dropped it.
Maybe we should fix the retries before merging this PR. Charles suggested retrying as much as the replication factor (recent blocks we retry "NumInstances" times and others only 3 times)
Good point. We could increase the number of retries but I'd rather solve this with some sort of "buffer time" like we do with the upload grace period. Maybe queriers could stop trying to query a block at |
9a2b10e
to
655630e
Compare
func (e *MaxTimeExpandedReplication) EligibleForQuerying(b QueryableReplicatedBlock) bool { | ||
now := e.now() | ||
uploadedDelta := now.Sub(b.GetUploadedAt()) | ||
return uploadedDelta > e.gracePeriod && e.isMaxTimeInWindow(b, now) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This grace period is for the time the block was uploaded but I think we also need one for the max time? If a block was uploaded hours ago, we still want it to be synced by the store-gateways before the queriers start expecting store-gateways to have it stop being queried by queriers before store-gateways unload it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed via 450dda1
// To be eligible for querying a block must: | ||
// * Have been uploaded more than `gracePeriod` ago since we need to allow store-gateways | ||
// to sync recently uploaded blocks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to check the upload grace period here? Won't this be covered by the logic in BlocksConsistency
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right that removing this wouldn't cause issues with queries failing or correctness. However, if we expand the replication set without the grace period we'd end up querying more store-gateways than we would otherwise that may not have the block. So by taking the grace period into account we avoid a little extra work for recently uploaded blocks. I can remove this in the interest of keeping the code simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, if we expand the replication set without the grace period we'd end up querying more store-gateways than we would otherwise that may not have the block
after the first iteration of the consistency retries, the consistency tracker would exclude the recently uploaded blocks from missingBlocks
so we won't be querying them anymore because the constructor of BlocksConsistency
excluded them. But we'd still end up querying them once. Maybe what we should do is reassign knownBlocks
after we've created the consistencyTracker
.
That looks somewhat tangential to this though.
3d3ff7c
to
d96f5b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo suggestion about how we handle the grace period
305ff5e
to
48d3d73
Compare
9d725ea
to
5aa26f6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one tiny comment on naming, but otherwise, this looks good 🚀
// To be eligible for querying a block must: | ||
// * Have been uploaded more than `gracePeriod` ago since we need to allow store-gateways | ||
// to sync recently uploaded blocks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, if we expand the replication set without the grace period we'd end up querying more store-gateways than we would otherwise that may not have the block
after the first iteration of the consistency retries, the consistency tracker would exclude the recently uploaded blocks from missingBlocks
so we won't be querying them anymore because the constructor of BlocksConsistency
excluded them. But we'd still end up querying them once. Maybe what we should do is reassign knownBlocks
after we've created the consistencyTracker
.
That looks somewhat tangential to this though.
744614e
to
60a61ce
Compare
This change adds a tenant-specific setting to allow a subset of TSDB blocks to be replicated to more than the configured replication factor (three by default) on store-gateways.
Much like how block uploads have a grace period, add a grace period when querying with expanded replication to make sure store-gateways have enough time to load blocks before they are queried. Signed-off-by: Nick Pillitteri <[email protected]>
Signed-off-by: Nick Pillitteri <[email protected]>
Signed-off-by: Nick Pillitteri <[email protected]>
Upload time is checked (and blocks within a grace period are ignored) by the query consistency check. Signed-off-by: Nick Pillitteri <[email protected]>
Apply grace period to sync instead of querying Signed-off-by: Nick Pillitteri <[email protected]>
Signed-off-by: Nick Pillitteri <[email protected]>
Signed-off-by: Nick Pillitteri <[email protected]>
Signed-off-by: Nick Pillitteri <[email protected]>
Signed-off-by: Nick Pillitteri <[email protected]>
Pulls in grafana/dskit#632 Signed-off-by: Nick Pillitteri <[email protected]>
Signed-off-by: Nick Pillitteri <[email protected]>
6a94bff
to
32f88af
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
What this PR does
This change adds a setting to allow a subset of TSDB blocks to be replicated to more than the configured replication factor (three by default) on store-gateways.
The
DynamicReplication
interface inpkg/storegateway/dynamic_replication.go
is the core of this change. This new bit of logic determines if a block is eligible to be replicated over more than the default number of store-gateways. It is shared between store-gateways (which must do the replication) and queriers (which need to know which store-gateways to query). It defaults to only doing this for blocks that have samples within the last 25h (configurable). Most of the rest of the changes are using block metadata in places where we previously only used block IDs (ulid.ULID
).Assumptions made in this change:
Which issue(s) this PR fixes or relates to
Fixes #9944
Related grafana/dskit#632
Notes for reviewers
The first commit is the main change to concentrate on which introduces an interface
DynamicReplication
and implementations which are shared between queriers and store-gateways.The second commit is changes to the
mimir-microservices-mode
docker-compose setup to make testing store-gateway changes easier.Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]
.about-versioning.md
updated with experimental features.