-
Notifications
You must be signed in to change notification settings - Fork 544
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
block-builder: Add an option to consume a partition completely till cycle end offset with no partially region #10506
Conversation
…ilder Signed-off-by: Ganesh Vernekar <[email protected]>
Signed-off-by: Ganesh Vernekar <[email protected]>
Signed-off-by: Ganesh Vernekar <[email protected]>
bucketDir := path.Join(cfg.BlocksStorage.Bucket.Filesystem.Directory, "1") | ||
db, err := tsdb.Open(bucketDir, promslog.NewNopLogger(), nil, nil, nil) | ||
require.NoError(t, err) | ||
t.Cleanup(func() { require.NoError(t, db.Close()) }) | ||
compareQuery(t, | ||
db, | ||
producedSamples, | ||
nil, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, you had mentioned this, but I didn't quite see the blatant duplication. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM%comment!
// We will process this sample in the next cycle. | ||
allSamplesProcessed = false | ||
continue | ||
} | ||
if recordAlreadyProcessed && s.TimestampMs < lastBlockMax { | ||
if !processEverything && recordAlreadyProcessed && s.TimestampMs < lastBlockMax { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand it the first time we restart blockbuilders with this flag enabled, we will produce a small amount of duplicate records. (records that had recordAlreadyProcessed
from the previous cycle, but we're no longer skipping past those.) I do not see an avenue where we'd skip any records on the first run. Am I right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, migrating from old the new way will create some duplicate samples in blocks which is harmless since compactor will merge them together.
…ycle end offset with no partially region (#10506) (#10514) * Add an option to not leave any partially consumed region for block builder Signed-off-by: Ganesh Vernekar <[email protected]> * Add TestNoPartiallyConsumedRegions Signed-off-by: Ganesh Vernekar <[email protected]> * Fix imports Signed-off-by: Ganesh Vernekar <[email protected]> --------- Signed-off-by: Ganesh Vernekar <[email protected]> (cherry picked from commit 5ff8293) Co-authored-by: Ganesh Vernekar <[email protected]>
What this PR does
Currently we have "last seen" and "commit" points, and the records between can be partially consumed. This PR adds an experimental
block-builder.no-partially-consumed-region
where we do not have this partially consumed region, instead entire records until cycle end time goes inside a block.When flag is enabled, we also reduce the buffer time from 15 mins to 5 mins. Which means, the consumption will technically be from
X:05
to(X+1):05
. The extra 5 mins is to allow for any remote write delay. Hard coded for now for experiments.This flag is an intermediate state, meaning after running tests with this option, we will only keep one of them and get rid of the flag. The wish is to keep this new version since it makes scheduler design and parallelisation easier. If this one wins, we can also get rid of "last seen offset" from the commit information.
Checklist
[ ] Documentation added.[ ]CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]
.about-versioning.md
updated with experimental features.