Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

block-builder: Add an option to consume a partition completely till cycle end offset with no partially region #10506

Merged
merged 3 commits into from
Jan 24, 2025

Conversation

codesome
Copy link
Member

@codesome codesome commented Jan 23, 2025

What this PR does

Currently we have "last seen" and "commit" points, and the records between can be partially consumed. This PR adds an experimental block-builder.no-partially-consumed-region where we do not have this partially consumed region, instead entire records until cycle end time goes inside a block.

When flag is enabled, we also reduce the buffer time from 15 mins to 5 mins. Which means, the consumption will technically be from X:05 to (X+1):05. The extra 5 mins is to allow for any remote write delay. Hard coded for now for experiments.

This flag is an intermediate state, meaning after running tests with this option, we will only keep one of them and get rid of the flag. The wish is to keep this new version since it makes scheduler design and parallelisation easier. If this one wins, we can also get rid of "last seen offset" from the commit information.

Checklist

  • Tests updated.
  • [ ] Documentation added.
  • [ ] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@codesome codesome requested a review from seizethedave January 23, 2025 21:16
@codesome codesome requested a review from a team as a code owner January 23, 2025 21:16
@codesome codesome changed the title block-builder: Add an option to consume a partition completely till cycle end offset with no partial region block-builder: Add an option to consume a partition completely till cycle end offset with no partially region Jan 23, 2025
Signed-off-by: Ganesh Vernekar <[email protected]>
Comment on lines -892 to -899
bucketDir := path.Join(cfg.BlocksStorage.Bucket.Filesystem.Directory, "1")
db, err := tsdb.Open(bucketDir, promslog.NewNopLogger(), nil, nil, nil)
require.NoError(t, err)
t.Cleanup(func() { require.NoError(t, db.Close()) })
compareQuery(t,
db,
producedSamples,
nil,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, you had mentioned this, but I didn't quite see the blatant duplication. 👍

Copy link
Contributor

@seizethedave seizethedave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM%comment!

// We will process this sample in the next cycle.
allSamplesProcessed = false
continue
}
if recordAlreadyProcessed && s.TimestampMs < lastBlockMax {
if !processEverything && recordAlreadyProcessed && s.TimestampMs < lastBlockMax {
Copy link
Contributor

@seizethedave seizethedave Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand it the first time we restart blockbuilders with this flag enabled, we will produce a small amount of duplicate records. (records that had recordAlreadyProcessed from the previous cycle, but we're no longer skipping past those.) I do not see an avenue where we'd skip any records on the first run. Am I right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, migrating from old the new way will create some duplicate samples in blocks which is harmless since compactor will merge them together.

@codesome codesome merged commit 5ff8293 into main Jan 24, 2025
28 checks passed
@codesome codesome deleted the codesome/last-not-seen branch January 24, 2025 17:35
codesome added a commit that referenced this pull request Jan 24, 2025
…ycle end offset with no partially region (#10506) (#10514)

* Add an option to not leave any partially consumed region for block builder

Signed-off-by: Ganesh Vernekar <[email protected]>

* Add TestNoPartiallyConsumedRegions

Signed-off-by: Ganesh Vernekar <[email protected]>

* Fix imports

Signed-off-by: Ganesh Vernekar <[email protected]>

---------

Signed-off-by: Ganesh Vernekar <[email protected]>
(cherry picked from commit 5ff8293)

Co-authored-by: Ganesh Vernekar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants