Bucket-rewrite: the new and old blocks are combined by the compactor to create another block #4550
Comments
Yes, I have seen the same problem before. For the solution you proposed, do you want to add the original sources to the new block as well? If so, how would you deal with duplicated compaction sources? https://github.com/thanos-io/thanos/blob/main/pkg/compact/compact.go#L733 I might be wrong though.
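For illustration, here is a rough sketch of how the merged source list could be deduplicated. Plain string IDs stand in for block ULIDs, and `mergeSources` is a made-up helper, not anything in the Thanos codebase:

```go
package main

import "fmt"

// mergeSources combines the rewritten block's own ID with the sources it
// originally carried, dropping duplicates so a duplicate-source check would
// not trip. Plain strings stand in for block ULIDs.
func mergeSources(rewritten string, originalSources []string) []string {
	seen := map[string]struct{}{}
	merged := make([]string, 0, len(originalSources)+1)
	for _, s := range append([]string{rewritten}, originalSources...) {
		if _, ok := seen[s]; ok {
			continue // already recorded, skip the duplicate
		}
		seen[s] = struct{}{}
		merged = append(merged, s)
	}
	return merged
}

func main() {
	// The rewritten block "old" plus its original sources, with "old" repeated.
	fmt.Println(mergeSources("old", []string{"src-1", "src-2", "old"}))
	// Output: [old src-1 src-2]
}
```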
What about adding a new block metadata filter that filters out the original block? Since the original block's ID is present in the new block's metadata, we can remove the original block to avoid compacting it. Another solution might be to add both a deletion marker and a no-compact marker to the original block; this seems like the easiest way. WDYT?
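A rough sketch of what such a filter could look like, using simplified stand-in types rather than the actual Thanos metadata filter interface; the `RewrittenFrom` field is hypothetical and only illustrates the rewritten block recording its original block ID:

```go
package main

import "fmt"

// blockMeta is a simplified stand-in for a block's meta.json; RewrittenFrom is
// a hypothetical field recording which block(s) this block was rewritten from.
type blockMeta struct {
	ULID          string
	RewrittenFrom []string
}

// dropRewrittenOriginals removes every block that some other block claims to
// be a rewrite of, so the compactor never sees an original next to its rewrite.
func dropRewrittenOriginals(metas map[string]*blockMeta) {
	toDrop := map[string]struct{}{}
	for _, m := range metas {
		for _, orig := range m.RewrittenFrom {
			toDrop[orig] = struct{}{}
		}
	}
	for id := range toDrop {
		delete(metas, id)
	}
}

func main() {
	metas := map[string]*blockMeta{
		"old":   {ULID: "old"},
		"new":   {ULID: "new", RewrittenFrom: []string{"old"}},
		"other": {ULID: "other"},
	}
	dropRewrittenOriginals(metas)
	for id := range metas {
		fmt.Println(id) // "old" is gone; "new" and "other" remain
	}
}
```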
I was experimenting with setting the sources of the new block to the original block ID plus the original block's sources. I also added the original block to the list of parent blocks (not sure if this makes a difference). My thought process was to mirror how compaction creates the meta.json file: after it creates the newly compacted block, it doesn't combine the new block with the smaller source blocks, because the smaller blocks are part of the new block's sources. I don't know if this is the best approach to solving this problem, and it could confuse someone looking at the new meta.json file. So if the original block is
One case that could potentially arise is that if a new block was recently created through regular compaction (let's call it
I think this makes sense. If the previously described scenario is also valid, then we would have to filter out all of the original source blocks as well, which are currently present in the bucket rewrite field inside the meta.json file.
Hello 👋 Looks like there was no activity on this issue for the last two months.
Closing for now as promised, let us know if you need this to be reopened! 🤗
Thanos, Prometheus and Golang version used:
What happened:
When running the bucket rewrite tool, it creates a new block, and the old one is marked for deletion.
The old block is not deleted right away, as it waits for the deletion delay to pass. However, while it is still marked for deletion, the compactor can still use it for compaction because it ignores the deletion marker for a period of time as explained here:
https://github.com/thanos-io/thanos/blob/aa148f8fdb28/cmd/thanos/compact.go#L224-L227.
When the compactor runs and picks up the new and the deleted block, it sees that there are two blocks with overlapping time intervals and proceeds to combine them. Once combined, it creates a new block that has the old, deleted data in it. The block created by the bucket rewrite tool is then marked for deletion.
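For context, here is a minimal sketch of the ignore-deletion-marks window described in the linked comment, which uses half of `--delete-delay`. The constant and function below are simplified stand-ins, not the actual Thanos filter:

```go
package main

import (
	"fmt"
	"time"
)

// deleteDelay mirrors the compactor's --delete-delay setting; 48h is just an example.
const deleteDelay = 48 * time.Hour

// stillVisibleToCompactor reports whether a block whose deletion mark was
// written at markedAt would still be picked up: per the linked comment, marks
// younger than half the delete delay are ignored, so the block stays visible.
func stillVisibleToCompactor(markedAt, now time.Time) bool {
	return now.Sub(markedAt) < deleteDelay/2
}

func main() {
	now := time.Now()
	fmt.Println(stillVisibleToCompactor(now.Add(-1*time.Hour), now))  // true: fresh mark, block can still be compacted
	fmt.Println(stillVisibleToCompactor(now.Add(-30*time.Hour), now)) // false: mark old enough to be honored
}
```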
What you expected to happen:
The new and old blocks should not be merged together to create another block.
I think this occurs because the source blocks inside the `meta.json` file of each block are different. The original block lists all of its source blocks, while the new block has only itself as its source. I think a possible fix is to add source blocks to the `meta.json` of the new block: they would need to be the source blocks of the rewritten block plus the rewritten block itself (see the sketch below). This way the compactor will handle the two blocks similarly to how it avoids combining compacted blocks with their source blocks. There could be other ways to handle this as well. There was some discussion in Cortex about not compacting any block marked for deletion at all, which I believe would also fix this issue. Seen here: cortexproject/cortex#4328.
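A minimal sketch of what the rewrite tool could write into the new block's metadata under this proposal; the struct below is a simplified stand-in for the compaction section of a `meta.json`, and the field and function names are illustrative only:

```go
package main

import "fmt"

// meta is a simplified stand-in for the compaction section of a meta.json;
// field names are illustrative rather than the exact Thanos schema.
type meta struct {
	ULID    string
	Sources []string
	Parents []string
}

// rewriteMeta builds the metadata the rewrite tool could attach to the new
// block: the original block's sources plus the original block itself, so the
// planner would treat old and new the same way it treats a compacted block and
// its source blocks, and never merge them again.
func rewriteMeta(newID string, original meta) meta {
	sources := append([]string{}, original.Sources...)
	sources = append(sources, original.ULID)
	return meta{
		ULID:    newID,
		Sources: sources,
		Parents: []string{original.ULID},
	}
}

func main() {
	original := meta{ULID: "01OLD", Sources: []string{"01SRC1", "01SRC2"}}
	fmt.Printf("%+v\n", rewriteMeta("01NEW", original))
	// Output: {ULID:01NEW Sources:[01SRC1 01SRC2 01OLD] Parents:[01OLD]}
}
```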
How to reproduce it (as minimally and precisely as possible):
Full logs to relevant components:
Anything else we need to know:
I am running the bucket rewrite logic from inside of Cortex. I suspect the issue is the same as it would be in Thanos, because the compaction is run by the Thanos code that calculates the overlapping blocks.