added ability to specify head block format version #9841
Signed-off-by: Vladyslav Diachenko <[email protected]>
The changes look good to me. I just left some minor suggestions.
@@ -77,6 +77,9 @@ const (
 	_
 	OrderedHeadBlockFmt
 	UnorderedHeadBlockFmt
+
+	// head block format that stores chunk format version as well as type of head block (ordered/unordered)
+	VersionedHeadBlockFmtV1 byte = 128
I think we should go with 5 here since 128 won't let us increment the version anymore. I do not see any problem with going with 5, so please let me know if I am missing something.
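As a rough illustration of why 5 would be the next value, here is a minimal sketch of the constant block. It assumes the block counts up with `iota` from placeholder entries so that `OrderedHeadBlockFmt` is 3 and `UnorderedHeadBlockFmt` is 4; the placeholders and the underlying `byte` type of `HeadBlockFmt` are assumptions, not taken from the diff.

```go
package main

import "fmt"

// HeadBlockFmt identifies a head block format (underlying type assumed).
type HeadBlockFmt byte

const (
	_ HeadBlockFmt = iota // 0: placeholder (assumed)
	_                     // 1: placeholder (assumed)
	_                     // 2: placeholder (assumed)
	OrderedHeadBlockFmt   // 3
	UnorderedHeadBlockFmt // 4

	// Appending to the same block simply takes the next free value.
	VersionedHeadBlockFmtV1 // 5
)

func main() {
	fmt.Println(OrderedHeadBlockFmt, UnorderedHeadBlockFmt, VersionedHeadBlockFmtV1)
}
```

With this layout every later format keeps incrementing from 5, whereas an explicit `128` would start a separate numbering range.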
@@ -131,14 +134,21 @@ type block struct {
 // This block holds the un-compressed entries. Once it has enough data, this is
 // emptied into a block with only compressed entries.
 type headBlock struct {
+	chunkDataFormat byte
Instead of using the byte type, what do you think about defining a type called chunkFormat and using that wherever it is required? It makes it easier to find the references.
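A minimal sketch of what that suggestion could look like. The type name `chunkFormat` and the field name `chunkDataFormat` come from the review and the diff; the constant names, their values, and the `String` method are illustrative assumptions.

```go
package main

import "fmt"

// chunkFormat is the named type proposed in the review. The constants
// below are hypothetical and exist only for illustration.
type chunkFormat byte

const (
	chunkFormatV3 chunkFormat = 3
	chunkFormatV4 chunkFormat = 4
)

// String gives readable output in logs and test failures.
func (f chunkFormat) String() string {
	return fmt.Sprintf("chunk format v%d", byte(f))
}

// headBlock mirrors the struct in the diff, with the field retyped.
type headBlock struct {
	chunkDataFormat chunkFormat
}

func main() {
	hb := headBlock{chunkDataFormat: chunkFormatV4}
	fmt.Println(hb.chunkDataFormat)
}
```

Besides making references easy to find, a named type stops an arbitrary `byte` from being passed where a chunk format is expected without an explicit conversion.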
To make it possible, the first byte of the head block now specifies the version of the head block, not the type(ordered/unordered).
Conceptually, these are the same thing. Ordered vs Unordered are just the two versions we support (with clearer names than something like HeadV1 vs HeadV2). Even if we do go this new route you suggest, I don't see a reason to start with VersionedHeadBlockFmtV1 = 128 (we could just use the next incremented number and add a second byte encoded after that).
Can we go about this in a different way?
I think it's much simpler to add a new head format version (UnorderedWithMetadataFmt) that will (1) accept unordered entries and (2) encode non-indexed metadata. It's not intended that we'll need to mix and match different head block vs chunk formats. Aside from some of this being conceptually impossible (we can't have a head block accept non-indexed metadata but then write to an older chunk format that doesn't support it), it's intended that Loki starts writing new versions while being able to read old versions, and not the other way around.
In summary, I think it's clearer and simpler to add a new head format that accepts unordered entries with optional non-indexed metadata: UnorderedWithMetadataFmt. WDYT?
Hey @owen-d. Yes, if we are going to remove/deprecate ordered writes, we can just add a new format of a head like Also, even with the current implementation, it does not stop us from removing/deprecating ordered writes in the future. So, I would go with my solution, and just change the value of WDYT @owen-d, @sandeepsukhani, @salvacorts?
Regardless of the route we choose, the main thing we're going to need to implement is the ability to write entries with non-indexed metadata in both the unordered head block and the blocks. If we choose to do this with the old ordered-only head as well, instead of deprecating and removing ordered-only writes, that's fine too :).
The problem with using your (head_version, block_version) tuple is that it forces us to support the matrix of all head_versions and all block_versions in the future. In reality, we may want to add a new head_version or block_version without needing to make it compatible with all formats of the other.
Does that make sense?
I think we should use the explicit variant I've suggested and quickly deprecate the ordered-only ingestion variants as a follow-up.
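The single-enum alternative can be sketched roughly as follows: each head format implies its own capabilities, so there is no (head_version, block_version) matrix to support. `UnorderedWithMetadataFmt` is the name proposed in the review; the numeric values and the two method names are hypothetical.

```go
package main

import "fmt"

// HeadBlockFmt identifies a head block format (underlying type assumed).
type HeadBlockFmt byte

const (
	OrderedHeadBlockFmt      HeadBlockFmt = iota + 3 // 3 (assumed value)
	UnorderedHeadBlockFmt                            // 4
	UnorderedWithMetadataFmt                         // 5, name from the review
)

// AcceptsUnordered reports whether entries may arrive out of order.
func (f HeadBlockFmt) AcceptsUnordered() bool {
	return f == UnorderedHeadBlockFmt || f == UnorderedWithMetadataFmt
}

// EncodesMetadata reports whether non-indexed metadata is written.
func (f HeadBlockFmt) EncodesMetadata() bool {
	return f == UnorderedWithMetadataFmt
}

func main() {
	formats := []HeadBlockFmt{OrderedHeadBlockFmt, UnorderedHeadBlockFmt, UnorderedWithMetadataFmt}
	for _, f := range formats {
		fmt.Printf("fmt=%d unordered=%t metadata=%t\n", f, f.AcceptsUnordered(), f.EncodesMetadata())
	}
}
```

Adding a format here means adding one constant and updating the capability checks, rather than validating every combination of two independent version numbers.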
@@ -369,10 +376,10 @@ func (hb *unorderedHeadBlock) Serialise(pool WriterPool) ([]byte, error) {
 }

 func (hb *unorderedHeadBlock) Convert(version HeadBlockFmt) (HeadBlock, error) {
-	if version > OrderedHeadBlockFmt {
+	if version == UnorderedHeadBlockFmt {
thanks for changing this, I'm not sure what I was thinking :)
I believe you wanted to make this logic flexible, so that unorderedHeadBlock is used for all the versions after OrderedHeadBlockFmt ;) such as 4, 5, 6, 7, etc.
We default to enabling unordered writes, and it is highly unlikely that someone would go out of their way to disable them. Less code is always better, so I think it would be better to keep it simple, i.e. always create unordered blocks and use the chunk format version going forward.
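The difference between the two `Convert` conditions discussed above only shows up for formats added later. A small sketch, assuming `OrderedHeadBlockFmt` is 3 and `UnorderedHeadBlockFmt` is 4; the helper names and the format value 5 are hypothetical.

```go
package main

import "fmt"

// HeadBlockFmt identifies a head block format (underlying type assumed).
type HeadBlockFmt byte

const (
	OrderedHeadBlockFmt   HeadBlockFmt = 3 // assumed value
	UnorderedHeadBlockFmt HeadBlockFmt = 4 // assumed value
)

// keepsUnorderedGT is the earlier check: every format after the ordered
// one is treated as already unordered.
func keepsUnorderedGT(v HeadBlockFmt) bool { return v > OrderedHeadBlockFmt }

// keepsUnorderedEQ is the revised check: only the exact format matches.
func keepsUnorderedEQ(v HeadBlockFmt) bool { return v == UnorderedHeadBlockFmt }

func main() {
	future := HeadBlockFmt(5) // hypothetical later format
	fmt.Println(keepsUnorderedGT(future), keepsUnorderedEQ(future)) // prints "true false"
}
```

Both checks agree for formats 3 and 4; they diverge only once a format above 4 exists.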
closed in favor of 529ad9d
What this PR does / why we need it:
Currently, the first byte of the head block data identifies the type of the head block (ordered/unordered). With this approach, however, it is not possible to tell which version of the data is stored in the head block.
In the next version of the chunk format (V4), we add additional data, and when we deserialize that data we need to know the version of the data format in order to pick the right deserialization strategy.
To make this possible, the first byte of the head block now specifies the version of the head block, not the type (ordered/unordered).
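A minimal sketch of dispatching on the first byte when reading a head block. The value 128 for the versioned format comes from the diff; the values 3 and 4 for the legacy formats, the function name `headKind`, and the returned labels are assumptions made for illustration and do not reflect the actual Loki decoder.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	orderedHeadBlockFmt     byte = 3   // assumed value
	unorderedHeadBlockFmt   byte = 4   // assumed value
	versionedHeadBlockFmtV1 byte = 128 // value from the diff
)

// headKind inspects the first byte and reports which decoding strategy
// the rest of the payload would need.
func headKind(data []byte) (string, error) {
	if len(data) == 0 {
		return "", errors.New("empty head block")
	}
	switch data[0] {
	case orderedHeadBlockFmt:
		return "ordered", nil
	case unorderedHeadBlockFmt:
		return "unordered", nil
	case versionedHeadBlockFmtV1:
		return "versioned-v1", nil
	default:
		return "", fmt.Errorf("unknown head block format %d", data[0])
	}
}

func main() {
	kind, err := headKind([]byte{128, 4})
	fmt.Println(kind, err)
}
```

Unknown first bytes are rejected rather than guessed at, so data written by a newer format fails loudly on an older reader.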
Special notes for your reviewer:
VersionedHeadBlockFmtV1 byte = 128 - since the first byte of the head block was previously the chunk format version, I decided to use 128 as the baseline for head block format versioning. This avoids a collision with chunk formats v1, v2 if somebody upgrades to the latest version of Loki from a legacy version.
Checklist
- CONTRIBUTING.md guide (required)
- CHANGELOG.md updated
- add-to-release-notes label
- docs/sources/upgrading/_index.md
- production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR