added ability to specify head block format version #9841

vlad-diachenko · 2023-07-01T04:08:47Z

What this PR does / why we need it:

Currently, the first byte of head block data identifies the type of the head block(ordered/unordered). However, using this approach, it's not possible to distinguish the version of the data that is stored in this head block.
In the next version of the chunk format (V4), we add additional data and when we deserialize the data, it's necessary to know the version of the data format to use different deserialization strategies.

To make it possible, the first byte of the head block now specifies the version of the head block, not the type(ordered/unordered).

Special notes for your reviewer:

This PR does not enable the new format of the head block format and it will be enabled once chunk format V4 is added in this PR.
VersionedHeadBlockFmtV1 byte = 128 - as long as previously the first byte of the head block was the chunk format version, I decided to use 128 as a baseline for head block format versioning. it allows to avoid the collision with chunk formats v1,v2 if somebody decides to upgrade to the latest version of Loki from some legacy version.

Checklist

Reviewed the CONTRIBUTING.md guide (required)
Documentation added
Tests updated
CHANGELOG.md updated
- If the change is worth mentioning in the release notes, add add-to-release-notes label
Changes that require user attention or interaction to upgrade are documented in docs/sources/upgrading/_index.md
For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR

Signed-off-by: Vladyslav Diachenko <[email protected]>

sandeepsukhani

The changes look good to me. I just left some minor suggestions.

sandeepsukhani · 2023-07-05T06:19:53Z

pkg/chunkenc/memchunk.go

@@ -77,6 +77,9 @@ const (
 	_
 	OrderedHeadBlockFmt
 	UnorderedHeadBlockFmt
+
+	// head block format that stores chunk format version as well as type of head block (ordered/unordered)
+	VersionedHeadBlockFmtV1 byte = 128


I think we should go with 5 here since 128 won't let us increment the version anymore. I do not see any problem with going with 5, so please let me know if I am missing something.

sandeepsukhani · 2023-07-05T07:15:33Z

pkg/chunkenc/memchunk.go

@@ -131,14 +134,21 @@ type block struct {
 // This block holds the un-compressed entries. Once it has enough data, this is
 // emptied into a block with only compressed entries.
 type headBlock struct {
+	chunkDataFormat byte


Instead of using the byte type, what do you think about defining a type called chunkFormat and using that wherever it is required? It makes it easier to find the references.

owen-d

To make it possible, the first byte of the head block now specifies the version of the head block, not the type(ordered/unordered).

Conceptually, these are the same thing. Ordered vs Unordered are just the two versions we support (with clearer names than something like HeadV1 vs HeadV2). Even if we do go this new route you suggest, I don't see a reason to start with VersionedHeadBlockFmtV1 = 128 (we could just use the next incremented number and add a second bytes encoded after that).

Can we go about this in a different way?

I think it's much simpler to add a new head format version (UnorderedWithMetadataFmt) that will (1) accept unordered entries and (2) encode non-indexed metadata. It's not intended that we'll need to mix-and-match different headblock vs chunk formats. Asides the issue of some of this being conceptually impossible (we can't have a head block accept non-indexed-metadata but then write to an older chunk format that doens't support it), it's intended that loki starts writing new versions while being able to read old versions and not the other way around.

In summary, I think it's clearer and simpler to add a new head format that accepts unordered entries with optional non-indexed-metadata: UnorderedWithMetadataFmt. WDYT?

vlad-diachenko · 2023-07-06T11:24:21Z

To make it possible, the first byte of the head block now specifies the version of the head block, not the type(ordered/unordered).

Conceptually, these are the same thing. Ordered vs Unordered are just the two versions we support (with clearer names than something like HeadV1 vs HeadV2). Even if we do go this new route you suggest, I don't see a reason to start with VersionedHeadBlockFmtV1 = 128 (we could just use the next incremented number and add a second bytes encoded after that).

Can we go about this in a different way?

I think it's much simpler to add a new head format version (UnorderedWithMetadataFmt) that will (1) accept unordered entries and (2) encode non-indexed metadata. It's not intended that we'll need to mix-and-match different headblock vs chunk formats. Asides the issue of some of this being conceptually impossible (we can't have a head block accept non-indexed-metadata but then write to an older chunk format that doens't support it), it's intended that loki starts writing new versions while being able to read old versions and not the other way around.

In summary, I think it's clearer and simpler to add a new head format that accepts unordered entries with optional non-indexed-metadata: UnorderedWithMetadataFmt. WDYT?

Hey @owen-d . yes, if we are going to remove/deprecate ordered writes, we can just add a new format of a head like UnorderedWithMetadataFmt. However, currently, each user can disable/enable unordered-writes and Loki decides what format of head block to use: OrderedHeadBlockFmt, UnorderedHeadBlockFmt. So, with the current implementation of Loki, it's necessary to add 2 new formats OrderedWithMetadataFmt and UnorderedWithMetadataFmt because some users can have unordered-writes disabled but still want to use non-indexed metadata.

Also, even with the current implementation, it does not stop us from removing/deprecating ordered writes in the future.

So, I would go with my solution, and just change the value of VersionedHeadBlockFmtV1 to 5, so it would allow us to add new versions without any issues in the future and add as many options to head block as we want...

WDYT @owen-d , @sandeepsukhani , @salvacorts ?

owen-d

Regardless of the route we chose, the main thing we're going to need to implement is the ability to write entries with non-indexed-metadata in both the unordered head block as well as in the blocks. If we choose to do this with the old ordered-only head as well instead of deprecating+removing ordered-only writes, that's fine too :).

The problem with using your (head_version, block_version) tuple is that it forces us to support the matrix of all head_versions and all block_versions in the future. In reality, we may want to add a new head_version or block_version without needing to make it compatible with all formats of the other.

Does that make sense?

I think we should use the explicit variant I've suggested and quickly deprecate the ordered-only ingestion variants as a followup

owen-d · 2023-07-06T14:20:18Z

pkg/chunkenc/unordered.go

@@ -369,10 +376,10 @@ func (hb *unorderedHeadBlock) Serialise(pool WriterPool) ([]byte, error) {
 }

 func (hb *unorderedHeadBlock) Convert(version HeadBlockFmt) (HeadBlock, error) {
-	if version > OrderedHeadBlockFmt {
+	if version == UnorderedHeadBlockFmt {


thanks for changing this, I'm not sure what I was thinking :)

I believe you wanted to make this logic flexible to use unorderedHeadBlock for all the versions after OrderedHeadBlockFmt ;) like 4, 5,6,7,etc...

sandeepsukhani · 2023-07-06T15:23:27Z

However, currently, each user can disable/enable unordered-writes and Loki decides what format of head block to use: OrderedHeadBlockFmt, UnorderedHeadBlockFmt.

We default to enabling unordered writes and it is highly unlikely someone would go out of the way and disable unordered writes. Less code is always better so I think it would be better to keep it simple i.e always create unordered blocks and use chunk format version going forward.

vlad-diachenko · 2023-07-10T15:50:02Z

closed in favor of 529ad9d

added ability to specify head block format version

e1464bf

Signed-off-by: Vladyslav Diachenko <[email protected]>

vlad-diachenko requested a review from a team as a code owner July 1, 2023 04:08

pull-request-size bot added the size/L label Jul 1, 2023

sandeepsukhani reviewed Jul 5, 2023

View reviewed changes

owen-d reviewed Jul 5, 2023

View reviewed changes

owen-d reviewed Jul 6, 2023

View reviewed changes

vlad-diachenko closed this Jul 10, 2023

chaudum deleted the vlad.diachenko/head_block_versioning branch December 20, 2024 10:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

added ability to specify head block format version #9841

added ability to specify head block format version #9841

vlad-diachenko commented Jul 1, 2023 •

edited

Loading

sandeepsukhani left a comment

sandeepsukhani Jul 5, 2023

sandeepsukhani Jul 5, 2023

owen-d left a comment

vlad-diachenko commented Jul 6, 2023 •

edited

Loading

Can we go about this in a different way?

owen-d left a comment •

edited

Loading

owen-d Jul 6, 2023

vlad-diachenko Jul 6, 2023

sandeepsukhani commented Jul 6, 2023

vlad-diachenko commented Jul 10, 2023

added ability to specify head block format version #9841

added ability to specify head block format version #9841

Conversation

vlad-diachenko commented Jul 1, 2023 • edited Loading

What this PR does / why we need it:

Special notes for your reviewer:

Checklist

sandeepsukhani left a comment

Choose a reason for hiding this comment

sandeepsukhani Jul 5, 2023

Choose a reason for hiding this comment

sandeepsukhani Jul 5, 2023

Choose a reason for hiding this comment

owen-d left a comment

Choose a reason for hiding this comment

Can we go about this in a different way?

vlad-diachenko commented Jul 6, 2023 • edited Loading

Can we go about this in a different way?

owen-d left a comment • edited Loading

Choose a reason for hiding this comment

owen-d Jul 6, 2023

Choose a reason for hiding this comment

vlad-diachenko Jul 6, 2023

Choose a reason for hiding this comment

sandeepsukhani commented Jul 6, 2023

vlad-diachenko commented Jul 10, 2023

vlad-diachenko commented Jul 1, 2023 •

edited

Loading

vlad-diachenko commented Jul 6, 2023 •

edited

Loading

owen-d left a comment •

edited

Loading