Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rowcontainer: address an old TODO #92859

Merged
merged 1 commit into from
Dec 2, 2022

Conversation

yuzefovich
Copy link
Member

This commit addresses an old TODO about figuring out why we cannot reuse the same buffer in diskRowIterator.Row method. The contract of that method was previously confusing - it says that the call to Row invalidates the row returned on the previous call; however, the important piece was missing - that the datums themselves cannot be mutated (this is what we assume elsewhere and perform only the "shallow" copies). This commit clarifies the contract of the method and explicitly explains why we need to allocate fresh byte slices (which is done via ByteAllocator to reduce the number of allocations).

Additional context can be found in #43145 which added this copy in the first place. Here is a relevant quote from Alfonso:

I think what's going on here is that this type of contract (you may only
reuse the row until the next call) is a bit unclear. `CopyRow`s of
`EncDatum`s are only implemented as shallow copies, so the reference to
this reuse only applies to the `EncDatumRow`, but not to the `encoded`
field, which is what is rewritten in this case. The `encoded` field will
not be copied, so I think the tests are probably failing due to that.
This is unfortunate and there doesn't seem to be a good reason for it.
To implement deep copies, we will have to implement deep copies for
`Datum`s as well.

I think we were under the impression that we'd implement the "deep copy" in CopyRow, but I highly doubt we'll do so given that we're mostly investing in the columnar infrastructure nowadays, and the current way of performing shallow copying has worked well long enough.

Epic: None

Release note: None

@yuzefovich yuzefovich requested review from DrewKimball, cucaroach and a team December 1, 2022 21:31
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@cucaroach cucaroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:LGTM:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)

Copy link
Contributor

@cucaroach cucaroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/rowcontainer/disk_row_container.go line 468 at r1 (raw file):

	k := r.UnsafeKey()
	v := r.UnsafeValue()
	// keyValToRow will use the encoded key and value bytes as is by showing

[nit] showing->shoving

This commit addresses an old TODO about figuring out why we cannot reuse
the same buffer in `diskRowIterator.Row` method. The contract of that
method was previously confusing - it says that the call to `Row`
invalidates the row returned on the previous call; however, the
important piece was missing - that the datums themselves cannot be
mutated (this is what we assume elsewhere and perform only the "shallow"
copies). This commit clarifies the contract of the method and explicitly
explains why we need to allocate fresh byte slices (which is done via
`ByteAllocator` to reduce the number of allocations).

Additional context can be found in cockroachdb#43145 which added this copy in the
first place. Here is a relevant quote from Alfonso:
```
I think what's going on here is that this type of contract (you may only
reuse the row until the next call) is a bit unclear. `CopyRow`s of
`EncDatum`s are only implemented as shallow copies, so the reference to
this reuse only applies to the `EncDatumRow`, but not to the `encoded`
field, which is what is rewritten in this case. The `encoded` field will
not be copied, so I think the tests are probably failing due to that.
This is unfortunate and there doesn't seem to be a good reason for it.
To implement deep copies, we will have to implement deep copies for
`Datum`s as well.
```
I think we were under the impression that we'd implement the "deep copy"
in `CopyRow`, but I highly doubt we'll do so given that we're mostly
investing in the columnar infrastructure nowadays, and the current way
of performing shallow copying has worked well long enough.

Epic: None

Release note: None
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @cucaroach and @DrewKimball)


pkg/sql/rowcontainer/disk_row_container.go line 468 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] showing->shoving

Done.

@craig
Copy link
Contributor

craig bot commented Dec 1, 2022

🕐 Waiting for PR status (Github check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set.

@yuzefovich
Copy link
Member Author

Pinged cla.

bors r+

@craig
Copy link
Contributor

craig bot commented Dec 1, 2022

Stopped waiting for PR status (Github check) without running due to duplicate requests to run. You may check Bors to see that this PR is included in a batch by one of the other requests.

@craig
Copy link
Contributor

craig bot commented Dec 2, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Dec 2, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants