-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rowcontainer: address an old TODO #92859
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r1.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status:complete! 2 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/rowcontainer/disk_row_container.go
line 468 at r1 (raw file):
k := r.UnsafeKey() v := r.UnsafeValue() // keyValToRow will use the encoded key and value bytes as is by showing
[nit] showing
->shoving
This commit addresses an old TODO about figuring out why we cannot reuse the same buffer in `diskRowIterator.Row` method. The contract of that method was previously confusing - it says that the call to `Row` invalidates the row returned on the previous call; however, the important piece was missing - that the datums themselves cannot be mutated (this is what we assume elsewhere and perform only the "shallow" copies). This commit clarifies the contract of the method and explicitly explains why we need to allocate fresh byte slices (which is done via `ByteAllocator` to reduce the number of allocations). Additional context can be found in cockroachdb#43145 which added this copy in the first place. Here is a relevant quote from Alfonso: ``` I think what's going on here is that this type of contract (you may only reuse the row until the next call) is a bit unclear. `CopyRow`s of `EncDatum`s are only implemented as shallow copies, so the reference to this reuse only applies to the `EncDatumRow`, but not to the `encoded` field, which is what is rewritten in this case. The `encoded` field will not be copied, so I think the tests are probably failing due to that. This is unfortunate and there doesn't seem to be a good reason for it. To implement deep copies, we will have to implement deep copies for `Datum`s as well. ``` I think we were under the impression that we'd implement the "deep copy" in `CopyRow`, but I highly doubt we'll do so given that we're mostly investing in the columnar infrastructure nowadays, and the current way of performing shallow copying has worked well long enough. Epic: None Release note: None
aa3ca46
to
170baed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @cucaroach and @DrewKimball)
pkg/sql/rowcontainer/disk_row_container.go
line 468 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit]
showing
->shoving
Done.
🕐 Waiting for PR status (Github check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set. |
Pinged cla. bors r+ |
Stopped waiting for PR status (Github check) without running due to duplicate requests to run. You may check Bors to see that this PR is included in a batch by one of the other requests. |
Build failed (retrying...): |
Build succeeded: |
This commit addresses an old TODO about figuring out why we cannot reuse the same buffer in
diskRowIterator.Row
method. The contract of that method was previously confusing - it says that the call toRow
invalidates the row returned on the previous call; however, the important piece was missing - that the datums themselves cannot be mutated (this is what we assume elsewhere and perform only the "shallow" copies). This commit clarifies the contract of the method and explicitly explains why we need to allocate fresh byte slices (which is done viaByteAllocator
to reduce the number of allocations).Additional context can be found in #43145 which added this copy in the first place. Here is a relevant quote from Alfonso:
I think we were under the impression that we'd implement the "deep copy" in
CopyRow
, but I highly doubt we'll do so given that we're mostly investing in the columnar infrastructure nowadays, and the current way of performing shallow copying has worked well long enough.Epic: None
Release note: None