kv: PushTxn(PUSH_TIMESTAMP) without Raft consensus #94728

nvanbenschoten · 2023-01-04T22:50:45Z

Currently, a PushTxn request that has a high-enough priority to push a transaction rewrites that transaction's record with a new write_timestamp. This new write timestamp is treated as the minimum commit timestamp of the pushee.

The rewrite incurs a Raft consensus round. As a result, reads need to perform writes to move conflicting transactions out of their way. This is undesirable.

We could avoid this by having the pusher bump the timestamp cache using the transactionPushMarker. The pushee could then check this timestamp cache entry when committing, even when its record already exists. This is similar to how things behave today (since lazy_txn_record_creation.md), except we currently only check the transactionPushMarker when creating the txn record. After this change, we would do so when committing.

Jira issue: CRDB-23102

Epic CRDB-25218

The text was updated successfully, but these errors were encountered:

Fixes cockroachdb#94728. With the previous commit, transactions will check the timestamp cache before committing to determine whether they have had their commit timestamp pushed. This commit exploits this to avoid ever rewriting a transaction's record on a timestamp push. Instead, the timestamp cache is used, regardless of whether the record already existed or not. Doing so avoids consensus.

95904: roachtest: normalize versions in multitenant-upgrade roachtest r=ajstorm a=healthy-pod If the multitenant-upgrade roachtest uses a mix of release/non-release binaries, it may be using versions that are technically the same but fail to confirm that because version in test binaries are incremented by 1M. This code change fixes the issue by normalizing versions before comparing them. Closes #95648 Epic: none Release note: None 95911: kv: perform PushTxn(PUSH_TIMESTAMP) without Raft consensus r=arulajmani a=nvanbenschoten This PR contains a sequence of three commits that combine to resolve #94728. ### check txn push marker on commit, not txn record creation The first commit moves the point when a transaction checks the timestamp cache for its minimum commit timestamp from transaction record creation time back to commit time. This allows us to use the timestamp cache to communicate successful `PushTxn(TIMESTAMP)` to a pushee with an existing record without rewriting its transaction record. For details, see the changes to the state machine diagram attached to `Replica.CanCreateTxnRecord` for a visual depiction of this change. ### always promote PUSH_TIMESTAMP to PUSH_ABORT on failed staging record The second commit simplifies logic in PushTxnRequest that promoted a `PUSH_TIMESTAMP` to a `PUSH_ABORT` when it found a STAGING transaction record that it knew to be part of a failed parallel commit attempt. The logic tried to be smart and minimize the cases where it needed to promote a `PUSH_TIMESTAMP` to a `PUSH_ABORT`. It was avoiding doing so if it had previously found an intent with a higher epoch. In practice, this optimization doesn't seem to matter. It was also making logic in a following commit harder to write because it was preserving cases where a `PUSH_TIMESTAMP` would succeed against a STAGING transaction record. We don't want to support such state transitions, so eliminate them. ### don't rewrite txn record on PushTxn(TIMESTAMP) With the previous two commits, transactions will check the timestamp cache before committing to determine whether they have had their commit timestamp pushed. The final commit exploits this to avoid ever rewriting a transaction's record on a timestamp push. Instead, the timestamp cache is used, regardless of whether the record already existed or not. Doing so avoids consensus. Release note: None 96115: schemachanger: Implement `DROP CONSTRAINT` in declarative schema changer r=Xiang-Gu a=Xiang-Gu This PR implements `ALTER TABLE t DROP CONSTRAINT cons_name` in declarative schema changer. Supported constraints include Checks, FK, and UniqueWithoutIndex. Dropping PK or Unique constraints will fall back to legacy schema changer, which in turn spits out an "not supported yet" error. Epic: None 96241: sql: use RLock in connExecutor.CancelQuery and connExecutor.CancelActiveQueries r=rafiss,yuzefovich a=ecwall Fixes #95994 `connExecutor.CancelQuery` and `connExecutor.CancelActiveQueries` do not modify `mu.ActiveQueries` or the `*queryMetas` inside so they can safely use `RLock` instead of `Lock`. Release note: None 96273: sql/schemachanger: forward fit compatibility changes for 22.2 rules r=fqazi a=fqazi Informs: #95849 Previously, some constraint-related rules in the 22.2 set incorrectly used logic for 23.X. This patch addresses those to get compatibility back. Additionally, some minor clean-up in rules-related helpers to ensure proper compatibility. With this change, a manual diff shows both branches are now equal in terms of rules (outside of renames). A roachtest will be coming soon to assert this. Epic: none Release note: None 96302: backupccl: add missing context cancel checks to restore r=stevendanna a=adityamaru In #95257 we saw a restore grind to a halt 2 hours into a 5 hour roachtest. The stacks indicated that we may have seen a context cancellation that was not being respected by the goroutine running `generateAndSendImportSpans`. This resulted in the `generative_split_and_scatter_processor` getting stuck writing to a channel nobody was reading from (https://github.com/cockroachdb/cockroach/blob/master/pkg/ccl/backupccl/restore_span_covering.go#L516) since the other goroutines in the processor had seen the ctx cancellation and exited. A side effect of the generative processor not shutting down was that the downstream restore data processors would also hang on their call to `input.Next()` as they would not receive a row or a meta from the generative processor signalling them to shutdown. This fix adds a ctx cancellation check to the goroutine described above, thereby allowing a graceful teardown of the flow. Informs: #95257 Release note (bug fix): fixes a bug where a restore flow could hang indefinitely in the face of a context cancellation, manifesting as a stuck restore job. Co-authored-by: healthy-pod <[email protected]> Co-authored-by: Nathan VanBenschoten <[email protected]> Co-authored-by: Xiang Gu <[email protected]> Co-authored-by: Evan Wall <[email protected]> Co-authored-by: Faizan Qazi <[email protected]> Co-authored-by: adityamaru <[email protected]>

nvanbenschoten added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-transactions Relating to MVCC and the transactional model. T-kv KV Team labels Jan 4, 2023

nvanbenschoten mentioned this issue Jan 4, 2023

kv: ignore pushed intent without Raft consensus #94730

Open

nvanbenschoten mentioned this issue Jan 26, 2023

kv: perform PushTxn(PUSH_TIMESTAMP) without Raft consensus #95911

Merged

craig bot closed this as completed in 3dbb321 Feb 2, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kv: PushTxn(PUSH_TIMESTAMP) without Raft consensus #94728

kv: PushTxn(PUSH_TIMESTAMP) without Raft consensus #94728

nvanbenschoten commented Jan 4, 2023 •

edited by exalate-issue-sync bot

Loading

kv: PushTxn(PUSH_TIMESTAMP) without Raft consensus #94728

kv: PushTxn(PUSH_TIMESTAMP) without Raft consensus #94728

Comments

nvanbenschoten commented Jan 4, 2023 • edited by exalate-issue-sync bot Loading

nvanbenschoten commented Jan 4, 2023 •

edited by exalate-issue-sync bot

Loading