-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kv: PushTxn(PUSH_TIMESTAMP) without Raft consensus #94728
Labels
A-kv-transactions
Relating to MVCC and the transactional model.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
Comments
nvanbenschoten
added a commit
to nvanbenschoten/cockroach
that referenced
this issue
Jan 26, 2023
Fixes cockroachdb#94728. With the previous commit, transactions will check the timestamp cache before committing to determine whether they have had their commit timestamp pushed. This commit exploits this to avoid ever rewriting a transaction's record on a timestamp push. Instead, the timestamp cache is used, regardless of whether the record already existed or not. Doing so avoids consensus.
craig bot
pushed a commit
that referenced
this issue
Feb 2, 2023
95904: roachtest: normalize versions in multitenant-upgrade roachtest r=ajstorm a=healthy-pod If the multitenant-upgrade roachtest uses a mix of release/non-release binaries, it may be using versions that are technically the same but fail to confirm that because version in test binaries are incremented by 1M. This code change fixes the issue by normalizing versions before comparing them. Closes #95648 Epic: none Release note: None 95911: kv: perform PushTxn(PUSH_TIMESTAMP) without Raft consensus r=arulajmani a=nvanbenschoten This PR contains a sequence of three commits that combine to resolve #94728. ### check txn push marker on commit, not txn record creation The first commit moves the point when a transaction checks the timestamp cache for its minimum commit timestamp from transaction record creation time back to commit time. This allows us to use the timestamp cache to communicate successful `PushTxn(TIMESTAMP)` to a pushee with an existing record without rewriting its transaction record. For details, see the changes to the state machine diagram attached to `Replica.CanCreateTxnRecord` for a visual depiction of this change. ### always promote PUSH_TIMESTAMP to PUSH_ABORT on failed staging record The second commit simplifies logic in PushTxnRequest that promoted a `PUSH_TIMESTAMP` to a `PUSH_ABORT` when it found a STAGING transaction record that it knew to be part of a failed parallel commit attempt. The logic tried to be smart and minimize the cases where it needed to promote a `PUSH_TIMESTAMP` to a `PUSH_ABORT`. It was avoiding doing so if it had previously found an intent with a higher epoch. In practice, this optimization doesn't seem to matter. It was also making logic in a following commit harder to write because it was preserving cases where a `PUSH_TIMESTAMP` would succeed against a STAGING transaction record. We don't want to support such state transitions, so eliminate them. ### don't rewrite txn record on PushTxn(TIMESTAMP) With the previous two commits, transactions will check the timestamp cache before committing to determine whether they have had their commit timestamp pushed. The final commit exploits this to avoid ever rewriting a transaction's record on a timestamp push. Instead, the timestamp cache is used, regardless of whether the record already existed or not. Doing so avoids consensus. Release note: None 96115: schemachanger: Implement `DROP CONSTRAINT` in declarative schema changer r=Xiang-Gu a=Xiang-Gu This PR implements `ALTER TABLE t DROP CONSTRAINT cons_name` in declarative schema changer. Supported constraints include Checks, FK, and UniqueWithoutIndex. Dropping PK or Unique constraints will fall back to legacy schema changer, which in turn spits out an "not supported yet" error. Epic: None 96241: sql: use RLock in connExecutor.CancelQuery and connExecutor.CancelActiveQueries r=rafiss,yuzefovich a=ecwall Fixes #95994 `connExecutor.CancelQuery` and `connExecutor.CancelActiveQueries` do not modify `mu.ActiveQueries` or the `*queryMetas` inside so they can safely use `RLock` instead of `Lock`. Release note: None 96273: sql/schemachanger: forward fit compatibility changes for 22.2 rules r=fqazi a=fqazi Informs: #95849 Previously, some constraint-related rules in the 22.2 set incorrectly used logic for 23.X. This patch addresses those to get compatibility back. Additionally, some minor clean-up in rules-related helpers to ensure proper compatibility. With this change, a manual diff shows both branches are now equal in terms of rules (outside of renames). A roachtest will be coming soon to assert this. Epic: none Release note: None 96302: backupccl: add missing context cancel checks to restore r=stevendanna a=adityamaru In #95257 we saw a restore grind to a halt 2 hours into a 5 hour roachtest. The stacks indicated that we may have seen a context cancellation that was not being respected by the goroutine running `generateAndSendImportSpans`. This resulted in the `generative_split_and_scatter_processor` getting stuck writing to a channel nobody was reading from (https://github.com/cockroachdb/cockroach/blob/master/pkg/ccl/backupccl/restore_span_covering.go#L516) since the other goroutines in the processor had seen the ctx cancellation and exited. A side effect of the generative processor not shutting down was that the downstream restore data processors would also hang on their call to `input.Next()` as they would not receive a row or a meta from the generative processor signalling them to shutdown. This fix adds a ctx cancellation check to the goroutine described above, thereby allowing a graceful teardown of the flow. Informs: #95257 Release note (bug fix): fixes a bug where a restore flow could hang indefinitely in the face of a context cancellation, manifesting as a stuck restore job. Co-authored-by: healthy-pod <[email protected]> Co-authored-by: Nathan VanBenschoten <[email protected]> Co-authored-by: Xiang Gu <[email protected]> Co-authored-by: Evan Wall <[email protected]> Co-authored-by: Faizan Qazi <[email protected]> Co-authored-by: adityamaru <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-kv-transactions
Relating to MVCC and the transactional model.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
Currently, a
PushTxn
request that has a high-enough priority to push a transaction rewrites that transaction's record with a newwrite_timestamp
. This new write timestamp is treated as the minimum commit timestamp of the pushee.The rewrite incurs a Raft consensus round. As a result, reads need to perform writes to move conflicting transactions out of their way. This is undesirable.
We could avoid this by having the pusher bump the timestamp cache using the
transactionPushMarker
. The pushee could then check this timestamp cache entry when committing, even when its record already exists. This is similar to how things behave today (sincelazy_txn_record_creation.md
), except we currently only check thetransactionPushMarker
when creating the txn record. After this change, we would do so when committing.Jira issue: CRDB-23102
Epic CRDB-25218
The text was updated successfully, but these errors were encountered: