Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: use RLock in connExecutor.CancelQuery and connExecutor.CancelActiveQueries #95994

Closed
ecwall opened this issue Jan 26, 2023 · 0 comments · Fixed by #96241
Closed

sql: use RLock in connExecutor.CancelQuery and connExecutor.CancelActiveQueries #95994

ecwall opened this issue Jan 26, 2023 · 0 comments · Fixed by #96241
Assignees
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@ecwall
Copy link
Contributor

ecwall commented Jan 26, 2023

Context from previous PR
https://reviewable.io/reviews/cockroachdb/cockroach/95745#-NMeL5RC977TPNkowJTH

Jira issue: CRDB-23887

@ecwall ecwall added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) labels Jan 26, 2023
@ecwall ecwall self-assigned this Jan 26, 2023
craig bot pushed a commit that referenced this issue Feb 2, 2023
95904: roachtest: normalize versions in multitenant-upgrade roachtest r=ajstorm a=healthy-pod

If the multitenant-upgrade roachtest uses a mix
of release/non-release binaries, it may be using versions
that are technically the same but fail to confirm that because
version in test binaries are incremented by 1M.

This code change fixes the issue by normalizing versions before
comparing them.

Closes #95648

Epic: none
Release note: None

95911: kv: perform PushTxn(PUSH_TIMESTAMP) without Raft consensus r=arulajmani a=nvanbenschoten

This PR contains a sequence of three commits that combine to resolve #94728.

### check txn push marker on commit, not txn record creation

The first commit moves the point when a transaction checks the timestamp cache for its minimum commit timestamp from transaction record creation time back to commit time. This allows us to use the timestamp cache to communicate successful `PushTxn(TIMESTAMP)` to a pushee with an existing record without rewriting its transaction record.

For details, see the changes to the state machine diagram attached to `Replica.CanCreateTxnRecord` for a visual depiction of this change.

### always promote PUSH_TIMESTAMP to PUSH_ABORT on failed staging record 

The second commit simplifies logic in PushTxnRequest that promoted a `PUSH_TIMESTAMP` to a `PUSH_ABORT` when it 
 found a STAGING transaction record that it knew to be part of a failed parallel commit attempt.

The logic tried to be smart and minimize the cases where it needed to promote a `PUSH_TIMESTAMP` to a `PUSH_ABORT`. It was avoiding doing so if it had previously found an intent with a higher epoch. In practice, this optimization doesn't seem to matter. It was also making logic in a following commit harder to write because it was preserving cases where a `PUSH_TIMESTAMP` would succeed against a STAGING transaction record. We don't want to support such state transitions, so eliminate them.

### don't rewrite txn record on PushTxn(TIMESTAMP)

With the previous two commits, transactions will check the timestamp cache before committing to determine whether they have had their commit timestamp pushed. The final commit exploits this to avoid ever rewriting a transaction's record on a timestamp push. Instead, the timestamp cache is used, regardless of whether the record already existed or not. Doing so avoids consensus.

Release note: None

96115: schemachanger: Implement `DROP CONSTRAINT` in declarative schema changer r=Xiang-Gu a=Xiang-Gu

This PR implements `ALTER TABLE t DROP CONSTRAINT cons_name` in declarative schema changer.

Supported constraints include Checks, FK, and UniqueWithoutIndex.

Dropping PK or Unique constraints will fall back to legacy schema changer, which in turn spits out an "not supported yet" error.

Epic: None

96241: sql: use RLock in connExecutor.CancelQuery and connExecutor.CancelActiveQueries r=rafiss,yuzefovich a=ecwall

Fixes #95994

`connExecutor.CancelQuery` and `connExecutor.CancelActiveQueries` do not modify `mu.ActiveQueries` or the `*queryMetas` inside so they can safely use `RLock` instead of `Lock`.

Release note: None

96273: sql/schemachanger: forward fit compatibility changes for 22.2 rules r=fqazi a=fqazi

Informs: #95849

Previously, some constraint-related rules in the 22.2 set incorrectly used logic for 23.X. This patch addresses those to get compatibility back. Additionally, some minor clean-up in rules-related helpers to ensure proper compatibility.

With this change, a manual diff shows both branches are now equal in terms of rules (outside of renames). A roachtest will be coming soon to assert this.

Epic: none
Release note: None

96302: backupccl: add missing context cancel checks to restore r=stevendanna a=adityamaru

In #95257 we saw a restore grind to a halt 2 hours into a 5 hour roachtest. The stacks indicated that we may have seen a context cancellation that was not being respected by the goroutine running `generateAndSendImportSpans`. This resulted in the `generative_split_and_scatter_processor` getting stuck writing to a channel nobody was reading from
(https://github.com/cockroachdb/cockroach/blob/master/pkg/ccl/backupccl/restore_span_covering.go#L516) since the other goroutines
in the processor had seen the ctx cancellation and exited. A side effect of the generative processor not shutting down was that the downstream restore data processors would also hang on their call to `input.Next()` as they would not receive a row or a meta from the generative processor signalling them to shutdown. This fix adds a ctx cancellation check to the goroutine described above, thereby allowing a
graceful teardown of the flow.

Informs: #95257

Release note (bug fix): fixes a bug where a restore flow could hang indefinitely in the face of a context cancellation, manifesting as a stuck restore job.

Co-authored-by: healthy-pod <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Xiang Gu <[email protected]>
Co-authored-by: Evan Wall <[email protected]>
Co-authored-by: Faizan Qazi <[email protected]>
Co-authored-by: adityamaru <[email protected]>
@craig craig bot closed this as completed in 38299ef Feb 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant