storage: stop obsering the commit timestamp in the merge txn #44090

ajwerner · 2020-01-16T21:44:42Z

Before this PR we're observe the commit timestamp in the merge transaction to
prevent automatic retries at a higher epoch by the call to client.DB.Txn.
Fortunately it's easy to detect cases when the transaction is restarted
internally by looking at the epoch.

The concern about this observance of the commit timestamp is that if the closed
timestamp interval is set to a very small value, smaller than the replication
latency, then merges might never be able to commit.

Release note: None

cockroach-teamcity · 2020-01-16T21:44:49Z

This change is

Before this PR we're observe the commit timestamp in the merge transaction to supposedly prevent automatic retries at a higher epoch by the call to client.DB.Txn. However we weren't using client.DB.Txn so it's not clear what this observance is giving us. One note is that it'd be problematic if we tried to refresh on the RHS as it would not serve our request. That being, said, the only key we read from the RHS is the descriptor which we also write to so we shouldn't need to refresh. This is a little subtle and deserves a comment. The concern about this observance of the commit timestamp is that if the closed timestamp interval is set to a very small value, smaller than the replication latency, then merges might never be able to commit. Release note: None

ajwerner · 2020-01-28T16:01:54Z

This one seems to have one problem which we sort of anticipated. The problem arises when the range being merged away is the meta2 range which contains its own addressing. In this case we attempt to refresh the CPut which removes the entry for the RHS. We end up getting blocked on the subsumed replica and we wait for the merge to happen. I guess I am a little confused as to why we need to refresh the key we read given we already successfully CPut to it.

After some offline discussion with @andreimatei it seems like we could add some smarts to the txnSpanRefresher to remove the key from the refresh spans in the case of successful point writes so long as the refresh spans are exact. I sort of expected to find this logic somewhere but didn't. If we do this then we should probably expose a mechanism to ensure that the refresh spans stay exact because we'll be relying on this behavior for correctness.

@nvanbenschoten I recall us having a conversation about not needing to refresh keys which we've already written to but I couldn't find the logic. Did I misunderstand?

In cockroachdb#42650 I had intended to remove all places where the commit timestamp was observed in TRUNCATE. I missed this case. Modification times for the first write to table descriptors as of 19.2 should be the zero value (or the commit timestamp, so this isn't wrong). This observance of the commit timestamp prevents transactions which TRUNCATE tables from being pushed. Combined with cockroachdb#44090 we'll have removed all non-user observation of commit timestamps. Release note (bug fix): Long running transactions which attempt to TRUNCATE can now be pushed and will commit in cases where they previously could fail or retry forever.

44091: sql: stop observing the commit timestamp in TRUNCATE r=nvanbenschoten a=ajwerner In #42650 I had intended to remove all places where the commit timestamp was observed in TRUNCATE. I missed this case. Modification times for the first write to table descriptors as of 19.2 should be the zero value (or the commit timestamp, so this isn't wrong). This observance of the commit timestamp prevents transactions which TRUNCATE tables from being pushed. Combined with #44090 we'll have removed all non-user observation of commit timestamps. Release note (bug fix): Long running transactions which attempt to TRUNCATE can now be pushed and will commit in cases where they previously could fail or retry forever. Co-authored-by: Andrew Werner <[email protected]>

ajwerner mentioned this pull request Jan 16, 2020

sql: stop observing the commit timestamp in TRUNCATE #44091

Merged

ajwerner force-pushed the ajwerner/stop-observing-commit-timestamp-in-merge-txn branch from b15197b to 7a393aa Compare January 16, 2020 22:09

ajwerner force-pushed the ajwerner/stop-observing-commit-timestamp-in-merge-txn branch from 7a393aa to 72ae6bf Compare January 28, 2020 14:23

nvanbenschoten mentioned this pull request Jan 22, 2021

kv: allow range merge txn to be pushed, even after lease transfer #59308

Closed

tbg added the X-noremind Bots won't notify about PRs with X-noremind label May 6, 2021

ajwerner closed this Mar 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

storage: stop obsering the commit timestamp in the merge txn #44090

storage: stop obsering the commit timestamp in the merge txn #44090

ajwerner commented Jan 16, 2020

cockroach-teamcity commented Jan 16, 2020

ajwerner commented Jan 28, 2020

storage: stop obsering the commit timestamp in the merge txn #44090

storage: stop obsering the commit timestamp in the merge txn #44090

Conversation

ajwerner commented Jan 16, 2020

cockroach-teamcity commented Jan 16, 2020

ajwerner commented Jan 28, 2020