forked from cockroachdb/cockroach
Commit
storage: foundational work for follower reads using replica closed timestamps

This change lays the groundwork for follower reads by adding the prototype implementation to our main codebase.

Most of the logic is disabled by default. It is only exercised during specific unit tests or when running with a nonzero COCKROACH_CLOSED_TIMESTAMP_INTERVAL.

The code contains several potential correctness anomalies and makes no attempt at handling lagging followers gracefully. It should not be used outside of testing and in fact should not be used at all (though we may want to write roachtests early).

The [follower reads RFC] and many TODOs in this code hint at the upcoming changes. Most prominently, known correctness gotchas will be addressed, and the role of quiescence and coalesced heartbeats will be untangled from the main proposal, which hopefully can clarify the code somewhat as well.

In the meantime, the commit message below documents what is implemented here, even though it is subject to change:

Nodes send a closed timestamp with coalesced heartbeats. Receipt of a heartbeat from a node which is the leaseholder for the range means a closed timestamp can be trusted to apply to each follower replica which has committed at or above a min lease applied index, a new value supplied with coalesced heartbeats.

Nodes keep track of their "min proposal timestamp" (MinPropTS), which is an HLC timestamp. On every heartbeat, the MinPropTS is persisted locally to ensure monotonicity on node restart. At startup, a node reads the last persisted MinPropTS and, if necessary, forwards the HLC clock to MinPropTS + max safe interval. Nodes check MinPropTS on command evaluation; a command's timestamp is forwarded if it is less than MinPropTS.

Things get more interesting when a range quiesces. Replicas of quiesced ranges no longer receive info on coalesced heartbeats. However, if a replica is quiesced, we can continue to rely on the most recent store-wide closed timestamp supplied with coalesced heartbeats, so long as the liveness epoch (reported with heartbeats) remains stable and no heartbeats are skipped. This can continue for as long as a range is quiesced, but requires that the leaseholder notifies all followers on the first heartbeat after a range is unquiesced.

Note that there is no concern that, on leaseholder change, the new leaseholder allows a write at an earlier timestamp than a previously reported closed timestamp. This is because the low water timestamp in the timestamp cache is reset on leaseholder transfer to prevent rewriting history in general.

Release note: None

[follower reads RFC]: cockroachdb#26362
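
The rules described in the commit message can be illustrated with a minimal Go sketch. It is not the code from this commit: the `Timestamp` type and the functions `ForwardCommand`, `StartupClockFloor`, and `CanServeFollowerRead` are hypothetical names introduced only for illustration. The sketch also assumes that a follower may serve a read only at or below the closed timestamp, which the message implies but does not spell out.

```go
package main

import "fmt"

// Timestamp is a simplified stand-in for an HLC timestamp
// (hypothetical type, not the one used in the codebase).
type Timestamp struct {
	WallTime int64 // nanoseconds
	Logical  int32
}

// Less reports whether t orders strictly before o.
func (t Timestamp) Less(o Timestamp) bool {
	return t.WallTime < o.WallTime ||
		(t.WallTime == o.WallTime && t.Logical < o.Logical)
}

// ForwardCommand mirrors the check on command evaluation: a command's
// timestamp is forwarded to MinPropTS if it is less than MinPropTS.
func ForwardCommand(cmdTS, minPropTS Timestamp) Timestamp {
	if cmdTS.Less(minPropTS) {
		return minPropTS
	}
	return cmdTS
}

// StartupClockFloor returns the persisted MinPropTS plus the max safe
// interval, the value the clock is forwarded to at startup if necessary.
func StartupClockFloor(persisted Timestamp, maxSafeIntervalNanos int64) Timestamp {
	return Timestamp{
		WallTime: persisted.WallTime + maxSafeIntervalNanos,
		Logical:  persisted.Logical,
	}
}

// CanServeFollowerRead is an assumed follower-side check: the replica must
// have reached the min lease applied index sent with the heartbeat, and the
// read timestamp must not exceed the closed timestamp.
func CanServeFollowerRead(readTS, closedTS Timestamp, appliedIndex, minLeaseAppliedIndex uint64) bool {
	return appliedIndex >= minLeaseAppliedIndex && !closedTS.Less(readTS)
}

func main() {
	minProp := Timestamp{WallTime: 100}
	fmt.Println(ForwardCommand(Timestamp{WallTime: 90}, minProp))
	fmt.Println(StartupClockFloor(minProp, 50))
	fmt.Println(CanServeFollowerRead(Timestamp{WallTime: 80}, Timestamp{WallTime: 95}, 12, 10))
}
```

The quiescence rules (stable liveness epoch, no skipped heartbeats) are deliberately left out of this sketch; they would add further conditions to the follower-side check.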
Showing 22 changed files with 1,839 additions and 309 deletions.