raft: optimize storage access for term when leader tries to commit #139609
LGTM % nits. Also, please squash the PR into one commit.
The code LGTM, but the release note seems off:
Please either remove it, or note something like:
(both in the commit and PR description)
Good to go once release notes and nits are fixed.
TYFTR! bors r=pav-kv
139609: raft: optimize storage access for term when leader tries to commit r=pav-kv a=hakuuww
Co-authored-by: Anthony Xu
Build failed:
Retry bors r=pav-kv
bors retry
Already running a review
When the raft leader wants to commit an entry (advance the commit index), it must check that the term of that entry is the same as its own current term.
Previously, this check could involve a storage access to read the term if the entry had already been persisted to stable storage and was no longer in the leader's unstable log.
This scenario happens quite often, so optimizing it is a worthwhile improvement to leader commit latency.
This PR introduces an optimization that relies on an invariant that is logically equivalent to checking whether the entry to be committed has the same term as the leader doing the commit.
The leader keeps an entry index in memory that marks the point in the log at which it became leader. When the leader wants to commit an entry, it compares that entry's index (available in memory) with the index added in this PR to determine whether the term matches.
This completely eliminates storage access on this code path.
A corresponding unit test is also modified to account for the elimination of storage access when the leader tries to commit.
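The index comparison described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: the `leaderState` type, the `lastIndexAtElection` field, and the `canCommitOld` / `canCommitNew` helpers are hypothetical names chosen for the example, and an in-memory slice stands in for stable storage. The sketch assumes the usual raft property that every entry already in a node's log when it wins an election has a term lower than the new leader term, while every entry the leader appends afterwards carries the leader term.

```go
package main

import "fmt"

// entry is a simplified log entry with only the fields needed here.
type entry struct {
	index uint64
	term  uint64
}

// leaderState is a hypothetical, pared-down view of a raft leader.
type leaderState struct {
	term      uint64 // current leader term
	committed uint64 // current commit index
	// lastIndexAtElection is the last log index at the moment this node
	// became leader (assumed name, for illustration only).
	lastIndexAtElection uint64
	log                 []entry // stands in for stable storage; log[i].index == i+1
	storageReads        int     // counts simulated storage reads
}

// termFromStorage simulates the old path: a storage read to look up a term.
func (l *leaderState) termFromStorage(index uint64) uint64 {
	l.storageReads++
	if index == 0 || index > uint64(len(l.log)) {
		return 0
	}
	return l.log[index-1].term
}

// canCommitOld checks the entry's term, which may require a storage read.
func (l *leaderState) canCommitOld(index uint64) bool {
	return index > l.committed && l.termFromStorage(index) == l.term
}

// canCommitNew uses only in-memory indices: any entry past
// lastIndexAtElection was appended by this leader in its own term.
func (l *leaderState) canCommitNew(index uint64) bool {
	return index > l.committed && index > l.lastIndexAtElection
}

func main() {
	l := &leaderState{
		term: 3, committed: 2, lastIndexAtElection: 4,
		log: []entry{{1, 1}, {2, 1}, {3, 2}, {4, 2}, {5, 3}, {6, 3}},
	}
	for idx := uint64(3); idx <= 6; idx++ {
		fmt.Printf("index %d: old=%v new=%v\n", idx, l.canCommitOld(idx), l.canCommitNew(idx))
	}
	fmt.Println("storage reads (old path only):", l.storageReads)
}
```

In this sketch both checks agree for every index, but only the old one touches the simulated storage, which is the effect the modified unit test is described as accounting for.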
A previous related PR (merged): #139907
Jira issue: CRDB-45763
Fixes: #137826
Release note (performance improvement): removed a potential storage read from the raft commit pipeline. This reduces the worst-case KV write latency.