docs: Fix rounding for determining max number of failing zones #6896
@@ -69,7 +69,7 @@ With a replication factor of 3, which is the default, deploy the Grafana Mimir c
 Deploying Grafana Mimir clusters to more zones than the configured replication factor does not have a negative impact.
 Deploying Grafana Mimir clusters to fewer zones than the configured replication factor can cause writes to the replica to be missed, or can cause writes to fail completely.

-If there are fewer than `floor(replication factor / 2)` zones with failing replicas, reads and writes can withstand zone failures.
+If there are fewer than `ceil(replication factor / 2)` zones with failing replicas, reads and writes can withstand zone failures.
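For readers comparing the two wordings in this hunk, here is a minimal sketch of what each one permits. It reads "fewer than X" as "at most X - 1"; the function names are hypothetical and are not part of Mimir.

```python
import math


def max_failing_with_floor(replication_factor: int) -> int:
    """Largest zone count that is still 'fewer than floor(RF / 2)'."""
    return math.floor(replication_factor / 2) - 1


def max_failing_with_ceil(replication_factor: int) -> int:
    """Largest zone count that is still 'fewer than ceil(RF / 2)'."""
    return math.ceil(replication_factor / 2) - 1


if __name__ == "__main__":
    for rf in (3, 5):
        print(
            f"RF={rf}: floor wording allows {max_failing_with_floor(rf)}, "
            f"ceil wording allows {max_failing_with_ceil(rf)}"
        )
```

With a replication factor of 3, the `floor` wording allows no failing zone at all, while the `ceil` wording allows one.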
This entire phrasing is confusing and backwards to me. Instead of saying "fewer than X zones with failing replicas" can we say "There can be at most X zones with failing replicas otherwise reads and writes will fail"?
Agree w/ your sentiment @56quarters. I'll try to revise into something more straightforward.
Rewrote it according to your suggestion, PTAL.
478f8f4 to cd3891d
@@ -69,7 +69,7 @@ With a replication factor of 3, which is the default, deploy the Grafana Mimir c
 Deploying Grafana Mimir clusters to more zones than the configured replication factor does not have a negative impact.
 Deploying Grafana Mimir clusters to fewer zones than the configured replication factor can cause writes to the replica to be missed, or can cause writes to fail completely.

-If there are fewer than `floor(replication factor / 2)` zones with failing replicas, reads and writes can withstand zone failures.
+There can be at most `floor(replication factor / 2)` zones with failing replicas, otherwise reads and writes will fail.
I think this formula is still wrong. We need a majority of zones available to accept reads and writes.
With a `replication factor = 3`, this formula gives us `floor(3 / 2) = 1`, one zone that can fail -- that's correct.
However, with `replication factor = 4`, this formula gives us `floor(4 / 2) = 2`, two zones that can fail -- that's not correct. We need 3 zones with a replication factor of 4 (even though we don't recommend even replication factors).
With `replication factor = 5`, this formula gives us `floor(5 / 2) = 2`, two zones that can fail -- correct.
I believe the correct formula is "there can be at most `floor((replication factor - 1) / 2)` zones with failing replicas".
Repeating the above scenario:
With a `replication factor = 3`, this formula gives us `floor((3 - 1) / 2) = 1`, one zone that can fail -- that's correct.
With `replication factor = 4`, this formula gives us `floor((4 - 1) / 2) = 1`, one zone that can fail -- still correct.
With `replication factor = 5`, this formula gives us `floor((5 - 1) / 2) = 2`, two zones that can fail -- still correct.
Please double check my reasoning and math.
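One way to double check the reasoning above is a small sketch that tests both candidate formulas against the majority requirement. It assumes one replica per zone, with the number of zones equal to the replication factor; that is a simplification for illustration. All function names are hypothetical.

```python
def majority(replication_factor: int) -> int:
    """Smallest number of healthy zones that forms a strict majority."""
    return replication_factor // 2 + 1


def survives(replication_factor: int, failing_zones: int) -> bool:
    """True if the healthy zones still form a majority (assumed model)."""
    return replication_factor - failing_zones >= majority(replication_factor)


def floor_half(replication_factor: int) -> int:
    """Candidate from the diff: floor(RF / 2)."""
    return replication_factor // 2


def floor_half_minus_one(replication_factor: int) -> int:
    """Candidate from this comment: floor((RF - 1) / 2)."""
    return (replication_factor - 1) // 2


if __name__ == "__main__":
    for rf in (3, 4, 5):
        for candidate in (floor_half, floor_half_minus_one):
            allowed = candidate(rf)
            print(
                f"RF={rf} {candidate.__name__}={allowed} "
                f"keeps majority: {survives(rf, allowed)}"
            )
```

Under this model, `floor_half` loses the majority for a replication factor of 4, while `floor_half_minus_one` keeps it for every replication factor tried.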
I've always thought of Mimir failure tolerance in terms of quorum because I first encountered this operating etcd. I remembered their docs being pretty good about this aspect so I dug them out:
https://etcd.io/docs/v3.3/faq/#why-an-odd-number-of-cluster-members:
An etcd cluster needs a majority of nodes, a quorum, to agree on updates to the cluster state.
For a cluster with n members, quorum is (n/2)+1.
https://etcd.io/docs/v3.3/faq/#what-is-failure-tolerance
Cluster Size | Majority | Failure Tolerance |
---|---|---|
1 | 1 | 0 |
2 | 2 | 0 |
3 | 2 | 1 |
4 | 3 | 1 |
5 | 3 | 2 |
6 | 4 | 2 |
7 | 4 | 3 |
8 | 5 | 3 |
9 | 5 | 4 |
I think the table is a really handy way of covering both sides (quorum and failure tolerance) and is easy to read for those less comfortable with the formula.
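As a rough sketch, the rows of that table can be regenerated from the quoted quorum formula. This treats `(n/2)+1` as integer division and takes failure tolerance to be cluster size minus quorum. The function names are hypothetical.

```python
def quorum(cluster_size: int) -> int:
    """Majority needed to agree on an update: (n / 2) + 1, integer division."""
    return cluster_size // 2 + 1


def failure_tolerance(cluster_size: int) -> int:
    """Members that can fail while a quorum remains."""
    return cluster_size - quorum(cluster_size)


def table_rows(max_size: int) -> list[tuple[int, int, int]]:
    """(cluster size, majority, failure tolerance) for sizes 1..max_size."""
    return [(n, quorum(n), failure_tolerance(n)) for n in range(1, max_size + 1)]


if __name__ == "__main__":
    print("Cluster Size | Majority | Failure Tolerance")
    for size, maj, tol in table_rows(9):
        print(f"{size} | {maj} | {tol}")
```

Adding a member to an odd-sized cluster raises the quorum without raising the failure tolerance, which is why even sizes are discouraged.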
cd3891d to b825793
Signed-off-by: Arve Knudsen <[email protected]>
b825793 to aed7de7
Superseded by #9512
What this PR does
In docs, fix formula for determining max number of failing zones. The current phrasing combined with downwards rounding (`floor`) means you may end up requiring too few failing zones. E.g., if RF is 3, we would require fewer than 1 failing zone, while it should be max 1.

Examples:
- (`floor(3 / 2)`) failing zone
- (`floor(6/2)`) failing zones
- (`floor(9/2)`) failing zones

Which issue(s) this PR fixes or relates to

Checklist
- `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`.
- `about-versioning.md` updated with experimental features.