storage: use sensible zone configs for critical system and time series ranges #14990
Comments
A sensible choice would be to replicate these ranges with factor |
via @bdarnell: yeah. we should eventually move to more fine-grained defaults for different system sub-ranges (high replication factor and long ttl for range metadata, high replication and low ttl for liveness, low replication for timeseries) |
The |
Default the .meta zone config to 5 replicas and 1h GC TTL. The higher replication reflects the relative danger of significant data loss and unavailability for the meta ranges. The shorter GC TTL reflects the lack of need for ever performing historical queries on these ranges coupled with the desire to keep the meta ranges smaller. See cockroachdb#16266 See cockroachdb#14990
@petermattis this seems like a risky change at this point; should this move to 1.2? |
Yes, we already agreed to move this to 1.2 in #17628 |
Default the .meta zone config to 1h GC TTL. The shorter GC TTL reflects the lack of need for ever performing historical queries on these ranges coupled with the desire to keep the meta ranges smaller. See cockroachdb#16266 See cockroachdb#14990
Default the .meta zone config to 1h GC TTL and default the .liveness zone config to 1m GC TTL. The shorter GC TTLs reflect the lack of need for ever performing historical queries on these ranges coupled with the desire to keep the meta and liveness ranges smaller. See cockroachdb#16266 See cockroachdb#14990
Default the .meta zone config to 1h GC TTL and default the .liveness zone config to 10m GC TTL. The shorter GC TTLs reflect the lack of need for ever performing historical queries on these ranges coupled with the desire to keep the meta and liveness ranges smaller. See cockroachdb#16266 See cockroachdb#14990
I added a docs-todo for a known limitation about this, see #14990. |
Filed #28901 to track cascading zone configs. |
@m-schneider Is there anything left to do here? |
No, closing. |
I'm a bit confused. I assume that #27349 is the PR that closes this issue (though it didn't refer to it prior to this commit). That PR is sparse on description, but from the code it looks like what should happen is that if I set up a new five-node cluster, I'll get my critical system ranges 5x replicated. That doesn't seem to be the case:
@m-schneider are my assumptions wrong? What should happen in my situation? |
Taking a look. |
How did you start up the cluster? I tried to reproduce on master, but I got the expected behavior:
|
I used |
Let me try that again. It'd be puzzling if that made a difference. |
Hrm. Unfortunately, it behaves as advertised now. I wonder what the difference was? How's our test coverage for this? |
We have pretty extensive testing in allocator_test.go for various combinations of available vs alive nodes. |
That doesn't necessarily mean that it always works end-to-end, though. @tschottdorf do you know which version of cockroach you were on and whether roachdemo was resuming a preexisting cluster or if it initialized a new one? If the cluster was initialized before #27349 / #30480 then you wouldn't see the new behavior regardless of which version you were using. |
I haven't been able to reproduce after a couple of attempts and after using roachdemo. Should we close for now? If you see this again can you please run: |
@a-robinson I looked after reading your comment two days ago but the buffer had been lost. I also gave this a few more rounds but it just worked. Still very weird. I
While it makes sense for small clusters, our current default zone configuration that only creates 3 replicas of all data by default is somewhat risky for large clusters, where it may be preferable to keep more than 3 replicas of critical system ranges.
This can be addressed via documentation for 1.0 (cockroachdb/docs#1280, cockroachdb/docs#1248), but before 1.1 we should do some testing with different configurations and consider setting more replicas of the system ranges by default.
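To make the risk concrete, the sketch below works through the arithmetic for 3 versus 5 replicas. It assumes majority-quorum replication (as in Raft), where a range stays available only while more than half of its replicas are reachable. The helper names `quorum` and `toleratedFailures` are made up for this example.

```go
package main

import "fmt"

// quorum returns the smallest majority of a replica set.
func quorum(replicas int) int {
	return replicas/2 + 1
}

// toleratedFailures returns how many replicas can be lost
// while a majority still remains.
func toleratedFailures(replicas int) int {
	return replicas - quorum(replicas)
}

func main() {
	for _, n := range []int{3, 5} {
		fmt.Printf("%d replicas: quorum %d, tolerates %d failure(s)\n",
			n, quorum(n), toleratedFailures(n))
	}
	// Prints:
	// 3 replicas: quorum 2, tolerates 1 failure(s)
	// 5 replicas: quorum 3, tolerates 2 failure(s)
}
```

Under this assumption, a 3x-replicated system range becomes unavailable after two simultaneous node failures, which grows more likely as the cluster gets larger.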