
storage: simplify scatter implementation #17644

Merged
merged 1 commit into cockroachdb:master from pmattis/scatter on Aug 15, 2017

Conversation

petermattis
Collaborator

Rather than using custom logic in Replica.adminScatter, call into
replicateQueue.processOneChange until no further action needs to be
taken. This fixes a major caveat of the prior implementation: replicas
were not removed synchronously, leading to conditions where an
arbitrary number of replicas could be added to a range via scattering.

Fixes #17612
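To make the new control flow concrete, here is a minimal, self-contained Go sketch of the approach described above. The function names, the retry limit, and the allowLeaseTransfer hand-off are simplified stand-ins for the real Replica.adminScatter / replicateQueue.processOneChange interaction, not the actual CockroachDB code:

```go
// Sketch of the new adminScatter loop: repeatedly hand the range to the
// replicate-queue logic until it reports that no further change is needed.
// Names and signatures are simplified stand-ins, not the real API.
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// processFn stands in for replicateQueue.processOneChange. requeue is true
// while the queue still has work to do for this range (replicas to add or
// remove).
type processFn func(ctx context.Context, canTransferLease func() bool) (requeue bool, err error)

// adminScatter loops until the replicate queue has nothing left to do for
// the range. Lease transfers are only allowed once the replica set has
// stabilized, mirroring the allowLeaseTransfer flag in the change.
func adminScatter(ctx context.Context, process processFn) error {
	allowLeaseTransfer := false
	canTransferLease := func() bool { return allowLeaseTransfer }

	const maxAttempts = 10 // hypothetical bound; the real code uses retry options
	for i := 0; i < maxAttempts; i++ {
		requeue, err := process(ctx, canTransferLease)
		if err != nil {
			return err
		}
		if !requeue {
			if allowLeaseTransfer {
				return nil // replica changes done and lease possibly shed
			}
			// Replica changes are done; allow one more pass that may
			// transfer the lease away.
			allowLeaseTransfer = true
		}
		time.Sleep(10 * time.Millisecond) // crude stand-in for retry backoff
	}
	return errors.New("scatter did not converge")
}

func main() {
	// Toy process function: pretend three changes are needed, then nothing.
	steps := 3
	err := adminScatter(context.Background(), func(ctx context.Context, canTransferLease func() bool) (bool, error) {
		if steps == 0 {
			return false, nil
		}
		steps--
		fmt.Printf("processed one change (lease transfer allowed: %v)\n", canTransferLease())
		return true, nil
	})
	fmt.Println("scatter finished, err =", err)
}
```

The key point is that replica additions and removals are driven synchronously by the same loop, so scattering can no longer pile extra replicas onto a range.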

@petermattis petermattis requested a review from a team August 15, 2017 00:58

@petermattis
Collaborator Author

All of the tests still pass, which makes me suspect that scatter isn't well tested.

@benesch I'm curious if you can run this on restore workloads to see if it has any negative impact. My suspicion is that it will be fine, but I'd prefer to be cautious at this time in the cycle.

@petermattis
Collaborator Author

I should mention that I've been testing this on sky, and it provides near-ideal scattering. The implementation also seems easier to understand conceptually: in effect, it temporarily provides a dedicated goroutine to the replicate queue for the specified replica.

@benesch
Contributor

benesch commented Aug 15, 2017 via email

@a-robinson
Contributor

LGTM. I like the simplicity of this much more than what was there before, but I'd wait on @benesch to test how it works for restore in practice (and perhaps double check what was causing problems in the original scatter implementation). This still suffers from the problems in #17341, but that's a feature more than a bug at this point in the release cycle.


Reviewed 2 of 2 files at r1.
Review status: :shipit: all files reviewed at latest revision, all discussions resolved, all commit checks successful.



@petermattis
Collaborator Author

This might actually speed up RESTORE, as now we'll only ever be restoring into 3 replicas, while previously we could be restoring into 6. 🤞

@benesch
Contributor

benesch commented Aug 15, 2017

Running restore tests now!

@benesch
Contributor

benesch commented Aug 15, 2017

Well, this is odd:

[screenshot 2017-08-15 11:02:17]

@petermattis
Collaborator Author

@benesch Can you try running again with `SET CLUSTER SETTING kv.allocator.stat_based_rebalancing.enabled = false`?

@benesch
Contributor

benesch commented Aug 15, 2017

For posterity, the whole thing took 28m. It usually takes about 15m, I think. Here's the final state:

[screenshot 2017-08-15 11:23:06]

Rerunning without stats rebalancing now.

@benesch
Contributor

benesch commented Aug 15, 2017

Bingo. Though now I don't know whether to be happy or worried.

[screenshot 2017-08-15 11:54:55]

If you don't mind, I want to run two controls before stamping this.

@petermattis
Collaborator Author

Yep, take your time.

@a-robinson This is evidence towards #17645 or that stats-based rebalancing still needs tuning.

@bdarnell
Contributor

:lgtm:


Review status: all files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.


pkg/storage/replica_command.go, line 3995 at r1 (raw file):

	var allowLeaseTransfer bool
	for re := retry.StartWithCtx(ctx, retryOpts); re.Next(); {
		canTransferLease := func() bool { return false }

You can do `canTransferLease := func() bool { return allowLeaseTransfer }` outside the loop (but I guess that forces a heap allocation? If that's significant here, add a comment so someone doesn't try to simplify it in the future).
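For illustration, a tiny standalone Go snippet (not the PR's code) showing why the closure can be hoisted: it captures `allowLeaseTransfer` by reference, so flipping the flag later is still visible to calls made inside the loop.

```go
package main

import "fmt"

func main() {
	// The closure is created once, outside the loop, but it reads the
	// current value of allowLeaseTransfer on every call. Capturing the
	// variable may force it to escape to the heap, which is the
	// allocation Ben mentions.
	var allowLeaseTransfer bool
	canTransferLease := func() bool { return allowLeaseTransfer }

	for i := 0; i < 3; i++ {
		if i == 2 {
			allowLeaseTransfer = true // flag change is visible to the closure
		}
		fmt.Println(i, canTransferLease()) // prints: 0 false, 1 false, 2 true
	}
}
```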



@petermattis
Collaborator Author

Review status: 1 of 2 files reviewed at latest revision, 1 unresolved discussion.


pkg/storage/replica_command.go, line 3995 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

You can do `canTransferLease := func() bool { return allowLeaseTransfer }` outside the loop (but I guess that forces a heap allocation? If that's significant here, add a comment so someone doesn't try to simplify it in the future).

Clever. I can't imagine the heap allocation is problematic here. Changed.



@benesch
Contributor

benesch commented Aug 15, 2017

OK, this seems to match TPCH10 on a recent master (301e5fc) without stats-based rebalancing:

[screenshot 2017-08-15 12:50:10]

Things seem really broken with stats-based rebalancing:

[screenshot 2017-08-15 13:20:53]

In both cases the restore took about 22m. Turns out there's a default cluster setting limiting the RESTORE write rate to 30MB/s, which is why these aren't taking the 15m I was expecting.

So the new scatter seems as good as the old one. LGTM! 🎉

@benesch
Contributor

benesch commented Aug 15, 2017

@petermattis, we should shrink the test you've been running on sky and turn it into an acceptance test for scatter.

@petermattis
Collaborator Author

@benesch Yes, scatter needs better testing. The test would essentially be `kv --concurrency 512 --splits 1000000 --max-ops 1`. The question is how to determine whether scatter worked well. I suppose we could look at the number of replicas vs. ranges and the balance across the cluster.
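Purely as an illustration of what such a check could look like (not something in this PR), here is a small Go sketch that computes two hypothetical metrics from per-store replica counts: over-replication relative to the replication factor, and imbalance across stores. The counts and constants are made up for the example.

```go
// Illustrative only: one way to quantify "scatter worked well", assuming we
// can obtain per-store replica counts and the total range count from the
// cluster. Not part of this PR.
package main

import "fmt"

func main() {
	const replicationFactor = 3
	totalRanges := 1000                                // hypothetical: ranges created by --splits
	replicasPerStore := []int{620, 590, 610, 580, 600} // hypothetical per-store counts

	totalReplicas, max := 0, 0
	for _, n := range replicasPerStore {
		totalReplicas += n
		if n > max {
			max = n
		}
	}

	// Over-replication: with synchronous removal this should stay close to 1.0.
	overReplication := float64(totalReplicas) / float64(replicationFactor*totalRanges)

	// Imbalance: largest store count relative to the mean; 1.0 is a perfect spread.
	mean := float64(totalReplicas) / float64(len(replicasPerStore))
	imbalance := float64(max) / mean

	fmt.Printf("over-replication=%.2f imbalance=%.2f\n", overReplication, imbalance)
}
```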

@petermattis petermattis merged commit 779d80e into cockroachdb:master Aug 15, 2017
@petermattis petermattis deleted the pmattis/scatter branch August 15, 2017 17:56
@benesch
Contributor

benesch commented Aug 15, 2017

Yeah, I think the criteria for success are actually straightforward to measure. I'll file an issue.

@benesch
Contributor

benesch commented Aug 15, 2017

Never mind. I just saw #17660.

Successfully merging this pull request may close these issues.

storage: 17 replicas on a range configured for 3