
storage: make load-based replica rebalancing decisions at the store level #28852

Merged: 5 commits from store-rebalance2 into master, Sep 6, 2018

Conversation

a-robinson (Contributor)

Built on top of #28340, which is where the first 3 commits are from. This is still somewhat incomplete, in that it's missing unit tests and I'm only just now running tpc-c 10k on it. Sending out now to start the discussion of whether to include it in 2.1, since we're obviously very late in the intended development cycle.
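
For readers unfamiliar with the approach, here is a minimal, hypothetical Go sketch of the kind of decision a store-level, QPS-based rebalancer makes: compare the local store's QPS to the cluster mean and, if it is over a threshold, pick hot replicas to shed to under-loaded stores. The types (storeLoad, replicaLoad), the rebalanceTargets function, and the 10% thresholds are all illustrative assumptions, not the actual StoreRebalancer implementation from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

type storeLoad struct {
	storeID int
	qps     float64
}

type replicaLoad struct {
	rangeID int
	qps     float64
}

// rebalanceTargets returns, for an overloaded store, the replicas it would
// consider shedding and the stores it would consider moving them to.
// The 10% thresholds are illustrative, not the values used by the PR.
func rebalanceTargets(local storeLoad, hotReplicas []replicaLoad, all []storeLoad) ([]replicaLoad, []storeLoad) {
	var total float64
	for _, s := range all {
		total += s.qps
	}
	mean := total / float64(len(all))
	overfullThreshold := mean * 1.10
	underfullThreshold := mean * 0.90

	if local.qps <= overfullThreshold {
		return nil, nil // this store is not overloaded; nothing to do
	}

	// Shed the hottest replicas first, but only until we'd drop back near the mean.
	sort.Slice(hotReplicas, func(i, j int) bool { return hotReplicas[i].qps > hotReplicas[j].qps })
	var toMove []replicaLoad
	remaining := local.qps
	for _, r := range hotReplicas {
		if remaining <= mean {
			break
		}
		toMove = append(toMove, r)
		remaining -= r.qps
	}

	// Candidate destinations are stores comfortably below the mean.
	var targets []storeLoad
	for _, s := range all {
		if s.storeID != local.storeID && s.qps < underfullThreshold {
			targets = append(targets, s)
		}
	}
	return toMove, targets
}

func main() {
	stores := []storeLoad{{1, 3000}, {2, 1200}, {3, 800}}
	hot := []replicaLoad{{101, 900}, {102, 600}, {103, 50}}
	move, targets := rebalanceTargets(stores[0], hot, stores)
	fmt.Println("replicas to move:", move, "candidate stores:", targets)
}
```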

@a-robinson a-robinson requested review from BramGruneir, nvanbenschoten and a team August 20, 2018 18:46
@cockroach-teamcity (Member)

This change is Reviewable

@a-robinson (Contributor, Author)

I take back what I said about this not working on tpc-c. As mentioned on slack, the problem was that running with --vmodule enabled destroys performance even when it doesn't cause many more logs to be written.

Results of a one hour tpc-c 10k run on 30 nodes:

$ roachprod ssh $CLUSTERNAME:31 "ulimit -n 10000 && ./workload.LATEST run tpcc --warehouses=10000 --ramp=600s --duration=3600s --tolerate-errors {pgurl:1-30}"
[...]
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 3600.1s        4         766336          212.9    301.1    268.4    536.9    872.4 103079.2  delivery
 3600.1s        4        7632043         2119.9    251.5    209.7    469.8    771.8 103079.2  newOrder
 3600.1s        4         766185          212.8     29.1     22.0     75.5    104.9   2550.1  orderStatus
 3600.1s        4        7662205         2128.3    194.4    151.0    402.7    738.2 103079.2  payment
 3600.1s        4         766331          212.9    108.7     96.5    184.5    268.4 103079.2  stockLevel

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
 3600.1s        4       17593100         4886.8    212.9    176.2    436.2    738.2 103079.2
Audit check 9.2.1.7: PASS
Audit check 9.2.2.5.1: PASS
Audit check 9.2.2.5.2: PASS
Audit check 9.2.2.5.3: PASS
Audit check 9.2.2.5.4: PASS
Audit check 9.2.2.5.5: PASS
Audit check 9.2.2.5.6: PASS

_elapsed_______tpmC____efc__avg(ms)__p50(ms)__p90(ms)__p95(ms)__p99(ms)_pMax(ms)
 3600.1s   127219.4  98.9%    251.5    209.7    385.9    469.8    771.8 103079.2

The pMax is not ideal, and I'm not sure what's causing the outlier(s). But otherwise this is a good sign that more tests/polish are justified.

@nvanbenschoten (Member) left a comment

:lgtm: but mostly just 👨‍🔬 🐶 on the StoreRebalancer changes. @BramGruneir do you mind taking a look at that?

Reviewed 6 of 6 files at r4, 1 of 1 files at r5, 1 of 1 files at r6.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/storage/allocator.go, line 383 at r4 (raw file):

		existingReplicas: len(existing),
		aliveStores:      aliveStoreCount,
		throttledStores:  throttledStoreCount,

nit: always 0, right?

@a-robinson (Contributor, Author) left a comment

Thanks @nvanbenschoten. Waiting on @BramGruneir isn't ideal since he's out this week, but I'll at least wait until I have metrics/debugging hooked up before merging, and will definitely wait for his review before cherrypicking.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/storage/allocator.go, line 383 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: always 0, right?

True, although I'd prefer to leave as is rather than assume it's always 0 in case the condition above ever gets removed.

@a-robinson a-robinson force-pushed the store-rebalance2 branch 4 times, most recently from eacd889 to c2a2041 Compare August 31, 2018 20:01
@a-robinson a-robinson requested a review from a team September 5, 2018 16:56
@a-robinson (Contributor, Author)

Plans for improving RelocateRange outlined in #29130

@BramGruneir (Member) left a comment

Gave this another full read through.

LGTM

Reviewed 3 of 3 files at r1, 9 of 9 files at r2, 3 of 3 files at r3, 1 of 6 files at r4, 13 of 13 files at r7.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

Follow-up to cockroachdb#28340, which did this for just leases.

Fixes cockroachdb#17979

Release note (performance improvement): Range replicas will be
automatically rebalanced throughout the cluster to even out the amount
of QPS being handled by each node.

This leaves properly cleaning up the code for later, but ensures that
the existing cluster setting will enable store-level rebalancing rather
than the old experimental write/disk-based rebalancing.

Release note: None

It's identical to the test for load-based lease rebalancing, just with
more than 3 nodes such that replicas must be rebalanced in addition to
leases in order for load to be properly spread across all nodes.

Release note: None

This cleans up all the old code, settings, and tests without massively
overhauling the structure of things. More could be done to simplify
things, but this is the least intrusive set of changes that seems
appropriate so late in the development cycle.

Release note (backwards-incompatible change): The experimental,
non-recommended stat-based rebalancing setting controlled by the
kv.allocator.stat_based_rebalancing.enabled and
kv.allocator.stat_rebalance_threshold cluster settings has been removed
and replaced by a new, better supported approach to load-based
rebalancing that can be controlled via the new
kv.allocator.load_based_rebalancing cluster setting. By default, leases
will be rebalanced within a cluster to achieve better QPS balance.
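
For context, a minimal sketch of how the new setting described in that release note might be enabled from a Go client, assuming the setting accepts a 'leases and replicas' mode in addition to the default lease-only behavior; the connection string and the exact mode value are assumptions, not taken from this PR.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // PostgreSQL-wire driver, which CockroachDB speaks
)

func main() {
	// Connection string is illustrative; adjust host/port/user for your cluster.
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Enable QPS-based rebalancing of replicas as well as leases
	// (assumed value; the default per the release note rebalances leases only).
	if _, err := db.Exec(
		`SET CLUSTER SETTING kv.allocator.load_based_rebalancing = 'leases and replicas'`,
	); err != nil {
		log.Fatal(err)
	}
}
```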
@a-robinson a-robinson force-pushed the store-rebalance2 branch 2 times, most recently from d8666a4 to 5bbc29e Compare September 6, 2018 18:49
@a-robinson (Contributor, Author)

TFTR!

bors r+

craig bot pushed a commit that referenced this pull request Sep 6, 2018
28852: storage: make load-based replica rebalancing decisions at the store level r=a-robinson a=a-robinson

Built on top of #28340, which is where the first 3 commits are from. This is still somewhat incomplete, in that it's missing unit tests and I'm only just now running tpc-c 10k on it. Sending out now to start the discussion of whether to include it in 2.1, since we're obviously very late in the intended development cycle.

Co-authored-by: Alex Robinson <[email protected]>
@craig (Contributor)

craig bot commented Sep 6, 2018

Build succeeded

@craig craig bot merged commit 5bbc29e into cockroachdb:master Sep 6, 2018
a-robinson added a commit to a-robinson/cockroach that referenced this pull request Sep 7, 2018
I missed this when reworking the settings in cockroachdb#28852.

Fixes cockroachdb#29804
Fixes cockroachdb#29805

Release note: None
craig bot pushed a commit that referenced this pull request Sep 7, 2018
29813: roachtest: Fix setting used in load-based rebalancing roachtests r=a-robinson a=a-robinson

I missed this when reworking the settings in #28852.

Fixes #29804
Fixes #29805

Release note: None

Co-authored-by: Alex Robinson <[email protected]>