
storage: make load-based replica rebalancing decisions at the store level #28852

Merged: 5 commits from store-rebalance2 into master, Sep 6, 2018

Conversation

a-robinson (Contributor)

Built on top of #28340, which is where the first 3 commits are from. This is still somewhat incomplete, in that it's missing unit tests and I'm only just now running tpc-c 10k on it. Sending out now to start the discussion of whether to include it in 2.1, since we're obviously very late in the intended development cycle.
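
For readers unfamiliar with the approach, here is a minimal, hypothetical Go sketch of the kind of decision a store-level, QPS-based rebalancer makes: compare the local store's QPS to the cluster mean and, if it is over a threshold, pick hot replicas to shed to under-loaded stores. The types (storeLoad, replicaLoad), the rebalanceTargets function, and the 10% thresholds are all illustrative assumptions, not the actual StoreRebalancer implementation from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

type storeLoad struct {
	storeID int
	qps     float64
}

type replicaLoad struct {
	rangeID int
	qps     float64
}

// rebalanceTargets returns, for an overloaded store, the replicas it would
// consider shedding and the stores it would consider moving them to.
// The 10% thresholds are illustrative, not the values used by the PR.
func rebalanceTargets(local storeLoad, hotReplicas []replicaLoad, all []storeLoad) ([]replicaLoad, []storeLoad) {
	var total float64
	for _, s := range all {
		total += s.qps
	}
	mean := total / float64(len(all))
	overfullThreshold := mean * 1.10
	underfullThreshold := mean * 0.90

	if local.qps <= overfullThreshold {
		return nil, nil // this store is not overloaded; nothing to do
	}

	// Shed the hottest replicas first, but only until we'd drop back near the mean.
	sort.Slice(hotReplicas, func(i, j int) bool { return hotReplicas[i].qps > hotReplicas[j].qps })
	var toMove []replicaLoad
	remaining := local.qps
	for _, r := range hotReplicas {
		if remaining <= mean {
			break
		}
		toMove = append(toMove, r)
		remaining -= r.qps
	}

	// Candidate destinations are stores comfortably below the mean.
	var targets []storeLoad
	for _, s := range all {
		if s.storeID != local.storeID && s.qps < underfullThreshold {
			targets = append(targets, s)
		}
	}
	return toMove, targets
}

func main() {
	stores := []storeLoad{{1, 3000}, {2, 1200}, {3, 800}}
	hot := []replicaLoad{{101, 900}, {102, 600}, {103, 50}}
	move, targets := rebalanceTargets(stores[0], hot, stores)
	fmt.Println("replicas to move:", move, "candidate stores:", targets)
}
```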

@a-robinson a-robinson requested review from BramGruneir, nvanbenschoten and a team August 20, 2018 18:46
@cockroach-teamcity (Member)

This change is Reviewable

@a-robinson (Contributor, Author)

I take back what I said about this not working on tpc-c. As mentioned on slack, the problem was that running with --vmodule enabled destroys performance even when it doesn't cause many more logs to be written.

Results of a one hour tpc-c 10k run on 30 nodes:

$ roachprod ssh $CLUSTERNAME:31 "ulimit -n 10000 && ./workload.LATEST run tpcc --warehouses=10000 --ramp=600s --duration=3600s --tolerate-errors {pgurl:1-30}"
[...]
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 3600.1s        4         766336          212.9    301.1    268.4    536.9    872.4 103079.2  delivery
 3600.1s        4        7632043         2119.9    251.5    209.7    469.8    771.8 103079.2  newOrder
 3600.1s        4         766185          212.8     29.1     22.0     75.5    104.9   2550.1  orderStatus
 3600.1s        4        7662205         2128.3    194.4    151.0    402.7    738.2 103079.2  payment
 3600.1s        4         766331          212.9    108.7     96.5    184.5    268.4 103079.2  stockLevel

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
 3600.1s        4       17593100         4886.8    212.9    176.2    436.2    738.2 103079.2
Audit check 9.2.1.7: PASS
Audit check 9.2.2.5.1: PASS
Audit check 9.2.2.5.2: PASS
Audit check 9.2.2.5.3: PASS
Audit check 9.2.2.5.4: PASS
Audit check 9.2.2.5.5: PASS
Audit check 9.2.2.5.6: PASS

_elapsed_______tpmC____efc__avg(ms)__p50(ms)__p90(ms)__p95(ms)__p99(ms)_pMax(ms)
 3600.1s   127219.4  98.9%    251.5    209.7    385.9    469.8    771.8 103079.2

The pMax is not ideal, and I'm not sure what's causing the outlier(s). But otherwise this is a good sign that more tests/polish are justified.

@nvanbenschoten (Member) left a comment

:lgtm: but mostly just 👨‍🔬 🐶 on the StoreRebalancer changes. @BramGruneir do you mind taking a look at that?

Reviewed 6 of 6 files at r4, 1 of 1 files at r5, 1 of 1 files at r6.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/storage/allocator.go, line 383 at r4 (raw file):

		existingReplicas: len(existing),
		aliveStores:      aliveStoreCount,
		throttledStores:  throttledStoreCount,

nit: always 0, right?

@a-robinson (Contributor, Author) left a comment

Thanks @nvanbenschoten. Waiting on @BramGruneir isn't ideal since he's out this week, but I'll at least wait until I have metrics/debugging hooked up before merging, and will definitely wait for his review before cherrypicking.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/storage/allocator.go, line 383 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: always 0, right?

True, although I'd prefer to leave as is rather than assume it's always 0 in case the condition above ever gets removed.

@a-robinson a-robinson force-pushed the store-rebalance2 branch 4 times, most recently from eacd889 to c2a2041 Compare August 31, 2018 20:01
@a-robinson a-robinson requested a review from a team September 5, 2018 16:56
@a-robinson (Contributor, Author)

Plans for improving RelocateRange outlined in #29130

@BramGruneir (Member) left a comment

Gave this another full read through.

LGTM

Reviewed 3 of 3 files at r1, 9 of 9 files at r2, 3 of 3 files at r3, 1 of 6 files at r4, 13 of 13 files at r7.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

Follow-up to cockroachdb#28340, which did this for just leases.

Fixes cockroachdb#17979

Release note (performance improvement): Range replicas will be
automatically rebalanced throughout the cluster to even out the amount
of QPS being handled by each node.

This leaves properly cleaning up the code for later, but ensures that
the existing cluster setting will enable store-level rebalancing rather
than the old experimental write/disk-based rebalancing.

Release note: None

It's identical to the test for load-based lease rebalancing, just with
more than 3 nodes such that replicas must be rebalanced in addition to
leases in order for load to be properly spread across all nodes.

Release note: None

This cleans up all the old code, settings, and tests without massively
overhauling the structure of things. More could be done to simplify
things, but this is the least intrusive set of changes that seems
appropriate so late in the development cycle.

Release note (backwards-incompatible change): The experimental,
non-recommended stat-based rebalancing setting controlled by the
kv.allocator.stat_based_rebalancing.enabled and
kv.allocator.stat_rebalance_threshold cluster settings has been removed
and replaced by a new, better supported approach to load-based
rebalancing that can be controlled via the new
kv.allocator.load_based_rebalancing cluster setting. By default, leases
will be rebalanced within a cluster to achieve better QPS balance.
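
For context, a minimal sketch of how the new setting described in that release note might be enabled from a Go client, assuming the setting accepts a 'leases and replicas' mode in addition to the default lease-only behavior; the connection string and the exact mode value are assumptions, not taken from this PR.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // PostgreSQL-wire driver, which CockroachDB speaks
)

func main() {
	// Connection string is illustrative; adjust host/port/user for your cluster.
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Enable QPS-based rebalancing of replicas as well as leases
	// (assumed value; the default per the release note rebalances leases only).
	if _, err := db.Exec(
		`SET CLUSTER SETTING kv.allocator.load_based_rebalancing = 'leases and replicas'`,
	); err != nil {
		log.Fatal(err)
	}
}
```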
@a-robinson a-robinson force-pushed the store-rebalance2 branch 2 times, most recently from d8666a4 to 5bbc29e Compare September 6, 2018 18:49
@a-robinson (Contributor, Author)

TFTR!

bors r+

craig bot pushed a commit that referenced this pull request Sep 6, 2018
28852: storage: make load-based replica rebalancing decisions at the store level r=a-robinson a=a-robinson

Built on top of #28340, which is where the first 3 commits are from. This is still somewhat incomplete, in that it's missing unit tests and I'm only just now running tpc-c 10k on it. Sending out now to start the discussion of whether to include it in 2.1, since we're obviously very late in the intended development cycle.

Co-authored-by: Alex Robinson <[email protected]>
@craig (Contributor)

craig bot commented Sep 6, 2018

Build succeeded

@craig craig bot merged commit 5bbc29e into cockroachdb:master Sep 6, 2018
a-robinson added a commit to a-robinson/cockroach that referenced this pull request Sep 7, 2018
I missed this when reworking the settings in cockroachdb#28852.

Fixes cockroachdb#29804
Fixes cockroachdb#29805

Release note: None
craig bot pushed a commit that referenced this pull request Sep 7, 2018
29813: roachtest: Fix setting used in load-based rebalancing roachtests r=a-robinson a=a-robinson

I missed this when reworking the settings in #28852.

Fixes #29804
Fixes #29805

Release note: None

Co-authored-by: Alex Robinson <[email protected]>