Introduce adaptive tasklist scaler #6506

Merged
@Shaddoll merged 3 commits into cadence-workflow:master from the partition branch on Nov 25, 2024

Conversation

@Shaddoll (Member) commented Nov 18, 2024

This PR introduces a new component, AdaptiveScaler, to Matching's Task List Manager. This component runs only in the root partition of a Normal task list and is turned on only if the following 2 dynamic config properties are set to true:

  • MatchingEnableAdaptiveScaler
  • MatchingEnableGetNumberOfPartitionsFromCache

This component monitors the add-task QPS of the root partition of a task list and decides whether the task list needs more partitions or whether the number of partitions should be decreased. It is based on the assumption that the add-task QPS is evenly distributed among all task list partitions.

When the adaptive scaler decides to increase the number of partitions, it increases the number of read partitions and write partitions at the same time. When it decreases the number of partitions, it decreases the number of write partitions first; the number of read partitions is decreased only after all the backlog of those read partitions is drained.
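
As a rough illustration of that asymmetric behavior, here is a minimal Go sketch; the function and helper names (adjustPartitions, backlogDrained) are illustrative and not part of this PR:

```go
// adjustPartitions sketches the scaling behavior described above.
// desired is the partition count suggested by the QPS-based check;
// backlogDrained reports whether the partitions beyond desired have
// no remaining backlog.
func adjustPartitions(readCount, writeCount, desired int, backlogDrained func(keep int) bool) (newRead, newWrite int) {
	if desired > writeCount {
		// Scaling up: read and write partitions increase together.
		return desired, desired
	}
	if desired < writeCount {
		// Scaling down: stop writing to the extra partitions first ...
		writeCount = desired
		// ... and shrink the read range only once their backlog is drained.
		if readCount > desired && backlogDrained(desired) {
			readCount = desired
		}
	}
	return readCount, writeCount
}
```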

The component is configured by the following 5 dynamic config properties:

  • MatchingPartitionUpscaleRPS: defaults to 200
  • MatchingPartitionDownscaleFactor: defaults to 0.75
  • MatchingPartitionUpscaleSustainedDuration: defaults to 1 minute
  • MatchingPartitionDownscaleSustainedDuration: defaults to 2 minutes
  • MatchingAdaptiveScalerUpdateInterval: defaults to 15 seconds

MatchingAdaptiveScalerUpdateInterval configures how often the scaler checks the QPS.

MatchingPartitionUpscaleSustainedDuration determines the minimum duration a high load must be sustained on a matching task list before the number of partitions is increased.

MatchingPartitionDownscaleSustainedDuration determines the minimum duration a low load must be sustained on a matching task list before the number of partitions is decreased.
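
For orientation, a hedged sketch of how these knobs might surface inside the scaler; the function-valued fields mirror the `a.config.PartitionDownscaleSustainedDuration()` call visible in the review snippet further down, but the exact struct layout here is an assumption, not the PR's code:

```go
package tasklist

import "time"

// adaptiveScalerConfig is an illustrative stand-in for the dynamic config
// hooks the scaler consults on every update interval. Each field is a
// function so the latest dynamic config value is picked up at runtime.
type adaptiveScalerConfig struct {
	PartitionUpscaleRPS                 func() int           // MatchingPartitionUpscaleRPS, default 200
	PartitionDownscaleFactor            func() float64       // MatchingPartitionDownscaleFactor, default 0.75
	PartitionUpscaleSustainedDuration   func() time.Duration // default 1 minute
	PartitionDownscaleSustainedDuration func() time.Duration // default 2 minutes
	AdaptiveScalerUpdateInterval        func() time.Duration // default 15 seconds
}
```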

High load definition:

total QPS > MatchingPartitionUpscaleRPS * Number of Write Partitions

Low load definition:

total QPS < MatchingPartitionUpscaleRPS * (Number of Write Partitions - 1) * MatchingPartitionDownscaleFactor
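
Restated as code, and reusing the hypothetical adaptiveScalerConfig sketched above, the two checks look roughly like this (totalQPS is the add-task QPS estimated from the root partition under the even-distribution assumption):

```go
// isHighLoad restates the high-load definition above.
func isHighLoad(cfg adaptiveScalerConfig, totalQPS float64, numWritePartitions int) bool {
	return totalQPS > float64(cfg.PartitionUpscaleRPS()*numWritePartitions)
}

// isLowLoad restates the low-load definition above.
func isLowLoad(cfg adaptiveScalerConfig, totalQPS float64, numWritePartitions int) bool {
	return totalQPS < float64(cfg.PartitionUpscaleRPS())*
		float64(numWritePartitions-1)*
		cfg.PartitionDownscaleFactor()
}
```

The two sustained-duration properties then require the corresponding condition to hold for the configured time, checked every MatchingAdaptiveScalerUpdateInterval, before the partition count is actually changed.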

Other minor changes:

  • Disable manual updates of TaskListPartitionConfig if the adaptive scaler is turned on
  • Do not accept add-task requests if the partition has been removed
  • If the current partition config is nil and we want to update the number of partitions to 1, it should be a no-op

Resolved review threads on service/matching/tasklist/task_list_manager.go and service/matching/tasklist/adaptive_scaler.go. The following review thread discusses this snippet from the adaptive scaler:

a.underLoad = true
a.underLoadStartTime = a.timeSource.Now()
} else if a.timeSource.Now().Sub(a.underLoadStartTime) > a.config.PartitionDownscaleSustainedDuration() {
numWritePartitions = getNumberOfPartitions(partitionConfig.NumWritePartitions, qps, upscaleThreshold) // NOTE: this has to be upscaleThreshold
Member:

I think this approach can make some counterintuitive scaling decisions. Any time (upscaleThreshold - downscaleThreshold) * numWritePartitions > (2 * upscaleThreshold) we're only able to scale down by multiple partitions at once.

For example, if we have thresholds of (500, 1000) with 10,000 global traffic then we would end up with 10 partitions and we would never scale down until global traffic drops below 5,000, at which point we dramatically scale from 10 partitions to 5. If traffic goes back to 5,001 then we'd scale up to 6 partitions, but we'd only scale down again if traffic drops below 3,000.

This approach also can end up in kind of strange scenarios where we're continually underLoad but we never change the number of partitions. If we have thresholds of (500, 600) with 1800 global traffic then we would have 3 partitions. When traffic drops to 1499 we're considered underLoad and would try to update numWritePartitions but it won't actually change the value until traffic drops below 1200.

Member Author (@Shaddoll):

After thinking about this, I think that to avoid fluctuation the thresholds need to satisfy this inequality:

downscale threshold <= upscale threshold * N / (N + 1) (for all N)

Let's take your (500, 600) thresholds as an example: if the global QPS is 1801, then we would have 4 partitions. But with 4 partitions, the load of each partition will be around 450, which is less than 500, so we will consider downscaling. When the downscale operation is triggered, we have to recalculate the number of partitions based on the QPS of the root partition. The per-partition estimate is around 450, so it could be 451 or 449. If the estimate is larger than 450, we still get 4 partitions, so nothing changes. But if the estimate is 449, a downscale will be triggered.
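
Spelling out where that inequality comes from, with U the per-partition upscale threshold, D the downscale threshold, and N the current number of write partitions: upscaling from N to N+1 partitions fires once the total QPS exceeds U * N, so right after the upscale the per-partition load is just above U * N / (N + 1), and to avoid flip-flopping straight back down that load must stay at or above D:

```latex
D \;\le\; \frac{U \cdot N}{N + 1} \qquad \text{for all } N \ge 1
```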

@Shaddoll force-pushed the partition branch 2 times, most recently from f2c4533 to ffbd993 on November 22, 2024 at 03:46
@Shaddoll force-pushed the partition branch 2 times, most recently from 5af33e8 to ef9b3c4 on November 22, 2024 at 18:03
Resolved review threads on common/dynamicconfig/constants.go (2) and service/matching/tasklist/adaptive_scaler.go (4). The next review thread discusses this snippet from the adaptive scaler:

a.overLoad = false
}
} else {
a.overLoad = false
Member:

This implementation requires consecutive overloaded windows to scale up. If QPS drops momentarily for some reason (rate limit, QPS tracker calculation issue, etc.) then we will not be able to scale up.
Have you considered doing this calculation more often (every 1s instead of every 15s) and generating a series of overloaded/not-overloaded results? After a minute you would have 60 data points representing whether it was overloaded for that particular second. Then determine whether scale-up is needed based on whether it was overloaded more than half of the time.
I just thought of this idea so it may not be the ideal solution, but something to address the consecutiveness requirement would be needed IMO.
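
A rough Go sketch of that rolling-window idea (this is only the reviewer's suggestion, not code from this PR; the one-second sampling, 60-sample window, and majority rule are all assumptions):

```go
// overloadWindow keeps one overloaded/not-overloaded sample per tick
// (for example, one per second) and recommends scaling up when more than
// half of the samples in the window were overloaded, instead of requiring
// the overload to be continuous.
type overloadWindow struct {
	samples []bool
	next    int
	filled  bool
}

func newOverloadWindow(size int) *overloadWindow {
	return &overloadWindow{samples: make([]bool, size)}
}

// Record stores the latest sample, overwriting the oldest one.
func (w *overloadWindow) Record(overloaded bool) {
	w.samples[w.next] = overloaded
	w.next = (w.next + 1) % len(w.samples)
	if w.next == 0 {
		w.filled = true
	}
}

// ShouldUpscale reports whether a full window has been collected and the
// majority of its samples were overloaded.
func (w *overloadWindow) ShouldUpscale() bool {
	if !w.filled {
		return false
	}
	overloadedCount := 0
	for _, s := range w.samples {
		if s {
			overloadedCount++
		}
	}
	return overloadedCount > len(w.samples)/2
}
```

In this sketch the scaler would call Record on every sample tick and consult ShouldUpscale on its usual update interval.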

Member Author (@Shaddoll):

we can discuss this offline, but I think with downscaleFactor we can handle fluctuation. The default factor is 0.75, which means the number of partitions won't change unless the traffic drops by 25%.

Member:

I think we need the matching simulator to validate different options for the scaling formula. I assume you will update the simulation next and iterate on this. If so, this looks like a good start. Can you confirm?

Member Author (@Shaddoll):

It doesn't fit the simulation framework, because the output of the simulation tests assumes that the number of partitions doesn't change. I can run bench tests instead.

Member:

The simulation framework can be enhanced to support this. The feedback loop is faster in simulations, and it is also more repeatable than benchmarking in the dev environment, so let's invest in this.

@Shaddoll merged commit 5b2be37 into cadence-workflow:master on Nov 25, 2024
17 checks passed
@Shaddoll deleted the partition branch on November 25, 2024 at 22:17