Deployments scaled down very quickly despite low maxDisruption #143
If secondsToProcessOneJob is specified, the WPA will keep your minimum workers based on the RPM on the queue, rather than just minReplicas. Basically this is the logic:

```go
workersBasedOnMessagesSent := int32(math.Ceil((secondsToProcessOneJob * messagesSentPerMinute) / 60))
if workersBasedOnMessagesSent > minWorkers {
	return workersBasedOnMessagesSent
}
return minWorkers
```

Can you try this and let us know if it helps?
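For example (illustrative numbers, not from this thread): with secondsToProcessOneJob = 10 and 120 messages sent per minute, workersBasedOnMessagesSent = ceil((10 × 120) / 60) = 20, so the WPA would hold at least 20 workers even if minReplicas is set lower.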
@michael-careplanner I think the resync-period may be set to a very low value in your config.
@justjkk Looking at our config, resync-period is actually set to 40, up from the default of 20. Agreed that having it set to a very low value wouldn't make much sense, but that isn't the case in our config. @alok87 Thanks for your reply. I am planning to look at secondsToProcessOneJob.
Possible to share the logs of the WPA for a particular queue?
Not sure if this is a large enough snippet of logs, but it does show the rapid scale down.
Resync-period is not being respected. I see the control loop executing every 200ms instead of every 20 seconds. I think the event handlers need to use the resync period. Made a change. Can you try this image:
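A minimal sketch of the kind of change being described here, assuming client-go informers; the function and parameter names are illustrative, not the actual WPA code. The point is that AddEventHandler inherits the informer's default resync, while AddEventHandlerWithResyncPeriod lets the controller pass the configured resync-period explicitly:

```go
package controller

import (
	"time"

	"k8s.io/client-go/tools/cache"
)

// registerHandlers registers the WPA event handlers with an explicit resync
// period (e.g. the value of the resync-period flag) instead of relying on
// the informer's default. All names here are assumptions for illustration.
func registerHandlers(informer cache.SharedIndexInformer, resyncPeriod time.Duration, enqueue func(obj interface{})) {
	informer.AddEventHandlerWithResyncPeriod(cache.ResourceEventHandlerFuncs{
		AddFunc: enqueue,
		UpdateFunc: func(oldObj, newObj interface{}) {
			enqueue(newObj)
		},
		DeleteFunc: enqueue,
	}, resyncPeriod)
}
```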
Tested using that and I am seeing some 40-second pauses (in line with our resync-period of 40). But it seems that once it has decided to do a scale down, it still loops in the controller without respecting the resync-period.
OK, I think because we are updating the status in the WPA every reconciliation loop, it is creating a new event every time without respecting the resync. Doing the WPA status update only when something has changed, and treating everything else as a no-op, can prevent this.
@michael-careplanner can you confirm the 200ms loop is happening for the queues/WPA objects, i.e. the check below is falling through to the else branch:

```go
// Skip the status update when nothing has changed; otherwise fall
// through and update, which itself emits a new event.
if workerPodAutoScaler.Status.CurrentReplicas == currentWorkers &&
	workerPodAutoScaler.Status.AvailableReplicas == availableWorkers &&
	workerPodAutoScaler.Status.DesiredReplicas == desiredWorkers &&
	workerPodAutoScaler.Status.CurrentMessages == queueMessages {
	klog.V(4).Infof("%s/%s: WPA status is already up to date\n", namespace, name)
	return
} else {
	klog.V(4).Infof("%s/%s: Updating wpa status\n", namespace, name)
}
```
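(In other words, summarizing the mechanism under discussion: every status write on the WPA object emits an update event, which the informer hands straight back to the controller, so unconditionally updating status turns each reconciliation into an immediate re-queue. Skipping the write when nothing has changed breaks that feedback loop.)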
Yes, I see lots of Updating wpa status messages while it's doing the 200ms reconciliation loop, followed by a single WPA status is already up to date.
Yes, then it is confirmed: the status update is leading to the very fast re-queue. We can't stop updating the status, since queue messages and other details are refreshed with every update, and the dashboards are built on top of these status fields. We may have to consider a scale-down cool-off after all:
Hello @michael-careplanner, added a scaleDownDelay (defaulting to 10 minutes). I have written test cases for it, but have not tested it in any Kubernetes cluster yet. Here is the image:
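A minimal sketch of what such a cool-off check could look like, under the assumption that the controller records the time of the last scale activity in the WPA status. The names lastScaleTime and scaleDownDelay follow the discussion in this thread, but the function itself is illustrative, not the actual implementation:

```go
package controller

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// canScaleDown reports whether enough time has passed since the last scale
// activity to allow another scale down. lastScaleTime starts as nil, so the
// very first scale operation is always allowed.
func canScaleDown(lastScaleTime *metav1.Time, scaleDownDelay time.Duration, now time.Time) bool {
	if lastScaleTime == nil {
		return true
	}
	return now.Sub(lastScaleTime.Time) >= scaleDownDelay
}
```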
Just tested this and I'm hitting an issue where it infers the wrong scale operation. Looks like GetScaleOperation expects different inputs than it is being passed.
@michael-careplanner Fixed it here. New image with the above change:
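For reference, a hypothetical sketch of what scale-operation inference of this shape typically looks like; GetScaleOperation is the name used in this thread, but the signature, constants, and body below are assumptions for illustration only:

```go
package controller

// ScaleOperation classifies a reconciliation decision. These constants are
// illustrative, not the actual WPA types.
type ScaleOperation int

const (
	ScaleNoop ScaleOperation = iota
	ScaleUp
	ScaleDown
)

// getScaleOperation infers the scale type by comparing the desired worker
// count against the current one.
func getScaleOperation(desiredWorkers, currentWorkers int32) ScaleOperation {
	switch {
	case desiredWorkers > currentWorkers:
		return ScaleUp
	case desiredWorkers < currentWorkers:
		return ScaleDown
	default:
		return ScaleNoop
	}
}
```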
Tested that version; it now correctly infers the scale type, but it's still not respecting the cooldown. Log:
I've had a good look over your PR but can't see where the bug is...
I am guessing the status update for lastScaleTime is not being reflected in your WPA object. It should get updated after every scale activity, and it should start as nil. I am testing this in one of our clusters.
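As a sketch of the bookkeeping being described (a pared-down stand-in for the real status struct; field and function names are assumptions), the controller would stamp the status after every actual scale operation so a cool-off check like the one sketched earlier has a reference point:

```go
package controller

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// wpaStatus is a pared-down stand-in for the WPA status struct; the real
// type lives in the project's API package.
type wpaStatus struct {
	LastScaleTime *metav1.Time
}

// recordScaleActivity stamps the status after every actual scale operation.
// LastScaleTime starts as nil, which lets the first scale proceed
// immediately.
func recordScaleActivity(status *wpaStatus, scaled bool) {
	if !scaled {
		return // no scale activity, keep the previous timestamp
	}
	now := metav1.Now()
	status.LastScaleTime = &now
}
```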
@michael-careplanner Hey! The issue was the lastScaleTime status update not being reflected; it's fixed now. Thanks for bringing out this issue!! I saw the same problem in one of our high-throughput queues. Here is the dashboard image after this change; the scale down becomes smooth 👍🏼 (this worker did not set scaleDownDelay, so the default applied). If someone still needs the old behaviour, they would need to set the scaleDownDelay to a lower value than the default of 10 mins. The image below has the fix, let me know how it works out for you:
@michael-careplanner This is good to release from our end. Working perfectly in our latest cluster. Did you get the chance to test it?
@alok87 Yeah, this is working well on our end now. Thanks very much for all your work on this!
We have several deployments scaled by WPA on fairly spiky queues. In order to limit how spiky our scale downs are, we set maxDisruption to a low value.
However, we still see scale downs happening very quickly. Looking at the logs, it seems WPA is making scaling decisions several times a second. For example, in one instance, despite having maxDisruption set to 1%, we saw a deployment scale from 53 replicas to 30 replicas in 5 seconds. Each time the deployment was decreased by 1 replica (so obeying the maxDisruption value set), but WPA made 26 individual scaling decisions in those 5 seconds.
I think a (possibly configurable) cooldown needs adding after each scaling decision, so that fast scale downs can actually be prevented by setting a low maxDisruption.
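To make the arithmetic concrete with the numbers reported above: 1% of 53 replicas rounds up to 1 replica per decision, so each individual step obeyed maxDisruption; but at 26 decisions in 5 seconds (roughly one every 200ms), a cap on step size does nothing to limit the step rate. That is why a cooldown between decisions is needed in addition to maxDisruption.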