
Autoscaling controller should not calculate desired replicas including out-of-scaleTargetRef RunnerDeployment #330

Closed
tapih opened this issue Feb 18, 2021 · 2 comments · Fixed by #355

Comments

tapih (Contributor) commented Feb 18, 2021

What happened:

I tried using HorizontalRunnerAutoscaler with multiple RunnerDeployments existing in the same namespace,
and found that the autoscaling controller does not work properly.

  1. Install actions-runner-controller following the getting started guide
  2. Create two RunnerDeployments, named example-runner-deployment and example-runner-deployment-2, with 2 replicas each
  3. Create a HorizontalRunnerAutoscaler for example-runner-deployment with maxReplicas: 4, scaleUpThreshold: '0.6', and scaleDownFactor: '1'

Then, a reconciliation was triggered right after HorizontalRunnerAutoscaler was created and the autoscaling controller created 2 new Runners under example-runner-deployment, even though there was no Runner with the Busy status.

So, there were 4 Runners for example-runner-deployment and 2 Runners for example-runner-deployment-2 in the end.

What I expected to happen:

No Runners are created for example-runner-deployment.

I read this part of the code, and my guess is that
the autoscaling controller calculates the desired replicas for the scaleTargetRef from all of the Runners existing in the same namespace.
In my case, the desired replica count appears to be computed across both deployments, i.e. example-runner-deployment + example-runner-deployment-2 = 4,
and the controller log indeed reports computed_replicas_desired as 4.

2021-02-18T05:50:25.010Z        DEBUG   controllers.HorizontalRunnerAutoscaler  Calculated desired replicas     {"computed_replicas_desired": 4, "spec_replicas_min": 1, "spec_replicas_max": 4, "current_replicas": 4, "num_runners": 6, "num_runners_busy": 0}
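To make the suspicion concrete, here is a minimal, self-contained sketch (not the controller's actual code) of the difference between counting every Runner in the namespace and counting only the Runners owned by the scaleTargetRef. The runner struct, its fields, and the countRunners helper are purely illustrative:

package main

import "fmt"

type runner struct {
	ownerDeployment string // hypothetical: which RunnerDeployment owns this Runner
	busy            bool
}

// countRunners tallies total and busy Runners, optionally restricted to one
// RunnerDeployment. An empty owner means "count everything in the namespace",
// which is the behaviour this issue suspects the controller has today.
func countRunners(runners []runner, owner string) (total, busy int) {
	for _, r := range runners {
		if owner != "" && r.ownerDeployment != owner {
			continue
		}
		total++
		if r.busy {
			busy++
		}
	}
	return total, busy
}

func main() {
	// Both RunnerDeployments from this report, two idle Runners each.
	inNamespace := []runner{
		{"example-runner-deployment", false}, {"example-runner-deployment", false},
		{"example-runner-deployment-2", false}, {"example-runner-deployment-2", false},
	}

	// Unfiltered: 4 Runners counted even though the target only owns 2.
	t, b := countRunners(inNamespace, "")
	fmt.Printf("namespace-wide: total=%d busy=%d\n", t, b)

	// Filtered by the scaleTargetRef: only the target's own Runners.
	t, b = countRunners(inNamespace, "example-runner-deployment")
	fmt.Printf("target only:    total=%d busy=%d\n", t, b)
}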

The controller should scale out Runners by filtering the targets using labels, but currently it does not seem to manage Runners with label selectors the way a Kubernetes ReplicaSet does.
IMHO, we should first fix the runner replicaset controller to manage Runners via label selectors, and then fix the autoscaling controller to filter Runners by those labels, roughly as sketched below.
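A rough sketch of the filtering side of that proposal, assuming a controller-runtime client and the project's v1alpha1 RunnerList type; the label key "runner-deployment-name" is hypothetical and the project would need to define and stamp such a label on the Runners it creates:

package controllers

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/client"

	v1alpha1 "github.com/summerwind/actions-runner-controller/api/v1alpha1"
)

// listTargetRunners lists only the Runners belonging to the scaleTargetRef
// RunnerDeployment, similar to how a ReplicaSet selects its Pods.
func listTargetRunners(ctx context.Context, c client.Client, namespace, rdName string) (*v1alpha1.RunnerList, error) {
	var runners v1alpha1.RunnerList
	err := c.List(ctx, &runners,
		client.InNamespace(namespace),
		// Hypothetical label stamped on every Runner owned by the RunnerDeployment.
		client.MatchingLabels{"runner-deployment-name": rdName},
	)
	return &runners, err
}

The autoscaler would then compute the busy percentage from this filtered list instead of from every Runner in the namespace.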

What do you think about it? If you think it's reasonable, I'm willing to fix it as I wrote.

Thank you!

Environment:

OS: ubuntu 20.04
k8s cluster: v1.19.7
k8s bootstrap: kind v0.10.0
actions-runner-controller: v0.16.1
actions-runner: v2.276.1

Manifests:

apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runner-deployment
  namespace: actions-runner-system
spec:
  replicas: 2
  template:
    spec:
      organization: <ORGNAME>
      labels:
        - custom-runner1
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runner-deployment-2
  namespace: actions-runner-system
spec:
  replicas: 2
  template:
    spec:
      organization: <ORGNAME>
      labels:
        - custom-runner2
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-runner-deployment-autoscaler
  namespace: actions-runner-system
spec:
  scaleTargetRef:
    name: example-runner-deployment
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: PercentageRunnersBusy
    scaleUpThreshold: '0.6'
    scaleDownThreshold: '0.1'
    scaleUpFactor: '1.5'
    scaleDownFactor: '1'
    repositoryNames:
    - <ORGNAME>/<REPONAME>
mumoshu (Collaborator) commented Feb 18, 2021

@tapih Hey! Thanks for the report and the proposal. Yeah, that does make sense. Recently I had a chance to read https://github.com/summerwind/actions-runner-controller/blob/be133228166274c1b03716e511c1051199c760a8/controllers/autoscaling.go#L260-L262 as well and thought it did look problematic, but I had no confidence since no one had mentioned this issue so far!

If you think it's reasonable, I'm willing to fix it as I wrote.

That would be very much appreciated ☺️

tapih (Contributor, Author) commented Feb 18, 2021

@mumoshu
Sure. I'm gonna start working on it soon. Thank you.
