
reconcilePeriod is not honored when installing HorizontalPodAutoscalers #3708

Closed
faust64 opened this issue Aug 10, 2020 · 7 comments
Labels: language/ansible (Issue is related to an Ansible operator project), triage/needs-information (Indicates an issue needs more information in order to work on it.)


faust64 commented Aug 10, 2020

Bug Report

What did you do?

I'm trying to have an Ansible operator install an HPA alongside my deployments.

Overall it works fine. However, while looking at the metrics exported by my operator, I realized that any playbook with HPA configuration enabled keeps running in a loop - as if some object had been changed, either manually or during the last playbook run.

I could confirm that when HPA configuration is disabled, the same playbooks only apply once every reconcilePeriod, as defined in my watches.yaml.
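
For reference, a watch entry in my watches.yaml looks roughly like this (a minimal sketch: the group and kind match the owner reference shown in the dump below, while the playbook path and period are placeholders rather than my actual values):

- version: v1beta1
  group: wopla.io
  kind: Directory
  playbook: /opt/ansible/playbooks/directory.yml
  reconcilePeriod: 6h  # placeholder period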

When managing HPAs, my operator would create the following object, using the k8s plugin:

- apiVersion: autoscaling/v1
  kind: HorizontalPodAutoscaler
  metadata:
    name: cpu-autoscale-ssp-demo
    namespace: localrelease
  spec:
    maxReplicas: 4
    minReplicas: 1
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: ssp-demo
    targetCPUUtilizationPercentage: 75

The operator does not update that object unless its spec actually needs to change - so once the first run has created the HPAs, I see the operator skipping every step that creates or patches them.
In the end, the operator does not apply anything, yet that playbook keeps running.

As far as I understand, this has to do with HPAs being regularly updated by a controller. That somehow triggers a new Ansible run, even though in this specific case it really isn't necessary.

If I dump that object, I can see, among other things, who last updated it:

$ kubectl get -o yaml -n localrelease hpa cpu-autoscale-ssp-demo
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2020-08-02T13:26:23Z","reason":"ReadyForNewScale","message":"recommended
      size matches current size"},{"type":"ScalingActive","status":"True","lastTransitionTime":"2020-08-02T13:27:41Z","reason":"ValidMetricFound","message":"the
      HPA was able to successfully calculate a replica count from cpu resource utilization
      (percentage of request)"},{"type":"ScalingLimited","status":"False","lastTransitionTime":"2020-07-17T15:07:36Z","reason":"DesiredWithinRange","message":"the
      desired count is within the acceptable range"}]'
    autoscaling.alpha.kubernetes.io/current-metrics: '[{"type":"Resource","resource":{"name":"cpu","currentAverageUtilization":3,"currentAverageValue":"6m"}}]'
  creationTimestamp: "2020-07-17T15:06:49Z"
  managedFields:
  - apiVersion: autoscaling/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:ownerReferences:
          .: {}
          k:{"uid":"121293cd-c2b0-432b-8efe-2abb29cd30d2"}:
            .: {}
            f:apiVersion: {}
            f:kind: {}
            f:name: {}
            f:uid: {}
      f:spec:
        f:maxReplicas: {}
        f:minReplicas: {}
        f:scaleTargetRef:
          f:apiVersion: {}
          f:kind: {}
          f:name: {}
        f:targetCPUUtilizationPercentage: {}
    manager: Swagger-Codegen
    operation: Update
    time: "2020-07-17T15:06:49Z"
  - apiVersion: autoscaling/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:autoscaling.alpha.kubernetes.io/conditions: {}
          f:autoscaling.alpha.kubernetes.io/current-metrics: {}
      f:status:
        f:currentCPUUtilizationPercentage: {}
        f:currentReplicas: {}
        f:desiredReplicas: {}
    manager: kube-controller-manager
    operation: Update
    time: "2020-08-10T10:02:41Z"
  name: cpu-autoscale-ssp-demo
  namespace: localrelease
  ownerReferences:
  - apiVersion: wopla.io/v1beta1
    kind: Directory
    name: demo
    uid: 121293cd-c2b0-432b-8efe-2abb29cd30d2
  resourceVersion: "38768057"
  selfLink: /apis/autoscaling/v1/namespaces/localrelease/horizontalpodautoscalers/cpu-autoscale-ssp-demo
  uid: 3a8354d7-d53f-4064-98cd-c94e41681330
spec:
  maxReplicas: 4
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ssp-demo
  targetCPUUtilizationPercentage: 75
status:
  currentCPUUtilizationPercentage: 3
  currentReplicas: 1
  desiredReplicas: 1

The managedFields array shows, in addition to the Swagger-Codegen edit, one made by kube-controller-manager, whose time frequently changes to match the current date.

What did you expect to see?

I would expect the reconcilePeriod defined in my watches.yaml to be honored, regardless of whether my operator manages HPAs or not.

What did you see instead? Under which circumstances?

When installing HPAs, reconcilePeriod is "ignored", as some controller constantly updates the HPAs - child resources of objects watched by my operator.

Environment

  • operator-sdk version: 0.19.2

  • go version: (whichever ships in that image)

  • Kubernetes version information:

Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T16:58:53Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.3", GitCommit:"2e7996e3e2712684bc73f0dec0200d64eec7fe40", GitTreeState:"clean", BuildDate:"2020-05-20T12:43:34Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

  • Kubernetes cluster kind:

vanilla / kube-spray, calico, containerd

  • Are you writing your operator in ansible, helm, or go?

Ansible

jberkhahn self-assigned this Aug 10, 2020
jberkhahn (Contributor) commented:

Hey there, I've taken a look at this and been able to reproduce the behavior. A couple points:

  1. The two things seen updating your HPA are both the Ansible operator. The Swagger-Codegen label is an artifact of the Python lib used to generate the Kubernetes YAML in the Ansible operator.
  2. reconcilePeriod does a lot less than you think it does. It doesn't guarantee a minimum time between reconciliations; it's a maximum time between reconciliations if no watch events are received. Its primary purpose is for cases where some state of the objects isn't contained within the main etcd store, for example when the operator interfaces with an external system. It also doesn't prevent reconciliations triggered by changes in dependent resources - that would be watchDependentResources (see the sketch below). I agree the docs on this could be a bit clearer; we'll try to make them better.
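
To illustrate how the two settings relate, here is a minimal watches.yaml sketch (the group, kind, and playbook values are placeholders, not this project's actual configuration):

- version: v1beta1
  group: example.com
  kind: Foo
  playbook: /opt/ansible/playbooks/foo.yml
  # Maximum time between reconciliations when no watch events arrive;
  # it does not rate-limit reconciliations triggered by events.
  reconcilePeriod: 1h
  # Whether changes to dependent (owned) resources trigger a reconcile;
  # defaults to true.
  watchDependentResources: true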

Hopefully this clears things up, but I don't think any of the behavior you're seeing is unexpected.

estroz added the triage/needs-information label Aug 17, 2020
estroz added this to the Backlog milestone Aug 17, 2020

faust64 commented Aug 18, 2020

The two things seen updating your HPA are both the ansible operator

I think you are wrong here.

I'm not questioning the presence of the Swagger-Codegen entry. What I'm pointing out is that the kube-controller-manager entry gets refreshed whenever the HPA status is updated, with no relation to Ansible loops - quite obviously by the Kubernetes controller manager itself.

reconcilePeriod does a lot less than you think it does... max time between reconciliations if no watch events are received

Doc is clear enough.

I don't think any of the behavior you're seeing is unexpected.

Sure. We can argue there's no bug in the implementation, if we stick to what the doc says.

My point remains: the Ansible operator should not obsess over HPAs.
Checking managedFields, we can see that kube-controller-manager authored those last changes, so there should be a way to prevent Ansible from re-applying a playbook.

watchDependentResources is useless to me, at least until I can filter on object kinds.
I have over 30 CRD kinds (about 60 test CRs in my lab), with playbooks running 30+ minutes on 3-12h loops ( https://gitlab.com/synacksynack/opsperator/docker-operator/ ).
I definitely want my playbooks to start right away when the operator notices something changed; I don't want to "check tomorrow".

On the other hand, I can't have all my playbooks constantly running. I've been deploying Tekton, KubeVirt, Prometheus, ... using StatefulSets, DaemonSets, Deployments, DeploymentConfigs, Ingresses, Routes, ... and had never seen this until dealing with HPAs.

I can't really ignore this: CPU usage is a mess, and there's no global max-workers count. If you set CPU limits on the operator, you eventually see the k8s plugin timing out and your playbooks randomly failing.
If there's any chance to contain CPU usage by not running playbooks when there's no reason to, that's something we should go for.

asmacdo added the language/ansible label Aug 24, 2020
kensipe removed this from the Backlog milestone Aug 24, 2020

asmacdo commented Aug 24, 2020

Have you tried the blacklist feature? https://v0-19-x.sdk.operatorframework.io/docs/ansible/reference/watches/
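
For context, the reference linked above declares a blacklist per watch entry, as a list of group/version/kind triples the operator should stop watching (a sketch with placeholder group, kind, and playbook values):

- version: v1beta1
  group: example.com
  kind: Foo
  playbook: /opt/ansible/playbooks/foo.yml
  blacklist:
    - group: apps
      version: v1
      kind: Deployment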

joelanford (Member) commented:

@faust64 The ansible operator sets up watches on all dependent resources that are created by the operator. This is intentional so that when those objects change, we can trigger a re-reconcile. If I'm understanding this correctly, what I think you might want is a way to tell the ansible operator, "When an event arrives that was triggered by a change made by kube-controller-manager, ignore that event." Or maybe "When an event arrives where only specific fields changed, ignore that event".

Am I understanding that correctly?

Can you tell what change the kube-controller-manager is making to the HPA that is causing an event to occur?


faust64 commented Aug 24, 2020

@joelanford this is correct.
I'm not sure of the details; I'd assume that's just the HPA controller updating the object status (especially .status.currentCPUUtilizationPercentage).
Though there's no easy way to see the content of the patch, I know the .spec part still matches my original configuration.

I think the blacklist feature @asmacdo mentioned might just do it. I'll try it out, thanks!


faust64 commented Aug 25, 2020

I don't think the blacklist feature works... or I'm doing it wrong.

After patching my watches.yaml, adding the following block to all my CRs:

- blacklist:
  - group: autoscaling.k8s.io
    kind: HorizontalPodAutoscaler
    version: autoscaling/v1
  - group: autoscaling.k8s.io
    kind: HorizontalPodAutoscaler
    version: autoscaling/v2beta1
  - group: autoscaling.k8s.io
    kind: VerticalPodAutoscaler
    version: autoscaling/v1beta1
  - group: autoscaling.k8s.io
    kind: VerticalPodAutoscaler
    version: autoscaling/v1
  - group: policy
    version: policy/v1beta1
    kind: PodDisruptionBudget
[...]

Then, re-enabling HPA configuration in my operator, I'm back to those playbooks constantly restarting:

$ ls -ltr */stdout
-rw------- 1 ansible-operator root 518998 Aug 25 06:55 3134160998047589272/stdout
-rw------- 1 ansible-operator root 518786 Aug 25 07:06 latest/stdout
-rw------- 1 ansible-operator root 518786 Aug 25 07:06 5104404970293128812/stdout
-rw------- 1 ansible-operator root 193234 Aug 25 07:08 1768783381183627017/stdout
$ tail -3 latest/stdout
PLAY RECAP *********************************************************************
localhost                  : ok=129  changed=0    unreachable=0    failed=0    skipped=325  rescued=0    ignored=0  


faust64 commented Aug 29, 2020

Alright, it works fine with:

- blacklist:
  - group: autoscaling
    kind: HorizontalPodAutoscaler
    version: v1
  - group: autoscaling
    kind: HorizontalPodAutoscaler
    version: v2beta1
  - group: autoscaling.k8s.io
    kind: VerticalPodAutoscaler
    version: v1beta1
  - group: autoscaling.k8s.io
    kind: VerticalPodAutoscaler
    version: v1
  - group: policy
    kind: PodDisruptionBudget
    version: v1beta1

My bad: the group and version values in my first attempt were wrong (for HPAs the group is autoscaling, not autoscaling.k8s.io, and the version is a bare v1 or v2beta1, without the group prefix).

It's not perfect, as manual deletions won't trigger a new reconcile, but in my case that's good enough.
Thanks!
