Rollout operator does not respect grafana.com/min-time-between-zones-downscale label #198

Open
verejoel opened this issue Feb 15, 2025 · 0 comments

verejoel commented Feb 15, 2025

Perhaps I have misunderstood the meaning of this label. My expectation is that if I set it to 5m, the rollout operator will wait at least 5 minutes from the point the previous StatefulSet in the rollout group becomes ready before it begins rolling out the next StatefulSet.

I have three StatefulSets defined with the following labels and annotations (a sketch of how these sit in a full manifest, using zone B as the example, follows the listing):

# Zone A
  annotations:
    rollout-max-unavailable: "2"
  labels:
    controller.receive.thanos.io: thanos-hashring-controller
    controller.receive.thanos.io/hashring: soft-tenants
    grafana.com/min-time-between-zones-downscale: 5m
    name: thanos-ingester-zone-a
    rollout-group: ingester

# Zone B
  annotations:
    grafana.com/rollout-downscale-leader: thanos-ingester-zone-a
    rollout-max-unavailable: "2"
  labels:
    controller.receive.thanos.io: thanos-hashring-controller
    controller.receive.thanos.io/hashring: soft-tenants
    grafana.com/min-time-between-zones-downscale: 5m
    name: thanos-ingester-zone-b
    rollout-group: ingester

# Zone C
  annotations:
    grafana.com/rollout-downscale-leader: thanos-ingester-zone-b
    rollout-max-unavailable: "2"
  labels:
    controller.receive.thanos.io: thanos-hashring-controller
    controller.receive.thanos.io/hashring: soft-tenants
    grafana.com/min-time-between-zones-downscale: 5m
    name: thanos-ingester-zone-c
    rollout-group: ingester
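For context, here is a rough sketch of how I understand these fields to sit in one full manifest, using zone B; I'm assuming the labels and annotations belong on the StatefulSet's object metadata, and the replicas and serviceName values shown here are illustrative placeholders rather than copied from my cluster:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: thanos-ingester-zone-b
  annotations:
    grafana.com/rollout-downscale-leader: thanos-ingester-zone-a  # zone A leads zone B
    rollout-max-unavailable: "2"
  labels:
    controller.receive.thanos.io: thanos-hashring-controller
    controller.receive.thanos.io/hashring: soft-tenants
    grafana.com/min-time-between-zones-downscale: 5m
    name: thanos-ingester-zone-b
    rollout-group: ingester
spec:
  replicas: 2                           # placeholder; matches the two pods per zone seen below
  serviceName: thanos-ingester-zone-b   # placeholder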

However, when I trigger a rollout with this config, each zone is rolled out immediately as soon as its leader is ready:

❯ kt rollout restart sts --selector=app.kubernetes.io/component=ingester

# 47 seconds later - zone A is fully ready, zone B is immediately torn down
❯ kt get pods | grep ingester
thanos-ingester-zone-a-0                      1/1     Running       0          47s
thanos-ingester-zone-a-1                      1/1     Running       0          47s
thanos-ingester-zone-b-0                      1/1     Terminating   0          11m
thanos-ingester-zone-b-1                      1/1     Terminating   0          11m
thanos-ingester-zone-c-0                      1/1     Running       0          11m
thanos-ingester-zone-c-1                      1/1     Running       0          11m

# same for zone C
❯ kt get pods | grep ingester
thanos-ingester-zone-a-0                      1/1     Running       0          81s
thanos-ingester-zone-a-1                      1/1     Running       0          81s
thanos-ingester-zone-b-0                      1/1     Running       0          26s
thanos-ingester-zone-b-1                      1/1     Running       0          26s
thanos-ingester-zone-c-0                      0/1     Running       0          1s
thanos-ingester-zone-c-1                      1/1     Terminating   0          11m

Is this a bug? A wrong expectation? Or am I misconfiguring something?
