Add recording rules to calculate Cortex scaling #278

Merged Mar 19, 2021 (1 commit)

tomwilkie (Contributor) commented Mar 18, 2021

Signed-off-by: Tom Wilkie [email protected]

What this PR does:

Extracts the queries from the scaling dashboard so we can do:

sort_desc(cluster_namespace_deployment_reason:required_replicas:count{namespace=~"cortex.*"} > ignoring(reason) group_left cluster_namespace_deployment:actual_replicas:count)

And eventually alert on this.
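
For readers less familiar with many-to-one vector matching, here is a rough Python sketch of what the query above is expected to return. It is only an illustration of the matching semantics, not code from this PR: the function name, the sample data, and the `reason` label values are all made up.

```python
import re

def under_provisioned(required, actual, namespace_pattern="cortex.*"):
    """Mimic: sort_desc(required{namespace=~pattern} > ignoring(reason) group_left actual).

    required: {(cluster, namespace, deployment, reason): replicas}
    actual:   {(cluster, namespace, deployment): replicas}
    """
    rows = []
    for (cluster, namespace, deployment, reason), want in required.items():
        # PromQL regex matchers are fully anchored, hence fullmatch.
        if not re.fullmatch(namespace_pattern, namespace):
            continue
        # ignoring(reason): match on every label except `reason`.
        have = actual.get((cluster, namespace, deployment))
        if have is None:
            continue  # no series on the right-hand side, so nothing is returned
        # The comparison acts as a filter and keeps the left-hand value.
        if want > have:
            rows.append(((cluster, namespace, deployment, reason), want))
    # sort_desc orders by sample value, largest first.
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Hypothetical data: the reason values below are illustrative only.
required = {
    ("prod", "cortex-prod", "ingester", "memory_usage"): 12,
    ("prod", "cortex-prod", "ingester", "cpu_usage"): 8,
    ("prod", "cortex-prod", "distributor", "sample_rate"): 5,
    ("prod", "default", "ingester", "cpu_usage"): 50,
}
actual = {
    ("prod", "cortex-prod", "ingester"): 10,
    ("prod", "cortex-prod", "distributor"): 6,
    ("prod", "default", "ingester"): 3,
}

result = under_provisioned(required, actual)
for labels, want in result:
    print(labels, want)
```

Because several `reason` series can map onto one `actual_replicas` series, each deployment can appear more than once in the result, one row per reason that demands more replicas than are running.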

@tomwilkie tomwilkie marked this pull request as ready for review March 18, 2021 19:42
@tomwilkie tomwilkie requested a review from a team as a code owner March 18, 2021 19:42
quantile_over_time(0.99,
  sum by (cluster, namespace) (
    cluster_namespace_job:cortex_distributor_received_samples:rate5m
  )[24h:]

owen-d (Member) commented Mar 18, 2021

Just curious, why did you choose 24h as the period here? I would have chosen a smaller one. I guess the longer the interval, the safer we are 🤔

tomwilkie (Contributor, Author)

Yeah, I wanted to make them all consistent (the CPU and memory ones are also 24h).

This should also help prevent them from flapping, I guess at the cost of them not responding to change quite as quickly.
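
The trade-off described here can be seen in a toy simulation. The sketch below is not how Prometheus evaluates the subquery; it simply applies a linearly interpolated 0.99 quantile over a trailing window to a made-up hourly series with one spike, using a hypothetical 3-sample window against a 24-sample one.

```python
def quantile(q, values):
    """Linearly interpolated quantile (assumed to resemble quantile_over_time)."""
    ordered = sorted(values)
    rank = q * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    weight = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight

def rolling_quantile(series, window, q=0.99):
    """Quantile over the trailing `window` samples at every point in the series."""
    return [
        quantile(q, series[max(0, i - window + 1): i + 1])
        for i in range(len(series))
    ]

# Made-up data: 48 hourly samples at a baseline of 100 with one spike at hour 10.
series = [100.0] * 48
series[10] = 500.0

short = rolling_quantile(series, window=3)   # hypothetical short window
long = rolling_quantile(series, window=24)   # 24 samples, standing in for 24h

for hour in (10, 13, 33, 34):
    print(hour, short[hour], long[hour])
```

With the short window the value returns to the baseline three hours after the spike, so a sizing rule built on it would move up and down with each burst. With the long window the value stays elevated until the spike leaves the window at hour 34, which is steadier but slower to come back down.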

@tomwilkie tomwilkie force-pushed the scaling-rules branch 4 times, most recently from 564a959 to 5db9c40 March 19, 2021 12:55
- Update dashboard so it only shows under-provisioned services and why
- Add sizing rules based on limits.
- Add some docs to the dashboard.

Signed-off-by: Tom Wilkie <[email protected]>
@tomwilkie tomwilkie merged commit 9b04c90 into main Mar 19, 2021
@tomwilkie tomwilkie deleted the scaling-rules branch March 19, 2021 14:04
simonswine pushed a commit to grafana/mimir that referenced this pull request Oct 18, 2021
Add recording rules to calculate Cortex scaling
simonswine pushed a commit to grafana/mimir that referenced this pull request Dec 20, 2021
Add recording rules to calculate Cortex scaling