
Add automaxprocs. #4301

Merged
merged 2 commits into master on May 10, 2023
Conversation

@robholland (Contributor) commented May 9, 2023

This will automate the setting of GOMAXPROCS in Kubernetes/Docker environments where it is not already set as part of the deployment.

What changed?

The automaxprocs library was added to set GOMAXPROCS to match the CPU limit set on a container. This is a no-op if the GOMAXPROCS environment variable is already set, or if the process is not running in a container.
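
For reference, the library is designed to be wired in with a single blank import whose init() applies the adjustment; the following is a minimal sketch of that usage (the exact integration point used by this PR is not shown in this thread):

```go
package main

import (
	"fmt"
	"runtime"

	// Imported for its side effect: on init it sets GOMAXPROCS from the
	// container's CPU quota (cgroups). It is a no-op when the GOMAXPROCS
	// environment variable is already set or when no quota is detected
	// (e.g. when not running in a container).
	_ "go.uber.org/automaxprocs"
)

func main() {
	// runtime.GOMAXPROCS(0) reports the current value without changing it.
	fmt.Println("GOMAXPROCS =", runtime.GOMAXPROCS(0))
}
```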

Why?

Setting GOMAXPROCS to match resource limits (rather than the total core count of the node) allows Go to more efficiently use the available cores and reduces CPU throttling (eliminating it entirely if limits are set to an integer number of cores).

This issue was highlighted during benchmarking but probably affects a large number of real-world Kubernetes deployments where CPU limits are set but GOMAXPROCS is not.
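
For deployments that want the adjustment logged or reverted rather than applied silently, the library also exposes an explicit entry point, maxprocs.Set; a small sketch of that alternative (not necessarily how this PR integrates it):

```go
package main

import (
	"log"
	"runtime"

	"go.uber.org/automaxprocs/maxprocs"
)

func main() {
	// maxprocs.Set applies the CPU-quota-derived value and returns an undo
	// function; it honours an existing GOMAXPROCS environment variable.
	undo, err := maxprocs.Set(maxprocs.Logger(log.Printf))
	if err != nil {
		log.Printf("automaxprocs: %v", err)
	}
	if undo != nil {
		defer undo()
	}

	log.Printf("GOMAXPROCS=%d", runtime.GOMAXPROCS(0))
}
```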

How did you test it?

Potential risks

Is hotfix candidate?

No.

@robholland requested a review from a team as a code owner on May 9, 2023 10:07
@robholland (Contributor, Author) commented May 9, 2023

[Screenshot]

Example of correcting GOMAXPROCS on an 8-core node to match a CPU limit of 1 core. Throttling is eliminated and overall CPU usage is reduced.

Note: this was not done using this PR, as Docker builds are not automatically published for PRs. It was done by manually setting GOMAXPROCS, which has the same effect as adding this library.

@MichaelSnowden (Contributor) left a comment


> This issue was highlighted during benchmarking

Do you have any benchmark results to compare before and after this change? I see the throttling disappear from the screenshot, but it looks like this change increased average latency in the example in the library's docs. It'd be good to know how it would affect our latency profile.

@robholland (Contributor, Author)

I don't have the cluster now but I can recreate it. Which metrics would you like to compare before/after?

@MichaelSnowden (Contributor)

> I don't have the cluster now but I can recreate it. Which metrics would you like to compare before/after?

I think task processing p50 and p99 are good enough.

@robholland (Contributor, Author)

task_latency_processing?

@robholland (Contributor, Author)

[Screenshot: Soak Test - Pods - Temporal - Dashboards - Grafana]

@robholland (Contributor, Author)

The setup is an 8-core node with the CPU limit for the history pods set to 1.

@robholland merged commit 807b791 into master on May 10, 2023
@robholland deleted the rh-automaxprocs branch on May 10, 2023 07:27