Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cpuset incorrectly set #143

Merged
merged 3 commits into from
Jul 10, 2020
Merged

Cpuset incorrectly set #143

merged 3 commits into from
Jul 10, 2020

Conversation

dahankzter
Copy link
Contributor

This PR sets the resources limits of all the containers, both init- and regular containers, explicitly to ensure QoS Guaranteed.
Furthermore it makes a minor validation that the cpuset and shard (--smp) settings are optimal. It will not block deployment but logs it to inform the user that there may be a problem.

Fixes: #121, #135

The validation now checks that the resource limit and requests are
equal to enforce quaranteed qos.
The operator now takes care of setting the reqests and limits properly
for all containers and init containers to ensure a QoS classification
of Guaranteed.
@dahankzter dahankzter requested a review from mmatczuk June 26, 2020 12:50
@mmatczuk
Copy link
Contributor

As a side note if one does not specify cpuset the docker container would add overprovisioned. It breaks performance and we are not overprovisioned. So please always add overprovisioned = 0.

@dahankzter
Copy link
Contributor Author

As a side note if one does not specify cpuset the docker container would add overprovisioned. It breaks performance and we are not overprovisioned. So please always add overprovisioned = 0.

Added this unconditionally.

@@ -33,16 +33,18 @@ func checkValues(c *scyllav1alpha1.Cluster) (allowed bool, msg string) {
return false, fmt.Sprintf("when using cpuset, you must use whole cpu cores, but rack %s has %dm", rack.Name, cores)
}

// Requests == Limits or only Limits must be set for QOS class guaranteed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It implies that we'd have Guaranteed QoS only if CpuSet is enabled.
Isn't it that we should always use Guaranteed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends. Do we want it to be possible to run another config or not? I don't know if it's too strict or not we should hear from the field more on this. @gnumoreno what do you think? Always enforce QoS Guaranteed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question is more if any user ever will run without cpuset? What are the consequences of not applying cpuset but going for Guaranteed QoS? I think none right? In case we get Guaranteed but don't give these cores to scylla it should still be fine.

@mmatczuk
Copy link
Contributor

I fail to see overprovisioned in this patch.

@@ -206,6 +214,18 @@ func (s *ScyllaConfig) setupEntrypoint(ctx context.Context) (*exec.Cmd, error) {
return scyllaCmd, nil
}

func (s *ScyllaConfig) validateCpuSet(ctx context.Context, cpusAllowed string, shards int) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a plain function?
Please add a comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure but I wanted access to the logger for that information log about less than optimal setup. It's not a problem to pass it in though.

@dahankzter
Copy link
Contributor Author

I fail to see overprovisioned in this patch.

I don't get it... https://github.com/scylladb/scylla-operator/compare/cpuset_incorrectly_set has the overprovisioned commit but it can't be seen here. Has something changed wrt how PRs works or...

@mmatczuk
Copy link
Contributor

Could we have a unit test to check that all limits are set correctly?

Also disabling running in overprovision mode
@dahankzter dahankzter force-pushed the cpuset_incorrectly_set branch from 07385b8 to 70fa793 Compare June 29, 2020 14:05
@dahankzter
Copy link
Contributor Author

Could we have a unit test to check that all limits are set correctly?

It depends on what you mean "set correctly". We can either log "non-optimal setup detected please consider changing...." or we can fail the deploy when the user requests cpuset since then they should know what they are doing.

@gnumoreno
Copy link
Contributor

A little background
scylladb/scylladb#3336
scylladb/scylladb#2114

FYI

@dahankzter dahankzter merged commit f4581bb into master Jul 10, 2020
@zimnx zimnx deleted the cpuset_incorrectly_set branch February 15, 2021 11:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cpuset is wrongly set
4 participants