Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix setup cgroup before prestart hook #1239

Merged
merged 1 commit into from
May 26, 2017
Merged

Conversation

moypray
Copy link
Contributor

@moypray moypray commented Dec 20, 2016

  • User Case:
    User could use prestart hook to add block devices to container. so the
    hook should have a way to set the permissions of the devices.

Just move cgroup config operation before prestart hook will work.

Signed-off-by: Wentao Zhang [email protected]

@moypray moypray mentioned this pull request Dec 20, 2016
@cyphar
Copy link
Member

cyphar commented Dec 20, 2016

I'm not convinced this is correct. The reason why cgroups were written to be applied that late is so that hooks and other runC internal processes don't hit the resource constraints for the container. For example, setting the pids.limit resource to be very small then you'll start hitting troubles if hook scripts start using a lot of resources.

@mrunalp
Copy link
Contributor

mrunalp commented Dec 20, 2016

@moypray Can't you setup the cgroups in your config.json?

@moypray
Copy link
Contributor Author

moypray commented Dec 21, 2016

@cyphar , I think the hook is not under control of container cgroup. because hook is not a child process of 'runc init' but parent runC. I think if the "runc init" meet the limit of cgroup should be ok. Is that right?

@moypray
Copy link
Contributor Author

moypray commented Dec 21, 2016

@mrunalp actually, if I use the config of container, will bring lots of troubles: such as config should be persistent and uniformity. (docker daemon will keep a cache in memory, I need to modify lots of codes to fix this, and would break the origin docker interface, It is terrible). But thanks all the same.

@hqhq
Copy link
Contributor

hqhq commented Dec 21, 2016

@cyphar hook processes are invoked by runc, they are not constrained by container resources. Or are you concerning that the manipulation of cgroup resources by hooks could be malicious?

I think it's reasonable that we allow prestart hooks to customize cgroup resources.
Design LGTM.

@cyphar
Copy link
Member

cyphar commented Dec 21, 2016

@hqhq

My concern is that a hook could do something like runc exec and I'm not sure if we should be enforcing cgroup limits on those processes. Not to mention that I don't think that hooks should be modifying stuff that was configured in config.json -- just seems like a footgun.

I think it's reasonable that we allow prestart hooks to customize cgroup resources.

In which case this patch should drop procReady entirely because it's not used for anything else IIRC.

@hqhq
Copy link
Contributor

hqhq commented Dec 21, 2016

My concern is that a hook could do something like runc exec and I'm not sure if we should be enforcing cgroup limits on those processes.

If hook wants to run processes in container, why should they not be constrained by cgroup resources? What if that a long-running forked process which can live with container processes?

Not to mention that I don't think that hooks should be modifying stuff that was configured in config.json -- just seems like a footgun.

I think the point is that hook authors should be responsible for the containers, most time they should be the same authors, like we permitted hooks to mess with container mounts.

In which case this patch should drop procReady entirely because it's not used for anything else IIRC.

This PR very likely needs to be amended, I haven't look it through, just talking about design first :)

@cyphar
Copy link
Member

cyphar commented Dec 21, 2016

@hqhq

I think the point is that hook authors should be responsible for the containers, most time they should be the same authors, like we permitted hooks to mess with container mounts.

Fair enough. I'm just worried about people deciding to use hooks as a way to avoid touching the OCI specification (like the PR author is trying to do). But ultimately some usecases are more complicated than what can be described in config.json. I'm conflicted to tell the truth, let's wait to see what the other @opencontainers/runc-maintainers think.

@moypray
Copy link
Contributor Author

moypray commented Dec 21, 2016

@cyphar , sorry for misunderstanding, I think the hook is running on host, the same with docker, runc. and customized by service provider themselves. We should trust the hook.

@moypray
Copy link
Contributor Author

moypray commented Dec 27, 2016

ping @opencontainers/runc-maintainers , How do you feel it, just talking about the design, not this PR implement, thx~

@cyphar
Copy link
Member

cyphar commented Dec 30, 2016

@moypray Most maintainers are on holidays now (including myself). I'd wait for a while before pinging people again, just because everyone is AFK at the moment. 😉

@moypray
Copy link
Contributor Author

moypray commented Jan 3, 2017

@cyphar , OK, thanks for your replay.

@hqhq
Copy link
Contributor

hqhq commented Jan 5, 2017

@cyphar Can you check how would this affect process overhead of pid cgroup? I see we already have inevitable overhead for pid cgroup, since the test is pretty loose #553 .

@moypray
Copy link
Contributor Author

moypray commented Jan 22, 2017

@opencontainers/runc-maintainers , what do you think of this proposal?

Can you check how would this affect process overhead of pid cgroup?

@hqhq I have done some testing based on my fix. And I am sure that there will not be more go routines or threads created between the two positions of 'Set cgroup' operation. So it will not affect the pid cgroup.
Using new runC with my fix, We can run docker run --pid-limits 2 busybox cat /sys/fs/cgroup/pids/pids.max successfully, if my fix affects the pids cgroup, it would run with failure.

@hqhq
Copy link
Contributor

hqhq commented Jan 22, 2017

@opencontainers/runc-maintainers PTAL, design and code both look good to me.

@3XX0
Copy link

3XX0 commented Apr 18, 2017

See also the original issue: #1044

@3XX0
Copy link

3XX0 commented May 4, 2017

@mrunalp This is what I was referring to for the GPU support in Kubernetes

@mrunalp
Copy link
Contributor

mrunalp commented May 4, 2017

This makes sense to me. @moypray can you rebase the PR and address comments?
@3XX0 Thanks for the reminder.

// Setup cgroup before prestart hook, so that the prestart hook could apply cgroup permissions.
if err := p.manager.Set(p.config.Config); err != nil {
return newSystemErrorWithCause(err, "setting cgroup config for ready process")
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT. Can we have slightly different error message text so it can be more easily determined where the error came from? Perhaps appending " in procReady" to the first message at line 339 and " in procHooks" here in the second?

@mrunalp
Copy link
Contributor

mrunalp commented May 18, 2017

@3XX0 Would you be willing to carry this over to a new rebased PR?

@hqhq
Copy link
Contributor

hqhq commented May 19, 2017

@moypray Please rebase, thanks.

@moypray
Copy link
Contributor Author

moypray commented May 19, 2017

hi @mrunalp @hqhq sorry for late replay. I will rebase it ASAP.

@moypray
Copy link
Contributor Author

moypray commented May 19, 2017

hi @mrunalp @hqhq done, Sorry again for late replay.

@hqhq
Copy link
Contributor

hqhq commented May 19, 2017

LGTM

Approved with PullApprove

* User Case:
User could use prestart hook to add block devices to container. so the
hook should have a way to set the permissions of the devices.

Just move cgroup config operation before prestart hook will work.

Signed-off-by: Wentao Zhang <[email protected]>
@3XX0
Copy link

3XX0 commented May 19, 2017

@mrunalp Sorry, saw it a little late ;)
The patch looks good to me though!

@moypray
Copy link
Contributor Author

moypray commented May 25, 2017

Ping @mrunalp

@mrunalp
Copy link
Contributor

mrunalp commented May 25, 2017

LGTM

Approved with PullApprove

@hqhq hqhq merged commit d7c264a into opencontainers:master May 26, 2017
@moypray moypray deleted the cgroup branch May 26, 2017 03:15
runcom added a commit to projectatomic/runc that referenced this pull request Jul 17, 2017
Upstream reference: opencontainers/runc#1239
Fixes: stefwalter/oci-kvm-hook#3

* User Case:
User could use prestart hook to add block devices to container. so the
hook should have a way to set the permissions of the devices.

Just move cgroup config operation before prestart hook will work.

Signed-off-by: Antonio Murdaca <[email protected]>
runcom added a commit to projectatomic/runc that referenced this pull request Jul 17, 2017
Upstream reference: opencontainers/runc#1239
Fixes: stefwalter/oci-kvm-hook#3

* User Case:
User could use prestart hook to add block devices to container. so the
hook should have a way to set the permissions of the devices.

Just move cgroup config operation before prestart hook will work.

Signed-off-by: Antonio Murdaca <[email protected]>
runcom added a commit to projectatomic/runc that referenced this pull request Jul 17, 2017
Upstream reference: opencontainers/runc#1239
Fixes: stefwalter/oci-kvm-hook#3

* User Case:
User could use prestart hook to add block devices to container. so the
hook should have a way to set the permissions of the devices.

Just move cgroup config operation before prestart hook will work.

Signed-off-by: Antonio Murdaca <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants