Propagate container's mountpoint to the host #632
Will #609 help if we allow containers to share namespaces with each other? The referenced PR implements the pid namespace, but I believe that the mount namespace is implementable with some small changes.
@dqminh I believe pid namespace is a stretch for the use case here. For example, if the container joins the host's namespace ( |
@rootfs ahh, so sorry for not reading the attached use case clearly. So the use case is about containerizing kubelet, i.e., kubelet can create and mount a directory on the host, and instruct docker to use that directory as a docker volume? If so, yes, I doubt that sharing the mount namespace is going to help.
@dqminh you are right, one use case is for a containerized kubelet to create and mount a directory on the host. But some filesystems (e.g. glusterfs and cephfs) invoke mount helpers; if these helpers are not installed on the host, the containerized kubelet cannot make the mount point show up in the host namespace. Sharing the mount namespace solves this problem: the container installs the mount helpers, binds the host's rootfs, and mounts filesystems inside a shared rootfs.
This is similar to #623 with the difference being SLAVE/SHARED. Maybe it makes sense to have a single config field and set it private/slave/shared as required. |
@mrunalp I agree. Although, I think you would want this at docker run time rather than when setting up the daemon.
@rhatdan Do you mean as a flag to docker run? |
love the flag idea |
Yes we need to do something like: docker run --rootmount=shared fedora And set the default: docker -d --rootmount=shared |
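To make the flag idea concrete, here is a minimal Go sketch of how a mode string such as the proposed `--rootmount` value might be translated into mount(2) propagation flags. The function name `propagationFlags` is hypothetical and is not taken from libcontainer or docker; only the `syscall` constants and `syscall.Mount` are real.

```go
package main

import (
	"fmt"
	"syscall"
)

// propagationFlags maps a propagation mode name to the mount(2) flags that
// would apply it recursively. The mode names mirror the values proposed in
// this thread; the function itself is a hypothetical illustration.
func propagationFlags(mode string) (uintptr, error) {
	switch mode {
	case "shared":
		return syscall.MS_SHARED | syscall.MS_REC, nil
	case "slave":
		return syscall.MS_SLAVE | syscall.MS_REC, nil
	case "private":
		return syscall.MS_PRIVATE | syscall.MS_REC, nil
	default:
		return 0, fmt.Errorf("unknown propagation mode %q", mode)
	}
}

func main() {
	for _, mode := range []string{"shared", "slave", "private", "bogus"} {
		flags, err := propagationFlags(mode)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-8s -> 0x%x\n", mode, flags)
		// Applying the flags to the rootfs needs root privileges, roughly:
		//   syscall.Mount("", "/", "", flags, "")
	}
}
```

The sketch only computes flags, so it runs unprivileged; the actual remount is left as a comment because it changes the propagation of every mount under the target.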
@rhatdan Yes, that should work. It needs to be proposed to docker. |
When containers are launched with "shared", I think any mount done by container will be visible only to docker daemon and not on the host. So this will still work if you want to bind mount that mount point to other containers (as docker daemon sees it). But any utilities on the host still can't see it. |
Yes, that is true since the daemon is started in its own mount namespace. The utilities would have to join that mount namespace to see the mounts.
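Whether a host utility and the daemon actually live in different mount namespaces can be checked by comparing the `/proc/<pid>/ns/mnt` links. The following Go sketch does that; the helper names `mountNS` and `sameMountNS` are made up for illustration, and reading another process's link may require privileges.

```go
package main

import (
	"fmt"
	"os"
)

// mountNS returns the mount namespace identifier of a process, for example
// "mnt:[4026531840]", by reading the /proc/<pid>/ns/mnt symlink.
// pid may be a decimal PID or the literal "self".
func mountNS(pid string) (string, error) {
	return os.Readlink("/proc/" + pid + "/ns/mnt")
}

// sameMountNS reports whether two processes share a mount namespace.
func sameMountNS(a, b string) (bool, error) {
	nsA, err := mountNS(a)
	if err != nil {
		return false, err
	}
	nsB, err := mountNS(b)
	if err != nil {
		return false, err
	}
	return nsA == nsB, nil
}

func main() {
	// Compare this process with PID 1. Substitute the daemon's PID to
	// check whether a host tool can see the daemon's mounts.
	same, err := sameMountNS("self", "1")
	if err != nil {
		fmt.Println("cannot compare namespaces:", err)
		return
	}
	fmt.Println("same mount namespace as PID 1:", same)
}
```

If the identifiers differ, a tool would need something like nsenter to join the other namespace before the mounts become visible to it.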
I think we need to take a step back and explain what outcomes you expect in certain situations. Examples:
If we can figure out the expected outcomes, it's easier to find a solution. |
Currently a container has to nsenter the host's mount namespace to mount a filesystem and share it with other containers. This approach doesn't work if the filesystem mount calls a helper utility (/sbin/mount.XXX). This commit provides a new flag and makes rootfs sharable. Signed-off-by: Huamin Chen <[email protected]>
Following your examples, here are my views, based on my understanding of [1]:
Reference |
Yes, I agree with this, and we can ignore the "Daemon" problem for now. Let's get libcontainer to work correctly and allow us to run containers in different modes depending on the user's goal. If we go with use cases:
Shared: On a projectatomic host, I want to install an SPC container that contains the gluster or cephfs userspace. I want to use this container userspace to mount file systems that can be used on the host and by other containers.
Slave: I want to run a container with a volume mount (bind) from the host, on top of which the admin can later mount file systems, and these mounts can be seen inside the container. A standard use case of this would be autofs.
Private: As a user, I want to be able to run a container that has a volume mount. I do not want any mount changes on this volume to be seen within the container.
I would argue that Private is the least likely to be requested. |
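The three use cases above come down to which direction new mounts need to travel. As a rough sketch, assuming standard Linux shared-subtree semantics, the choice could be expressed in Go like this; `pickMode` and its parameter names are hypothetical and only illustrate the reasoning.

```go
package main

import "fmt"

// pickMode chooses a propagation mode from the direction in which new mounts
// must be visible. "Parent" here means the mount namespace the container's
// mounts were cloned from (in practice the docker daemon's, which may differ
// from the host's).
func pickMode(containerToParent, parentToContainer bool) string {
	switch {
	case containerToParent:
		// Only shared propagation sends container mounts outward; it also
		// receives mounts made by the parent.
		return "shared"
	case parentToContainer:
		// Slave receives mounts from the parent but keeps its own private.
		return "slave"
	default:
		// Private neither sends nor receives mount events.
		return "private"
	}
}

func main() {
	fmt.Println("gluster/cephfs SPC container:", pickMode(true, false))
	fmt.Println("autofs volume from the host: ", pickMode(false, true))
	fmt.Println("fully isolated volume:       ", pickMode(false, false))
}
```

Note that there is no mode that propagates outward only, which is why the first case returns shared regardless of the second argument.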
Looking forward to this feature, as I'd like easier Docker filesystem mounting too. It looks like this isn't the home of libcontainer anymore though, now that there's opencontainers/runc. @rootfs Perhaps you should submit a PR over there now? I'd hate to see this feature abandoned because it got lost in the move. |
@tjdett sure, will submit to runc soon. thank you for the information. |
Just want to note that the specific use case we have for running the kubelet in a container is currently implemented as:
So, the current solution we have for this would work with the slave propagation mode. However, as @rootfs has pointed out, there are use-cases which we currently don't support for the containerized kubelet such as gluster, cephfs, etc, which would depend on the mount helpers / daemons being present on the host system, and so aren't really appropriate to go out to the host for. For these use-cases, we would require the shared mode in order to run the mount helpers / daemons / etc inside the container and eliminate dependencies on the host's setup. Personally I think this should be a flag on the bind-mount spec itself since there will be different requirements probably for different volumes. I could imagine:
as a possible syntax. I think most admins would probably prefer to use private propagation modes wherever possible (but that is just my gut feeling). |
I am fine with this, although I feel that a single flag for docker run would be fine in almost all cases. I would argue that most admins would expect SLAVE, and that most admins and developers would have no idea what we are talking about and would have a hard time understanding what is going on. Admins would expect that if a volume is mounted into a container and a filesystem is later mounted on top of that directory, the new mount point would show up inside the container. That is our experience with using mount namespaces all the way back to RHEL5. |
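For anyone unsure which propagation mode a mount actually has, the kernel exposes it in the optional fields of `/proc/<pid>/mountinfo` (`shared:N`, `master:N`, `unbindable`). Below is a small Go sketch that classifies one mountinfo line; the function name `propagationOf` and the sample lines are illustrative assumptions, not output from a real system.

```go
package main

import (
	"fmt"
	"strings"
)

// propagationOf inspects one line of /proc/<pid>/mountinfo and returns the
// propagation type encoded in its optional fields. The optional fields start
// at index 6 and are terminated by a single "-" separator.
func propagationOf(line string) (string, error) {
	fields := strings.Fields(line)
	sep := -1
	for i := 6; i < len(fields); i++ {
		if fields[i] == "-" {
			sep = i
			break
		}
	}
	if sep < 0 {
		return "", fmt.Errorf("malformed mountinfo line: %q", line)
	}
	var shared, slave, unbindable bool
	for _, f := range fields[6:sep] {
		switch {
		case strings.HasPrefix(f, "shared:"):
			shared = true
		case strings.HasPrefix(f, "master:"):
			slave = true
		case f == "unbindable":
			unbindable = true
		}
	}
	switch {
	case shared && slave:
		return "shared+slave", nil
	case shared:
		return "shared", nil
	case slave:
		return "slave", nil
	case unbindable:
		return "unbindable", nil
	default:
		return "private", nil
	}
}

func main() {
	// Made-up sample lines in mountinfo format.
	samples := []string{
		"22 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw",
		"36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw",
		"23 22 0:5 / /proc rw,nosuid - proc proc rw",
	}
	for _, line := range samples {
		mode, err := propagationOf(line)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-8s %s\n", mode, strings.Fields(line)[4])
	}
}
```

Running the same function over the container's and the host's mountinfo is one way to confirm that a volume is a slave of the host mount, which is the behaviour described above.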
Was ported to runc as opencontainers/runc#77 |
Currently a container has to nsenter the host's mount namespace to mount a filesystem and share it with other containers. This approach doesn't work if the filesystem mount calls a helper utility (/sbin/mount.XXX). This limitation makes a containerized kubelet unable to mount certain filesystems.
This commit provides a new flag to make rootfs sharable. Since moving a shared rootfs is semantically confusing for pivot_root(2) and MS_MOVE, a new function changeRoot() is provided to switch rootfs to the new destination.
Signed-off-by: Huamin Chen [email protected]