Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rootfs: make pivot_root not use a temporary directory #1125

Merged
merged 1 commit into from
Oct 24, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 0 additions & 5 deletions libcontainer/configs/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -85,11 +85,6 @@ type Config struct {
// that the parent process dies.
ParentDeathSignal int `json:"parent_death_signal"`

// PivotDir allows a custom directory inside the container's root filesystem to be used as pivot, when NoPivotRoot is not set.
// When a custom PivotDir not set, a temporary dir inside the root filesystem will be used. The pivot dir needs to be writeable.
// This is required when using read only root filesystems. In these cases, a read/writeable path can be (bind) mounted somewhere inside the root filesystem to act as pivot.
PivotDir string `json:"pivot_dir"`

// Path to a directory containing the container's root filesystem.
Rootfs string `json:"rootfs"`

Expand Down
74 changes: 46 additions & 28 deletions libcontainer/rootfs_linux.go
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,7 @@ func setupRootfs(config *configs.Config, console *linuxConsole, pipe io.ReadWrit
if config.NoPivotRoot {
err = msMoveRoot(config.Rootfs)
} else {
err = pivotRoot(config.Rootfs, config.PivotDir)
err = pivotRoot(config.Rootfs)
}
if err != nil {
return newSystemErrorWithCause(err, "jailing process inside rootfs")
Expand Down Expand Up @@ -563,48 +563,66 @@ func setupPtmx(config *configs.Config, console *linuxConsole) error {
return nil
}

func pivotRoot(rootfs, pivotBaseDir string) (err error) {
if pivotBaseDir == "" {
pivotBaseDir = "/"
}
tmpDir := filepath.Join(rootfs, pivotBaseDir)
if err := os.MkdirAll(tmpDir, 0755); err != nil {
return fmt.Errorf("can't create tmp dir %s, error %v", tmpDir, err)
// pivotRoot will call pivot_root such that rootfs becomes the new root
// filesystem, and everything else is cleaned up.
func pivotRoot(rootfs string) error {
// While the documentation may claim otherwise, pivot_root(".", ".") is
// actually valid. What this results in is / being the new root but
// /proc/self/cwd being the old root. Since we can play around with the cwd
// with pivot_root this allows us to pivot without creating directories in
// the rootfs. Shout-outs to the LXC developers for giving us this idea.

oldroot, err := syscall.Open("/", syscall.O_DIRECTORY|syscall.O_RDONLY, 0)
if err != nil {
return err
}
pivotDir, err := ioutil.TempDir(tmpDir, ".pivot_root")
defer syscall.Close(oldroot)

newroot, err := syscall.Open(rootfs, syscall.O_DIRECTORY|syscall.O_RDONLY, 0)
if err != nil {
return fmt.Errorf("can't create pivot_root dir %s, error %v", pivotDir, err)
return err
}
defer func() {
errVal := os.Remove(pivotDir)
if err == nil {
err = errVal
}
}()
if err := syscall.PivotRoot(rootfs, pivotDir); err != nil {
defer syscall.Close(newroot)

// Change to the new root so that the pivot_root actually acts on it.
if err := syscall.Fchdir(newroot); err != nil {
return err
}

if err := syscall.PivotRoot(".", "."); err != nil {
// Make the parent mount private
if err := rootfsParentMountPrivate(rootfs); err != nil {
if err := rootfsParentMountPrivate("."); err != nil {
return err
}

// Try again
if err := syscall.PivotRoot(rootfs, pivotDir); err != nil {
if err := syscall.PivotRoot(".", "."); err != nil {
return fmt.Errorf("pivot_root %s", err)
}
}
if err := syscall.Chdir("/"); err != nil {
return fmt.Errorf("chdir / %s", err)

// Currently our "." is oldroot (according to the current kernel code).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what does this comment mean. I suspect you are saying that cwd of process has not changed and oldroot is continuing to be cwd of the process?

If yes, old root is "/" and that's was not necessarily cwd of the process at the time of call to pivot_root().

May be say something like. pivot_root() does not guarantee what will happen to cwd of the calling process. It is a possibility that pivot_root() changed cwd to new root and in that case following umount() will fail. So to be safe, change dir to oldroot.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you're right (I thought about it some more). pivot_root(a, b) will make a the new root and mount the old root at b. What we're doing here is unmounting the old root, because the old root is also on b. Since b is . it matters whether or not pivot_root has changed your cwd. It turns out that this does change your cwd in the current Linux implementation, but I don't want to depend on that (the mount logic is very dodgy in the kernel).

// However, purely for safety, we will fchdir(oldroot) since there isn't
// really any guarantee from the kernel what /proc/self/cwd will be after a
// pivot_root(2).

if err := syscall.Fchdir(oldroot); err != nil {
return err
}
// path to pivot dir now changed, update
pivotDir = filepath.Join(pivotBaseDir, filepath.Base(pivotDir))

// Make pivotDir rprivate to make sure any of the unmounts don't
// propagate to parent.
if err := syscall.Mount("", pivotDir, "", syscall.MS_PRIVATE|syscall.MS_REC, ""); err != nil {
// Make oldroot rprivate to make sure our unmounts don't propogate to the
// host (and thus bork the machine).
if err := syscall.Mount("", ".", "", syscall.MS_PRIVATE|syscall.MS_REC, ""); err != nil {
return err
}
// Preform the unmount. MNT_DETACH allows us to unmount /proc/self/cwd.
if err := syscall.Unmount(".", syscall.MNT_DETACH); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to mark all old mounts as private before umounting them
syscall.Mount("", ".", "", syscall.MS_PRIVATE|syscall.MS_REC, "")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@avagin Yes, we were making the old mounts private before but it got removed in this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, whoops. I'll fix this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've applied this fix. PTAL.

return err
}

if err := syscall.Unmount(pivotDir, syscall.MNT_DETACH); err != nil {
return fmt.Errorf("unmount pivot_root dir %s", err)
// Switch back to our shiny new root.
if err := syscall.Chdir("/"); err != nil {
return fmt.Errorf("chdir / %s", err)
}
return nil
}
Expand Down