Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runc (create and start) or run detached (-d) tty issues #1218

Open
ronin13 opened this issue Dec 7, 2016 · 6 comments
Open

runc (create and start) or run detached (-d) tty issues #1218

ronin13 opened this issue Dec 7, 2016 · 6 comments

Comments

@ronin13
Copy link

ronin13 commented Dec 7, 2016

Even with 'terminal:false' (config.json), , runc checkpoint is failing for me (as in #87) with 'runc create...; runc start' or 'runc start -d'

(00.007329) Dumping opened files (pid: 17727)
(00.007330) ----------------------------------------
(00.007350) 17727 fdinfo 0: pos:                0 flags:           100002/0
(00.007359) tty: Dumping tty 10 with id 0x4
(00.007362) Error (criu/files-reg.c:1121): Can't lookup mount=25 for fd=0 path=/dev/pts/0
(00.007365) ----------------------------------------
(00.007371) Error (criu/cr-dump.c:1336): Dump files (pid: 17727) failed with -1

However, it works fine with 'runc start ' (without detach). Also noticed that setupIO in utils_linux.go has 'r.detach || r.create' passed to it. (should 'runc start' attach again?)

Moreover, with runc start -d, the stdout of the program (date) was being written to the terminal, even though it was detached.
ls -l /proc/$pid/fd showed the file descriptors (0,1,2) pointing to /dev/pts/XX.

With docker detached mode, they all point to pipe (to be attached again with attach).

total 0
lr-x------ 1 root root 64 Dec  7 21:41 0 -> 'pipe:[168045]'
l-wx------ 1 root root 64 Dec  7 21:41 1 -> 'pipe:[168046]'
l-wx------ 1 root root 64 Dec  7 21:41 2 -> 'pipe:[168047]'

So, I tested with

 diff --git a/utils_linux.go b/utils_linux.go
 index 94c520f..984b960 100644
 --- a/utils_linux.go
 +++ b/utils_linux.go
 @@ -95,18 +95,16 @@ func newProcess(p specs.Process) (*libcontainer.Process, error) {
  }

  func dupStdio(process *libcontainer.Process, rootuid, rootgid int) error {
 -       process.Stdin = os.Stdin
 -       process.Stdout = os.Stdout
 -       process.Stderr = os.Stderr
 -       for _, fd := range []uintptr{
 -               os.Stdin.Fd(),
 -               os.Stdout.Fd(),
 -               os.Stderr.Fd(),
 -       } {
 -               if err := syscall.Fchown(int(fd), rootuid, rootgid); err != nil {
 -                       return err
 -               }
 +
 +       i, err := process.InitializeIO(rootuid, rootgid)
 +       if err != nil {
 +               return err
         }
 +
 +       process.Stdin = i.Stdout
 +       process.Stdout = i.Stdin
 +       process.Stderr = i.Stdin
 +
         return nil
  }

Now, I see that runc checkpoint works fine. Also, 'run -d' doesn't write to terminal either. And, fds look like:

sudo ls -l /proc/24641/fd
total 0
lr-x------ 1 root root 64 Dec  7 23:00 0 -> 'pipe:[296288]'
l-wx------ 1 root root 64 Dec  7 23:00 1 -> 'pipe:[296287]'
l-wx------ 1 root root 64 Dec  7 23:00 2 -> 'pipe:[296287]'

The HEAD I tested with is, 47ea5c7 on master.

@ronin13
Copy link
Author

ronin13 commented Dec 7, 2016

I tested with lates HEAD with console-rewrite fixes and I can repeat the issue.

@kunalkushwaha
Copy link

This issue is same as #1202.

@cyphar
Copy link
Member

cyphar commented Dec 8, 2016

@ronin13 Unfortunately your fix won't work with detached containers. The reason why -d dups the hosts' stdio over the container process is so that runC can exit. If you use InitialiseIO and then exit, the other end of the pipes will be closed and the container process will get ECONNRESET when attempting to write or read to stdio.

To be clear terminal = "false" containers are not really well-defined by the spec, so we have a lot of "creative freedom" with what we can do here. However, I've honestly never liked the way we handle this in runC -- the current way this is done is actually exposing the host's terminal file descriptors which can be quite bad (there are some terminal-related attacks that you can do).

Your fix is (unfortunately) not the answer. According to @avagin this is a CRIU issue (which I err on the side of agreeing with), but I also feel like we should be doing something less insane with terminal: false and -d containers (like just duping /dev/null over their descriptors and calling it a day). Programs like sh won't like it, but anything is less than safe IMO.

@ronin13
Copy link
Author

ronin13 commented Dec 8, 2016

@cyphar Thanks for the detailed review.

With the 'sh -c ""while :; do date; sleep 3; done"' in (config.json) I was able to verify that process is able to write with https://gist.github.com/ronin13/1ef838ced63e562b458daa05320be7b0 with the process started as sudo runc --debug --log /tmp/xyz run -d test-container. Shouldn't the process die if it hits ECONNRESET (or EPIPE).

Regarding

(like just duping /dev/null over their descriptors and calling it a day)

I have seen detach (https://sourceforge.net/p/detach/wiki/Home/) doing this for stdin when running like
detach sh -c 'while :; do date; sleep 3; done .
with its descriptors looking like:

>>sudo ls -l /proc/13156/fd
total 0
lr-x------ 1 raghu raghu 64 Dec  8 12:00 0 -> /dev/null
lrwx------ 1 raghu raghu 64 Dec  8 12:00 1 -> /dev/pts/13
lrwx------ 1 raghu raghu 64 Dec  8 12:00 2 -> /dev/pts/13

However, since stdout still points to a tty, it writes to stdout even in detached case much like runc. So, probably, even the stdout needs to be null-ed as you mentioned above.

And regarding the CRIU issue, a consequence of this is that with detached run it writes to tty where it is invoked (so date is repeatedly printed to terminal where 'run -d' is invoked from), which may not be nice, and so, I feel has a scope beyond CRIU.

With docker, I have seen it writing to anon pipes (like in my first comment), and then with docker attach it attaches client's (0,1,2) to those pipes, and on detach, they go back to writing to those pipes.

In any case, it would also be good to document this since people may stumble with runc checkpoint/restore failing even with 'terminal:false' while running in detached mode.

@cyphar
Copy link
Member

cyphar commented Dec 8, 2016

@ronin13

Shouldn't the process die if it hits ECONNRESET (or EPIPE).

IIRC pipe(2)s have internal buffering (so you won't get an EPIPE or other error when writing to it on the first try). Try writing a lot of data and see how long the container lasts.

However, since stdout still points to a tty, it writes to stdout even in detached case much like runc.

Hmm I actually assumed that sh would just exit if stdin is /dev/null (bash has a history of fun "helpful checks" like that). But maybe switching to /dev/null wouldn't be too bad after all.

I feel has a scope beyond CRIU.

Yeah, it's a runC issue. We shouldn't be passing host terminal descriptors to containers. It's just not safe IMO. There's been a lot of "fun" discussion about that in opencontainers/runtime-spec#513 (specifically look at opencontainers/runtime-spec#513 (comment)).

With docker, I have seen it writing to anon pipes (like in my first comment), and then with docker attach it attaches client's (0,1,2) to those pipes, and on detach, they go back to writing to those pipes.

That's because of how Docker/containerd work. In effect, containerd-shim sets it's own stdio to be pipes that are then relayed to Docker. Trust me, it's not very pretty code (especially now that containerd supports being restarted). The point is that it only works in Docker's case because of how they've architected it (as a daemon that handles logging of containers). runC doesn't have the same design and thus has different concerns.

In any case, it would also be good to document this

To be honest, I'd prefer fixing it as opposed to documenting behaviour that is IMO quite dangerous.

@vincent-163
Copy link

vincent-163 commented Dec 17, 2019

Hello. I'm having exactly the same problem after 3 years. I tried to workaround it by specifying 0<>/dev/null 1<>/dev/null 2<>/dev/null while running runc start. The strange thing is that even though runc errors were suppressed, the container process still inherited the standard streams. ls -l /proc/1/fd in the container shows /dev/pts/2 for fd 0,1,2. A bit curious on why and how that happened, but anyway I managed to workaround by replacing the fds inside the container with a shell script, like this:

exec 0<>/dev/null 1<>/dev/null 2<>/dev/null

Hope this helps someone. BTW, why not replace all standard streams with /dev/null to the container init process? This makes it impossible to capture any logs, but I expect this to be at least better than not being able to detach at all.
EDIT: I found that this is because the fds were inherited from runc create command rather than runc start command. Passing 0<>/dev/null 1<>/dev/null 2<>/dev/null to runc create command solves the problem without additional shell script.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants