Do not make a file description blocking while it is still in use #151
Conversation
@marmarek can you test this?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #151      +/-   ##
==========================================
- Coverage   77.64%   77.54%   -0.11%
==========================================
  Files          54       54
  Lines        9412     9435      +23
==========================================
+ Hits         7308     7316       +8
- Misses       2104     2119      +15

☔ View full report in Codecov by Sentry.
It doesn't help.
OpenQA test summary
Complete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.2&build=2024043003-4.2&flavor=pull-requests

New failures, excluding unstable
Compared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.2&build=2024041501-4.2&flavor=update
Failed tests: 36 failures

Fixed failures
Compared to: https://openqa.qubes-os.org/tests/97064#dependencies
19 fixed

Unstable tests
Force-pushed from 66a82e8 to d1f5252
Force-pushed from c8a4d61 to af79fb5
Force-pushed from af79fb5 to 9a26a45
To avoid breaking on musl libc and to ensure correct behavior if
If the file descriptor numbers were bad, a dup2() could clobber a file descriptor that gets used later.
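A minimal sketch of the hazard described above, using a hypothetical setup_stdio() helper (not the project's code): if the incoming descriptor numbers are not validated, a blind dup2() can silently close a descriptor that is still needed.

```c
#include <unistd.h>

static void setup_stdio(int in_fd, int out_fd)
{
    /* If out_fd happened to be 0, the dup2(in_fd, 0) below would
     * clobber it before it is used; move it out of the way first. */
    if (out_fd == 0)
        out_fd = dup(out_fd);
    dup2(in_fd, 0);   /* child stdin */
    dup2(out_fd, 1);  /* child stdout */
    /* (error handling and closing the temporary dup are omitted) */
}
```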
Force-pushed from 10df2e5 to b5a027a
This is both more correct (it avoids assuming an arbitrary maximum file descriptor value) and faster. If close_range() is not supported by the kernel or libc, fall back to the old code.
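A minimal sketch of that pattern (not the project's actual code): prefer close_range(2) and fall back to a close() loop when the kernel does not provide it. Going through syscall() avoids requiring a libc wrapper, which only appeared in glibc 2.34.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

static void close_fds_from(unsigned int lowfd)
{
#ifdef SYS_close_range
    /* close_range() is available on Linux >= 5.9. */
    if (syscall(SYS_close_range, lowfd, ~0U, 0) == 0)
        return;
    if (errno != ENOSYS)
        return;  /* real error other than "not implemented" */
#endif
    /* Fallback: assume an arbitrary maximum, as the old code did. */
    for (unsigned int fd = lowfd; fd < 1024; fd++)
        close((int)fd);
}
```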
The status pipe was closed by fix_fds(), so all of the subsequent operations on it were operating on a closed file descriptor. If exec_func() returned, the subsequent write() would operate on a file descriptor that might have been reused for another purpose, but none of the registered execution functions ever returned. Just call abort() if they do return.
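A sketch of the "never returns" contract described above; the function name and signature are illustrative, not the project's actual interface.

```c
#include <stdlib.h>

static void run_child(void (*exec_func)(const char *), const char *cmdline)
{
    exec_func(cmdline);
    /* The registered execution functions must exec() or _exit(); if one
     * ever returns, fail hard instead of writing to a status pipe
     * descriptor that may be closed or reused. */
    abort();
}
```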
There is no need for stdin_fd and stdout_fd to be distinct file descriptors, so long as the code is careful not to close a file descriptor that is still in use. This extra care allows close() to be used in preference to shutdown(), unless the same socket is being used for both stdin and stdout. close() has the advantage that it does not interfere with other users of the same file description, which allows the "was this inherited from another process?" hack to be eliminated. Instead, use the file descriptor number to determine whether blocking mode needs to be restored - it only needs to be restored for inherited file descriptors.

To avoid lots of error-prone boilerplate, use a single function to close stdin_fd or stdout_fd, and wrap it with macros that select the right one to use. These macros also pass the correct argument for shutdown(2) and set the just-closed FD to -1, eliminating a nasty potential source of bugs. The function automatically chooses between close(2) and shutdown(2) based on whether the two file descriptor arguments it is passed are identical, ensuring that shutdown(2) is only used when needed. Furthermore, the function knows (based on the file descriptor number) whether the file descriptor was inherited from another process, and therefore whether it needs to be restored to blocking mode. This also avoids making a file descriptor blocking while it is still in use.

Since the code now knows whether stdin and stdout use the same file descriptor, it can avoid setting the blocking flag on a file descriptor that is still in use, which could otherwise cause a deadlock. This works in every case except where both stdin and stdout are passed from a parent process and share a file description.

git blame points to commit 0802df7 ("Factor out process_child_io"), but I suspect the actual bug dates back as far as the introduction of socket-based services in 76697e3 ("Working socket-based qrexec").
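A minimal sketch of such a helper and its macros, under stated assumptions: the struct, the function name close_stdio(), the "descriptors 0-2 are inherited" rule, and the SHUT_RD/SHUT_WR mapping are all illustrative, not the project's actual implementation.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

struct stdio_fds {
    int stdin_fd;
    int stdout_fd;
};

static void close_stdio(struct stdio_fds *fds, int *fd_ptr, int shut_how)
{
    int fd = *fd_ptr;
    if (fd < 0)
        return;                 /* already closed */
    int shared = (fds->stdin_fd == fds->stdout_fd);
    *fd_ptr = -1;               /* poison the slot so it cannot be closed twice */
    if (shared) {
        /* The same socket carries both directions: only shut down this
         * side, since the other direction is still in use. */
        shutdown(fd, shut_how);
        return;
    }
    /* Descriptors 0-2 are assumed to be inherited from the parent;
     * restore blocking mode only for those, and only now that the
     * descriptor is no longer shared with one still in use. */
    if (fd <= 2) {
        int flags = fcntl(fd, F_GETFL);
        if (flags >= 0)
            fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
    }
    close(fd);
}

#define close_stdin(fds)  close_stdio((fds), &(fds)->stdin_fd,  SHUT_RD)
#define close_stdout(fds) close_stdio((fds), &(fds)->stdout_fd, SHUT_WR)
```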
They make sense for local processes, but for remote processes, stdin_fd and stdout_fd are backwards, and stderr_fd is -1.
Force-pushed from b5a027a to e594544
It is critical that EOF from the service gets propagated to the stdout of qrexec-client-vm or qrexec-client. Otherwise, the caller will not recognize that EOF has happened and may wait forever for data that will never arrive. In QubesOS/qubes-issues#9169, this caused curl to hang when reading a plaintext HTTP/1.0 response from tinyproxy, which relies on EOF to delimit the response.

The bug will trigger iff all of the following conditions are met:

1. The remote side must close the writing side of the connection, either by closing or shutting down a socket.
2. If the remote service is executable, it must _not_ exit.
3. qrexec-client-vm or qrexec-client must _not_ get EOF on stdin.
4. Either the remote side does not close its stdin, or the local side does not continue to write data.
5. The same file description is used for both stdin and stdout of the local qrexec-client or qrexec-client-vm process.
6. qrexec-client or qrexec-client-vm connect the stdin and stdout of the remote process to their own stdin and stdout, not the stdin and stdout of a locally spawned command.

To fix this bug, add flags to qrexec-client and qrexec-client-vm that cause the socket passed as file descriptor 0 to be used for both stdin and stdout.

Fixes: QubesOS/qubes-issues#9169
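An illustrative sketch of the propagation step (the function name is an assumption, not the new flag's implementation): when the local stdin and stdout are the same socket, EOF from the remote service has to be forwarded with shutdown(SHUT_WR) rather than close(), so the peer - e.g. curl waiting for the end of an HTTP/1.0 body - sees EOF while the socket stays readable for any remaining input.

```c
#include <sys/socket.h>

static void forward_remote_eof(int stdio_sock)
{
    /* Signal end-of-stream to whoever reads our stdout, but keep the
     * reading half of the shared socket open. */
    shutdown(stdio_sock, SHUT_WR);
}
```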
Force-pushed from e594544 to c2f197c
use_stdio_socket means that stdin and stdout FDs use the same file description, which is certainly the case for socket-based services. Before this change, it was set to false in this case. This would cause the file description to be made blocking while it is still in use, potentially causing deadlocks.
git blame points to commit 0802df7 ("Factor out process_child_io"), but I suspect the actual bug dates back as far as the introduction of socket-based services in 76697e3 ("Working socket-based qrexec").
Fixes: QubesOS/qubes-issues#9169
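For illustration only (this is not how qrexec sets use_stdio_socket): on Linux with kcmp(2) support, one way to check whether fd 0 and fd 1 refer to the same open file description is KCMP_FILE against the current process.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/kcmp.h>

/* Returns nonzero if stdin and stdout share one file description. */
static int stdio_shares_description(void)
{
    pid_t pid = getpid();
    return syscall(SYS_kcmp, pid, pid, KCMP_FILE, 0, 1) == 0;
}
```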