Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop using qrexec-client for guest-initiated qrexec calls #9066

Closed
3 tasks done
DemiMarie opened this issue Mar 29, 2024 · 0 comments
Closed
3 tasks done

Stop using qrexec-client for guest-initiated qrexec calls #9066

DemiMarie opened this issue Mar 29, 2024 · 0 comments
Assignees
Labels
C: core P: default Priority: default. Default priority for new issues, to be replaced given sufficient information.

Comments

@DemiMarie
Copy link

How to file a helpful issue

The problem you're addressing (if any)

qrexec-daemon uses execve() to spawn qrexec-client for every VM-initiated service call. This is inefficient.

The solution you'd like

Have qrexec-daemon do the work itself.

The value to a user, and who that user might be

Users will benefit from faster qrexec calls.

Completion criteria checklist

(This section is for developer use only. Please do not modify it.)

  • Move code from qrexec-client.c into a new shared qrexec-daemon-common.c.
  • Avoid using qrexec-client for VM ⇒ VM service calls.
  • Avoid using qrexec-client for VM ⇒ dom0 service calls.
@DemiMarie DemiMarie added T: enhancement P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. labels Mar 29, 2024
@DemiMarie DemiMarie self-assigned this Mar 29, 2024
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Mar 29, 2024
This saves an execve().  It also allows qrexec-daemon to change how it
performs a VM -> VM service call without having to change qrexec-client.
During an upgrade, it is possible that qrexec-daemon is newer than
qrexec-client, causing the new qrexec-daemon to try to use a calling
convention that the old qrexec-client doesn't support.  Doing VM -> VM
calls without calling execve() means this cannot happen.  VM -> dom0
and dom0 -> VM calls still use qrexec-client, but VM -> dom0 calls are
safe from domain name reuse races as of [1], and the latter do not
involve qrexec-daemon at all.

qrexec-daemon uses atexit() hooks to clean up its listening sockets, so
it is critical that these hooks do _not_ run in the child process.
Therefore, change the functions that used exit(), err(), or errx() to
return normally or call abort().  The return value is checked by the
caller and the functions are marked __attribute__((warn_unused_result))
to ensure this.  This also fixes some (but not all) cases where a
disposable VM would not be cleaned up by qrexec-client -k.

To ensure that code running in a child process gets coverage measured in
CI, it is necessary to add calls to __gcov_flush().  Add these calls
by means of a wrapper function around _exit(), which is much less
error-prone than calling __gcov_flush() and _exit() directly.

Part of QubesOS/qubes-issues#9066

[1]: 100fbb9 ("qrexec-client: Use XID to connect to qrexec daemon when possible")
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Mar 29, 2024
It isn't needed in this case either.  This saves an execve() and makes
it easier to change the behavior of qrexec-client without breaking
existing qrexec-daemon processes.

For the same reasons as the previous commit, it is critical to avoid
calling exit() in the child process, either directly or indirectly.
Therefore, ensure that none of the code that is now run in the child
process calls exit().  There were places where this was done previously,
so this commit also fixes a bug.

Fixes: QubesOS/qubes-issues#9066
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Mar 29, 2024
It isn't needed in this case either.  This saves an execve() and makes
it easier to change the behavior of qrexec-client without breaking
existing qrexec-daemon processes.

For the same reasons as the previous commit, it is critical to avoid
calling exit() in the child process, either directly or indirectly.
Therefore, ensure that none of the code that is now run in the child
process calls exit().  There were places where this was done previously,
so this commit also fixes a bug.

Fixes: QubesOS/qubes-issues#9066
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 2, 2024
It isn't needed in this case either.  This saves an execve() and makes
it easier to change the behavior of qrexec-client without breaking
existing qrexec-daemon processes.

For the same reasons as the previous commit, it is critical to avoid
calling exit() in the child process, either directly or indirectly.
Therefore, ensure that none of the code that is now run in the child
process calls exit().  There were places where this was done previously,
so this commit also fixes a bug.

Fixes: QubesOS/qubes-issues#9066
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 2, 2024
This saves an execve().  It also allows qrexec-daemon to change how it
performs a VM -> VM service call without having to change qrexec-client.
During an upgrade, it is possible that qrexec-daemon is newer than
qrexec-client, causing the new qrexec-daemon to try to use a calling
convention that the old qrexec-client doesn't support.  Doing VM -> VM
calls without calling execve() means this cannot happen.  VM -> dom0
and dom0 -> VM calls still use qrexec-client, but VM -> dom0 calls are
safe from domain name reuse races as of [1], and the latter do not
involve qrexec-daemon at all.

qrexec-daemon uses atexit() hooks to clean up its listening sockets, so
it is critical that these hooks do _not_ run in the child process.
Therefore, change the functions that used exit(), err(), or errx() to
return normally or call abort().  The return value is checked by the
caller and the functions are marked __attribute__((warn_unused_result))
to ensure this.  This also fixes some (but not all) cases where a
disposable VM would not be cleaned up by qrexec-client -k.

To ensure that code running in a child process gets coverage measured in
CI, it is necessary to add calls to __gcov_flush().  Add these calls
by means of a wrapper function around _exit(), which is much less
error-prone than calling __gcov_flush() and _exit() directly.

Part of QubesOS/qubes-issues#9066

[1]: 100fbb9 ("qrexec-client: Use XID to connect to qrexec daemon when possible")
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 2, 2024
It isn't needed in this case either.  This saves an execve() and makes
it easier to change the behavior of qrexec-client without breaking
existing qrexec-daemon processes.

For the same reasons as the previous commit, it is critical to avoid
calling exit() in the child process, either directly or indirectly.
Therefore, ensure that none of the code that is now run in the child
process calls exit().  There were places where this was done previously,
so this commit also fixes a bug.

Fixes: QubesOS/qubes-issues#9066
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 3, 2024
This saves an execve().  It also allows qrexec-daemon to change how it
performs a VM -> VM service call without having to change qrexec-client.
During an upgrade, it is possible that qrexec-daemon is newer than
qrexec-client, causing the new qrexec-daemon to try to use a calling
convention that the old qrexec-client doesn't support.  Doing VM -> VM
calls without calling execve() means this cannot happen.  VM -> dom0
and dom0 -> VM calls still use qrexec-client, but VM -> dom0 calls are
safe from domain name reuse races as of [1], and the latter do not
involve qrexec-daemon at all.

qrexec-daemon uses atexit() hooks to clean up its listening sockets, so
it is critical that these hooks do _not_ run in the child process.
Therefore, change the functions that used exit(), err(), or errx() to
return normally or call abort().  The return value is checked by the
caller and the functions are marked __attribute__((warn_unused_result))
to ensure this.  This also fixes some (but not all) cases where a
disposable VM would not be cleaned up by qrexec-client -k.

To ensure that code running in a child process gets coverage measured in
CI, it is necessary to add calls to __gcov_flush().  Add these calls
by means of a wrapper function around _exit(), which is much less
error-prone than calling __gcov_flush() and _exit() directly.

Part of QubesOS/qubes-issues#9066

[1]: 100fbb9 ("qrexec-client: Use XID to connect to qrexec daemon when possible")
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 3, 2024
It isn't needed in this case either.  This saves an execve() and makes
it easier to change the behavior of qrexec-client without breaking
existing qrexec-daemon processes.

For the same reasons as the previous commit, it is critical to avoid
calling exit() in the child process, either directly or indirectly.
Therefore, ensure that none of the code that is now run in the child
process calls exit().  There were places where this was done previously,
so this commit also fixes a bug.

Fixes: QubesOS/qubes-issues#9066
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 5, 2024
This saves an execve().  It also allows qrexec-daemon to change how it
performs a VM -> VM service call without having to change qrexec-client.
During an upgrade, it is possible that qrexec-daemon is older than
qrexec-client, causing the old qrexec-daemon to try to use a calling
convention that the new qrexec-client doesn't support.  Doing VM -> VM
calls without calling execve() means this cannot happen.  VM -> dom0
and dom0 -> VM calls still use qrexec-client, but VM -> dom0 calls are
safe from domain name reuse races as of [1], and the latter do not
involve qrexec-daemon at all.

qrexec-daemon uses atexit() hooks to clean up its listening sockets, so
it is critical that these hooks do _not_ run in the child process.
Therefore, change the functions that used exit(), err(), or errx() to
return normally or call abort().  The return value is checked by the
caller and the functions are marked __attribute__((warn_unused_result))
to ensure this.  This also fixes some (but not all) cases where a
disposable VM would not be cleaned up by qrexec-client -k.

To ensure that code running in a child process gets coverage measured in
CI, it is necessary to add calls to __gcov_flush().  Add these calls
by means of a wrapper function around _exit(), which is much less
error-prone than calling __gcov_flush() and _exit() directly.

Part of QubesOS/qubes-issues#9066

[1]: 100fbb9 ("qrexec-client: Use XID to connect to qrexec daemon when possible")
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 5, 2024
It isn't needed in this case either.  This saves an execve() and makes
it easier to change the behavior of qrexec-client without breaking
existing qrexec-daemon processes.

For the same reasons as the previous commit, it is critical to avoid
calling exit() in the child process, either directly or indirectly.
Therefore, ensure that none of the code that is now run in the child
process calls exit().  There were places where this was done previously,
so this commit also fixes a bug.

Fixes: QubesOS/qubes-issues#9066
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 5, 2024
It isn't needed in this case either.  This saves an execve() and makes
it easier to change the behavior of qrexec-client without breaking
existing qrexec-daemon processes.

For the same reasons as the previous commit, it is critical to avoid
calling exit() in the child process, either directly or indirectly.
Therefore, ensure that none of the code that is now run in the child
process calls exit().  There were places where this was done previously,
so this commit also fixes a bug.

Fixes: QubesOS/qubes-issues#9066
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 10, 2024
This saves an execve().  It also allows qrexec-daemon to change how it
performs a VM -> VM service call without having to change qrexec-client.
During an upgrade, it is possible that qrexec-daemon is older than
qrexec-client, causing the old qrexec-daemon to try to use a calling
convention that the new qrexec-client doesn't support.  Doing VM -> VM
calls without calling execve() means this cannot happen.  VM -> dom0
and dom0 -> VM calls still use qrexec-client, but VM -> dom0 calls are
safe from domain name reuse races as of [1], and the latter do not
involve qrexec-daemon at all.

qrexec-daemon uses atexit() hooks to clean up its listening sockets, so
it is critical that these hooks do _not_ run in the child process.
Therefore, change the functions that used exit(), err(), or errx() to
return normally or call abort().  The return value is checked by the
caller and the functions are marked __attribute__((warn_unused_result))
to ensure this.  This also fixes some (but not all) cases where a
disposable VM would not be cleaned up by qrexec-client -k.

To ensure proper behavior if there are any remaining functions calls to
exit(), add an atexit() hook in the child process that calls
__gcov_dump() (if coverage is enabled) followed by _exit(126).  All of
these calls will be in error paths (if not, there is a bug somewhere),
so the fixed exit code is okay.  Since atexit() hooks are run in reverse
order of registration, the call to _exit() will prevent other hooks
(such as the one that cleans up the listening sockets) from running.

To ensure that code running in a child process gets coverage measured in
CI, it is necessary to add calls to __gcov_dump().  Add these calls
by means of a wrapper function around _exit(), which is much less
error-prone than calling __gcov_dump() and _exit() directly.

Part of QubesOS/qubes-issues#9066

[1]: 100fbb9 ("qrexec-client: Use XID to connect to qrexec daemon when possible")
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 10, 2024
It isn't needed in this case either.  This saves an execve() and makes
it easier to change the behavior of qrexec-client without breaking
existing qrexec-daemon processes.

Fixes: QubesOS/qubes-issues#9066
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 11, 2024
This saves an execve().  It also allows qrexec-daemon to change how it
performs a VM -> VM service call without having to change qrexec-client.
During an upgrade, it is possible that qrexec-daemon is older than
qrexec-client, causing the old qrexec-daemon to try to use a calling
convention that the new qrexec-client doesn't support.  Doing VM -> VM
calls without calling execve() means this cannot happen.  VM -> dom0
and dom0 -> VM calls still use qrexec-client, but VM -> dom0 calls are
safe from domain name reuse races as of [1], and the latter do not
involve qrexec-daemon at all.

qrexec-daemon uses atexit() hooks to clean up its listening sockets, so
it is critical that these hooks do _not_ run in the child process.
Therefore, change the functions that used exit(), err(), or errx() to
return normally or call abort().  The return value is checked by the
caller and the functions are marked __attribute__((warn_unused_result))
to ensure this.  This also fixes some (but not all) cases where a
disposable VM would not be cleaned up by qrexec-client -k.

To ensure proper behavior if there are any remaining functions calls to
exit(), add an atexit() hook in the child process that calls
__gcov_dump() (if coverage is enabled) followed by _exit(126).  All of
these calls will be in error paths (if not, there is a bug somewhere),
so the fixed exit code is okay.  Since atexit() hooks are run in reverse
order of registration, the call to _exit() will prevent other hooks
(such as the one that cleans up the listening sockets) from running.

To ensure that code running in a child process gets coverage measured in
CI, it is necessary to add calls to __gcov_dump().  Add these calls
by means of a wrapper function around _exit(), which is much less
error-prone than calling __gcov_dump() and _exit() directly.

Part of QubesOS/qubes-issues#9066

[1]: 100fbb9 ("qrexec-client: Use XID to connect to qrexec daemon when possible")
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 11, 2024
It isn't needed in this case either.  This saves an execve() and makes
it easier to change the behavior of qrexec-client without breaking
existing qrexec-daemon processes.

The large refactoring fixes another bug: skip-service-descriptor=true
was ignored for dom0 socket services.  Thankfully this feature has not
made it into a release yet.

Fixes: QubesOS/qubes-issues#9066
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 11, 2024
This saves an execve().  It also allows qrexec-daemon to change how it
performs a VM -> VM service call without having to change qrexec-client.
During an upgrade, it is possible that qrexec-daemon is older than
qrexec-client, causing the old qrexec-daemon to try to use a calling
convention that the new qrexec-client doesn't support.  Doing VM -> VM
calls without calling execve() means this cannot happen.  VM -> dom0
and dom0 -> VM calls still use qrexec-client, but VM -> dom0 calls are
safe from domain name reuse races as of [1], and the latter do not
involve qrexec-daemon at all.

qrexec-daemon uses atexit() hooks to clean up its listening sockets, so
it is critical that these hooks do _not_ run in the child process.
Therefore, change the functions that used exit(), err(), or errx() to
return normally or call abort().  The return value is checked by the
caller and the functions are marked __attribute__((warn_unused_result))
to ensure this.  This also fixes some (but not all) cases where a
disposable VM would not be cleaned up by qrexec-client -k.

To ensure proper behavior if there are any remaining functions calls to
exit(), add an atexit() hook in the child process that calls
__gcov_dump() (if coverage is enabled) followed by _exit(126).  All of
these calls will be in error paths (if not, there is a bug somewhere),
so the fixed exit code is okay.  Since atexit() hooks are run in reverse
order of registration, the call to _exit() will prevent other hooks
(such as the one that cleans up the listening sockets) from running.

To ensure that code running in a child process gets coverage measured in
CI, it is necessary to add calls to __gcov_dump().  Add these calls
by means of a wrapper function around _exit(), which is much less
error-prone than calling __gcov_dump() and _exit() directly.

Part of QubesOS/qubes-issues#9066

[1]: 100fbb9 ("qrexec-client: Use XID to connect to qrexec daemon when possible")
DemiMarie added a commit to DemiMarie/qubes-core-qrexec that referenced this issue Apr 11, 2024
It isn't needed in this case either.  This saves an execve() and makes
it easier to change the behavior of qrexec-client without breaking
existing qrexec-daemon processes.

This also ensures that the service configuration loaded for VM -> dom0
service calls affects the 'struct qrexec_parsed_command' used for
service execution, which fixes a bug: skip-service-descriptor=true
was ignored for dom0 socket services.  Thankfully this feature has not
made it into a release yet.

Fixes: QubesOS/qubes-issues#9066
Fixes: c6ec6ef ("Support not passing metadata to socket-based services")
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: core P: default Priority: default. Default priority for new issues, to be replaced given sufficient information.
Projects
None yet
Development

No branches or pull requests

2 participants