Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

macOS: pass JackMachSemaphore send right directly via shm #785

Closed
wants to merge 1 commit into from

Conversation

p--b
Copy link
Contributor

@p--b p--b commented Aug 4, 2021

Previously, JackMachSemaphore would communicate the send right for the
semaphore object from the server to the client via a named service
regstered via bootstrap_register. However,

  • This is deprecated (see bootstrap.h) - the recommendation is to pass
    the send right for the port directly, rather than indirecting via the
    service name and the bootstrap server

  • This led to macOS: clients can only connect 98 times per boot #784, wherein old service names were not cleaned up, and
    the server would abandon attempting to create the 99th service, leading
    to clients failing to connect after they had connected 98 times on a
    given boot of the OS.

This commit implements the advice from the deprecation notice in
bootstrap.h, communicating the send right for the semaphore port
directly via the shared memory segment that we used to use to
communicate the service name, bypassing the need to register a named
service entirely. This resolves #784.

NB: I am unsure what guarantees JACK makes about the stability of the client/server
interface, or exactly how clients load the dylib containing this functionality - but I assume this
would not be backwards-compatible between different client/server versions (?) - I'm unsure of
what the implications of this are or if some protocol version bump / similar would be required?

Previously, JackMachSemaphore would communicate the send right for the
semaphore object from the server to the client via a named service
regstered via `bootstrap_register`. However,

 * This is deprecated (see bootstrap.h) - the recommendation is to pass
 the send right for the port directly, rather than indirecting via the
 service name and the bootstrap server

 * This led to jackaudio#784, wherein old service names were not cleaned up, and
 the server would abandon attempting to create the 99th service, leading
 to clients failing to connect after they had connected 98 times on a
 given boot of the OS.

 This commit implements the advice from the deprecation notice in
 bootstrap.h, communicating the send right for the semaphore port
 directly via the shared memory segment that we used to use to
 communicate the service name, bypassing the need to register a named
 service entirely. This resolves jackaudio#784.
@falkTX
Copy link
Member

falkTX commented Aug 4, 2021

Thanks a lot!

Changes seem all good, but we need to confirm that arm64 + x64 interoperability still works.
Do you have an M1 machine?
Since we have automated builds now, we can try those right away. (with the macos-universal artifact)

PS: There are no worries regarding keeping ABI, as JACK lib and server must always be installed together.

@p--b
Copy link
Contributor Author

p--b commented Aug 4, 2021

Unfortunately I don't have access to any M1 hardware at this time - sorry! Hence my slight naïveté on this;

To be clear, is the interoperability question if you have a native aarch64 client talking to a jackd running via rosetta 2?
Or just both client&server running natively on aarch64? I'm not sure what is supported?

If the latter then mach_port_t ought to be fine going via shm; afaict mach_port_t is just a pointer to a struct ipc_port (https://developer.apple.com/documentation/kernel/mach_port_t, https://opensource.apple.com/source/xnu/xnu-201/osfmk/ipc/ipc_port.h.auto.html) so it's just passing a pointer around.

However I can imagine pointers could look different between the two arches, if that's what's expected to work? I'm afraid I have little idea how rosetta2 actually works, so whether a mach_port_t (i.e. a pointer) provided by darwin to an emulated binary could be used by a native binary is... perhaps questionable?! But idk if that's what you mean?

@falkTX
Copy link
Member

falkTX commented Aug 4, 2021

We can have an arm64 jackd server that is able to listen and talk to both native arm64 clients and x64 ones via rosetta2.
I have confirmed this myself.

Though somehow a x64 jackd server cannot establish connections with arm64 clients. I dont know why at this point.

And yeah, this being in shared memory can be questionable for multi-arch access.
But since pointer size of arm64 and x64 are both 8 bytes, maybe it can just work.
We can put the pointer as union so that it always reserves 8 bytes for it, but since Apple is not going back to 32bit, lets not care for now.

I have an M1 myself, will set it up again for use and try this out soon.

@p--b
Copy link
Contributor Author

p--b commented Aug 4, 2021

Brilliant, thanks. Let me know if you need anything else from me.

@falkTX
Copy link
Member

falkTX commented Aug 7, 2021

On some quick testing, this does not seem like it is working at all.
The semaphore waiting fails, which is expected I guess since these semaphores by design are not meant to be used for IPC.
As I understood from the old code, the bootstrap registers a service that allows to use the semaphore globally.

FYI jack_lsp is not a good test, since it does not activate the client and thus doesnt need/use the semaphores.
For real testing we need clients that register ports and connect them to something.

I will give this more testing at a later point. But it doesn't sound promising.

@p--b
Copy link
Contributor Author

p--b commented Aug 8, 2021

Interesting. Apologies, I hadn't realised that jack_lsp wouldn't exercise this. Indeed trying with jack_simple_client it is broken.

I'd be kinda surprised if a semaphore didn't work across processes - however - It seems I'd missed a crucial bit of understanding, which is that each mach task has its own private port table, so a pointer to a given port won't be valid in another task's context unless explicitly moved via a mach message (obligatory stack overflow clues are available).

From that it seems there's two threads to pull on;

  1. continue with the same approach, just send the semaphore port via a mach_msg somehow instead of via shm
  2. use a SysV named semaphore instead

I can have a look into either/both approaches next week - any thoughts / past experience that would be relevant before I have a go?

@falkTX
Copy link
Member

falkTX commented Aug 8, 2021

not much familiar with point 1, but worth noting perhaps that jack allows more than 1 server to be active at the same time (using different server names). so it needs to remain compatible (or better said, not become incompatible)

there seems to be some issues with named semaphores and sandboxing https://stackoverflow.com/questions/24451774/how-to-enable-ipc-sysv-sem-in-mac-sandboxing
also note that you cant use sem_timedwait because it is not implemented on macOS, in case you thought of POSIX ones.

@p--b
Copy link
Contributor Author

p--b commented Aug 8, 2021

Thanks, yes this makes sense. Maybe (1) would be easier, which would involve going back to using the bootstrap server to exchange ports between the processes... digging into it more I have a hunch that what was happening is that as we were registering the semaphore's port with the bootstrap server (not a port directly belonging to the jackd task), the semaphore's port perhaps continued to exist and never became dead, even when the semaphore was destroyed, which prevented registering a replacement with the bootstrap server (maybe?).

A solution to this might therefore to be to use mach_port_allocate to create a port with the receive right belonging to the jackd task, and register that port with the bootstrap server. The client then looks up that port, sends a message to jackd requesting a send right to the semaphore and providing a send right back to the client, and jackd then finally sends a send right to the semaphore port to the client. A bit convoluted but might work. Major faff would be that the mach_msg primitive used to do this is blocking with a timeout, so we'd probably need to spin up a bonus thread in at least the server to handle receiving the request from the client and sending a reply back to it...

The more I think about this the less enamoured with it I become. Maybe I'll go back to trying to work out why the semaphore port wasn't becoming dead when the semaphore was destroyed, because simply replacing the old dead port with a new port with the same service name in the bootstrap server would be much cleaner...

@p--b
Copy link
Contributor Author

p--b commented Aug 8, 2021

Well, I got sniped...

So, after having spent entirely too much time browsing the available source for launchd (which provides the bootstrap server), it seems that launchd performs its own book-keeping of which services are "active", relying on some slightly hairy logic and notifications from the kernel when things die. I didn't quite get to the bottom of why but it seems that any service which has its port as a semaphore send right will never become "inactive" within launchd - I suspect it's because the semaphore's ports are in a special kernel IPC domain and so they don't appear to belong to the user when launchd is notified that the port has died. As the service never becomes inactive in launchd's eyes, it will never allow the service's ports to be replaced. This means that any approach that relies on registering a semaphore send right with the bootstrap server directly won't work (at least until this ?bug? is fixed in macOS).

Investigating non-semaphore ports being registered with the bootstrap server showed them being cleaned up correctly when jackd died, which led me back to my previous thought of registering normally allocated ports and doing a mach_msg handshake. I've done a quick and dirty test implementation and it seems to work (testing with jack_simple_client this time ;) ). It is now very much bedtime but I will tidy it up and update this PR sometime tomorrow later today.

tldr; fixed via new approach based on mach_msg, updated PR incoming.

@p--b p--b closed this Aug 8, 2021
@p--b p--b deleted the semaphore-no-bootstrap branch August 8, 2021 23:36
@p--b p--b restored the semaphore-no-bootstrap branch August 8, 2021 23:41
p--b added a commit to fourieraudio/jack2 that referenced this pull request Aug 8, 2021
Previously, JackMachSemaphore would communciatee the send right for the
semaphore object from the server to a client via a named service
registered via `bootstrap_register`. However, to do this, it would
register the semaphore's port as the service port directly.

In theory this ought to be fine, however in practice, macOS `launchd`,
which provides the `bootstrap_register` interface, does not correctly
detect when such a port becomes dead, and incorrectly believes that the
service that it provides is forever alive, even past the end of the
`jackd` process' (and therefore the semaphore's) existence. This seems
to be *specific* to semaphore ports, as `launchd` is expecting a
standard IPC port, owned by the task, not the kernel. This prevents
`jackd` from later registering another service with the same name, as
launchd rejects the registration as conflicting with an active service.

To get around this, `jackd` previously added a counter to the end of the
named service registrations, allowing old services to remain in the
system until the end of the session. To prevent things getting out of
hand, this was capped at 98 service registrations for a given semaphore
name. This led to jackaudio#784, in which running a client for the 99th time
resulted in the semaphore creation failing and the client failing to
connect.

As launchd outlives mutliple runs of `jackd`, this situation persisted
across restarts of jackd, requiring a restart of the user's session
(i.e. a reboot) to fix.

An initial attempt at fixing this (see jackaudio#785) tried passing the port
rights directly via shared memory, however mach is too clever for us and
foils that plan by having port names be looked up in a per-task table
(sensible when you think about it).

In this commit, we use mach IPC messages as the system was designed, to
transfer the send right for the semaphore from the server to the client.
It works something like this:

* Server creates IPC port and registers it globally via `bootstrap_register`
* Server listens on IPC port for messages
* Client looks up IPC port via `bootstrap_look_up`
* Client sends it a message
* Server replies with a message containing a send right to the
semaphore's port
* Client is then free to use the semaphore port as before.

This resolves jackaudio#784.
p--b added a commit to fourieraudio/jack2 that referenced this pull request Aug 8, 2021
Previously, JackMachSemaphore would communicate the send right for the
semaphore object from the server to a client via a named service
registered via `bootstrap_register`. However, to do this, it would
register the semaphore's port as the service port directly.

In theory this ought to be fine, however in practice, macOS `launchd`,
which provides the `bootstrap_register` interface, does not correctly
detect when such a port becomes dead, and incorrectly believes that the
service that it provides is forever alive, even past the end of the
`jackd` process' (and therefore the semaphore's) existence. This seems
to be *specific* to semaphore ports, as `launchd` is expecting a
standard IPC port, owned by the task, not the kernel. This prevents
`jackd` from later registering another service with the same name, as
`launchd` rejects the registration as conflicting with an active service.

To get around this, `jackd` previously added a counter to the end of the
named service registrations, allowing old services to remain in the
system until the end of the session. To prevent things getting out of
hand, this was capped at 98 service registrations for a given semaphore
name. This led to jackaudio#784, in which running a client for the 99th time
resulted in the semaphore creation failing and the client failing to
connect.

As `launchd` outlives multiple runs of `jackd`, this situation persisted
across restarts of `jackd`, requiring a restart of the user's session
(i.e. a reboot) to fix.

An initial attempt at fixing this (see jackaudio#785) tried passing the port
rights directly via shared memory, however mach is too clever for us and
foils that plan by having port names be looked up in a per-task table
(sensible when you think about it).

In this commit, we use mach IPC messages as the system was designed, to
transfer the send right for the semaphore from the server to the client.
It works something like this:

* Server creates IPC port and registers it globally via `bootstrap_register`
* Server listens on IPC port for messages
* Client looks up IPC port via `bootstrap_look_up`
* Client sends it a message
* Server replies with a message containing a send right to the
semaphore's port
* Client is then free to use the semaphore port as before.

This resolves jackaudio#784.
p--b added a commit to fourieraudio/jack2 that referenced this pull request Aug 9, 2021
Previously, JackMachSemaphore would communicate the send right for the
semaphore object from the server to a client via a named service
registered via `bootstrap_register`. However, to do this, it would
register the semaphore's port as the service port directly.

In theory this ought to be fine, however in practice, macOS `launchd`,
which provides the `bootstrap_register` interface, does not correctly
detect when such a port becomes dead, and incorrectly believes that the
service that it provides is forever alive, even past the end of the
`jackd` process' (and therefore the semaphore's) existence. This seems
to be *specific* to semaphore ports, as `launchd` is expecting a
standard IPC port, owned by the task, not the kernel. This prevents
`jackd` from later registering another service with the same name, as
`launchd` rejects the registration as conflicting with an active service.

To get around this, `jackd` previously added a counter to the end of the
named service registrations, allowing old services to remain in the
system until the end of the session. To prevent things getting out of
hand, this was capped at 98 service registrations for a given semaphore
name. This led to jackaudio#784, in which running a client for the 99th time
resulted in the semaphore creation failing and the client failing to
connect.

As `launchd` outlives multiple runs of `jackd`, this situation persisted
across restarts of `jackd`, requiring a restart of the user's session
(i.e. a reboot) to fix.

An initial attempt at fixing this (see jackaudio#785) tried passing the port
rights directly via shared memory, however mach is too clever for us and
foils that plan by having port names be looked up in a per-task table
(sensible when you think about it).

In this commit, we use mach IPC messages to transfer the send right for
the semaphore from the server to the client. By registering a standard
IPC port with the bootstrap server, the service registrations are
correctly torn down when the ports are destroyed.

It works something like this:

* Server creates IPC port and registers it globally via `bootstrap_register`
* Server listens on IPC port for messages
* Client looks up IPC port via `bootstrap_look_up`
* Client sends it a message
* Server replies with a message containing a send right to the
semaphore's port
* Client is then free to use the semaphore port as before.

This resolves jackaudio#784.
falkTX pushed a commit that referenced this pull request Aug 14, 2021
* macOS: Pass JackMachSemaphore send right via mach_msg IPC

Previously, JackMachSemaphore would communicate the send right for the
semaphore object from the server to a client via a named service
registered via `bootstrap_register`. However, to do this, it would
register the semaphore's port as the service port directly.

In theory this ought to be fine, however in practice, macOS `launchd`,
which provides the `bootstrap_register` interface, does not correctly
detect when such a port becomes dead, and incorrectly believes that the
service that it provides is forever alive, even past the end of the
`jackd` process' (and therefore the semaphore's) existence. This seems
to be *specific* to semaphore ports, as `launchd` is expecting a
standard IPC port, owned by the task, not the kernel. This prevents
`jackd` from later registering another service with the same name, as
`launchd` rejects the registration as conflicting with an active service.

To get around this, `jackd` previously added a counter to the end of the
named service registrations, allowing old services to remain in the
system until the end of the session. To prevent things getting out of
hand, this was capped at 98 service registrations for a given semaphore
name. This led to #784, in which running a client for the 99th time
resulted in the semaphore creation failing and the client failing to
connect.

As `launchd` outlives multiple runs of `jackd`, this situation persisted
across restarts of `jackd`, requiring a restart of the user's session
(i.e. a reboot) to fix.

An initial attempt at fixing this (see #785) tried passing the port
rights directly via shared memory, however mach is too clever for us and
foils that plan by having port names be looked up in a per-task table
(sensible when you think about it).

In this commit, we use mach IPC messages to transfer the send right for
the semaphore from the server to the client. By registering a standard
IPC port with the bootstrap server, the service registrations are
correctly torn down when the ports are destroyed.

It works something like this:

* Server creates IPC port and registers it globally via `bootstrap_register`
* Server listens on IPC port for messages
* Client looks up IPC port via `bootstrap_look_up`
* Client sends it a message
* Server replies with a message containing a send right to the
semaphore's port
* Client is then free to use the semaphore port as before.

This resolves #784.

* Improve error handling

* Add myself to Authors
falkTX pushed a commit that referenced this pull request Jan 15, 2022
* macOS: Pass JackMachSemaphore send right via mach_msg IPC

Previously, JackMachSemaphore would communicate the send right for the
semaphore object from the server to a client via a named service
registered via `bootstrap_register`. However, to do this, it would
register the semaphore's port as the service port directly.

In theory this ought to be fine, however in practice, macOS `launchd`,
which provides the `bootstrap_register` interface, does not correctly
detect when such a port becomes dead, and incorrectly believes that the
service that it provides is forever alive, even past the end of the
`jackd` process' (and therefore the semaphore's) existence. This seems
to be *specific* to semaphore ports, as `launchd` is expecting a
standard IPC port, owned by the task, not the kernel. This prevents
`jackd` from later registering another service with the same name, as
`launchd` rejects the registration as conflicting with an active service.

To get around this, `jackd` previously added a counter to the end of the
named service registrations, allowing old services to remain in the
system until the end of the session. To prevent things getting out of
hand, this was capped at 98 service registrations for a given semaphore
name. This led to #784, in which running a client for the 99th time
resulted in the semaphore creation failing and the client failing to
connect.

As `launchd` outlives multiple runs of `jackd`, this situation persisted
across restarts of `jackd`, requiring a restart of the user's session
(i.e. a reboot) to fix.

An initial attempt at fixing this (see #785) tried passing the port
rights directly via shared memory, however mach is too clever for us and
foils that plan by having port names be looked up in a per-task table
(sensible when you think about it).

In this commit, we use mach IPC messages to transfer the send right for
the semaphore from the server to the client. By registering a standard
IPC port with the bootstrap server, the service registrations are
correctly torn down when the ports are destroyed.

It works something like this:

* Server creates IPC port and registers it globally via `bootstrap_register`
* Server listens on IPC port for messages
* Client looks up IPC port via `bootstrap_look_up`
* Client sends it a message
* Server replies with a message containing a send right to the
semaphore's port
* Client is then free to use the semaphore port as before.

This resolves #784.

* Improve error handling

* Add myself to Authors
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

macOS: clients can only connect 98 times per boot
2 participants