Race condition when client starts before server causing hanging method call #1109
Comments
@ids1024 Thanks so much for filing this. I think a colleague of mine also noticed something like this a month back but then we forgot about it. :) I'll look into this soon. Likely (hopefully) it's something small. |
btw, in zbus 5, you don't need SignalEmitter outside of method handlers. You can just do:
    conn
        .object_server()
        // You can cache the interface ref from this call.
        .interface("/com/ids1024/FooBar")
        .await
        .baz(count)
        .await?
and also from within the method handlers, you can directly call the methods on the Oh and it's almost always best to make use of |
Yeah, I know it's also possible to create the object in the connection builder (but I think should behave the same?). Good to know about that for signal emitting. At some point I'll need to do more testing with the clients where I've seen issues related to restarting the server. It also seems like gdbus handles this differently than zbus, and proxies don't work when the server has restarted ( |
Not really. If the service is launched because of a method call, the method call will be sent immediately after the service registers its name, so if the object server/interface has not yet been set up, the call will go unanswered. I actually first thought that's what is happening in your case but then I saw that the client actually waits for a signal from the service first and service only sends that out after it's all set up. Hence why I said, it's better in general to always use the connection builder to avoid such issues.
You mean other issues with zbus clients, that are different from this issue? 😮 |
I mean that I'm not sure this is the same issue I've seen in actual use. With luck it will have the same cause and be easy to fix. |
I was able to reproduce the issue here. I did some tracing, and the issue is that the object server task only gets launched after the method call has already been received and hence dropped (because the object server isn't there yet to act on it):
I'll check how this can be helped.. |
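To make the ordering problem described above concrete, here is a toy model in plain Rust (std only). It is a sketch under stated assumptions, not zbus code and not a description of zbus internals: `ToyService`, `Setup`, and `call_right_after_name_claim` are hypothetical names made up for illustration. The only things taken from this thread are the `AckBaz` method name and the observation that a call arriving before the object server exists gets no reply.

```rust
/// Toy stand-in for a service-side connection (hypothetical, not a zbus type).
struct ToyService {
    name_claimed: bool,
    object_server_ready: bool,
}

impl ToyService {
    fn new() -> Self {
        ToyService { name_claimed: false, object_server_ready: false }
    }

    /// Models the service acquiring its well-known bus name.
    fn claim_name(&mut self) {
        self.name_claimed = true;
    }

    /// Models the object server (and its interfaces) becoming available.
    fn start_object_server(&mut self) {
        self.object_server_ready = true;
    }

    /// Delivers one method call. Returns `Some(reply)` if something handled
    /// it, or `None` if it was dropped (the caller would wait indefinitely).
    fn deliver(&mut self, method: &str) -> Option<String> {
        if self.name_claimed && self.object_server_ready {
            Some(format!("reply to {}", method))
        } else {
            None
        }
    }
}

/// The two setup orders being compared.
enum Setup {
    /// Name first, object server afterwards: leaves a gap.
    NameThenObjectServer,
    /// Object server first, name afterwards: no gap.
    ObjectServerThenName,
}

/// Simulates a client whose call lands immediately after the name is claimed.
fn call_right_after_name_claim(setup: Setup) -> Option<String> {
    let mut svc = ToyService::new();
    match setup {
        Setup::NameThenObjectServer => {
            svc.claim_name();
            // The call arrives in the gap, before the object server exists.
            let reply = svc.deliver("AckBaz");
            svc.start_object_server();
            reply
        }
        Setup::ObjectServerThenName => {
            svc.start_object_server();
            svc.claim_name();
            svc.deliver("AckBaz")
        }
    }
}
```

In this model, `call_right_after_name_claim(Setup::NameThenObjectServer)` yields `None` (the call goes unanswered), while `Setup::ObjectServerThenName` yields a reply. The second ordering is the one the connection builder is recommended for in this thread.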
So turns out my first hunch was right on the money. The In the meantime, please use |
I think I've mostly used Looks like another issue I've seen is just due to Seems somewhat unexpected, but not too broken. Should be possible to work-around by manually dealing with |
Ah ok. Could you please create another issue for that? 🙏
yeah or |
Two issues with making
So given that, and the fact that the builder pattern should be preferred anyway (in general, but especially when the timing of the object server is important), I think it's best we don't do this and instead better document the strong recommendation to prefer the builder. Any advice on when and how to document this would be appreciated. Perhaps a question in our FAQ? |
@ids1024 Just to remind, I'm awaiting your response here. :) |
I've noticed a few issues in method calls and signals when a DBus service restarts, or isn't initially running. (Some probably being issues with C clients connecting to my zbus-based service.) So I wrote a simple test for this.
I've noticed an interesting behavior, which may or may not be related to issues I've seen before.
Server code:
Client:
The server sends the Baz signal every second, and the client simply listens for this and replies with AckBaz.
If I start the client, then the server, it often (but not always) hangs at ack..., calling the AckBaz method but not getting the reply. If it doesn't hang, restarting the server likely will make it do so.
dbus-monitor --session shows the method call but not the response, so I guess this is a bug in the zbus server? Logging receive_message shows it does seem to get the method call message.
The issue is inconsistent, and uncommenting the sleep can fix it, so it seems to be some sort of race condition? Not sure how.