pubsub: flow control creates a lot of RuntimeErrors #4288

Closed
arthurdarcet opened this issue Oct 31, 2017 · 6 comments
Labels: api: pubsub (Issues related to the Pub/Sub API.); priority: p1 (Important issue which blocks shipping the next release. Will be fixed prior to next release.)

Comments

@arthurdarcet

When the thread policy gets a callback request for "lease", BasePolicy.lease is called from the request consumer thread.

If this lease call determines that too many messages are currently being handled, it tries to stop the consumer thread. But it does so by calling helper_threads.stop('callback requests worker'), which in turn tries to join the "callback requests worker" thread (after adding a STOP item to its queue).
This raises a RuntimeError: cannot join current thread, which is caught by the QueueCallbackThread and logged. I don't think the error in itself causes any harm, because the subscription is re-opened once the load is low enough and the helper threads get restarted.

It's still ugly though, because the end of the helper_threads.stop(…) call is never executed and the self._helper_threads dict still contains the dead thread.
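
For readers who want to see the failure mode in isolation, here is a minimal sketch that uses only the standard library, not the actual google-cloud-pubsub code. The names `STOP`, `run_demo`, `stop`, and `worker` are hypothetical stand-ins for the library's internals; the sketch only assumes that a request executed on the worker thread ends up joining that same thread.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel standing in for the library's STOP item


def run_demo():
    requests = queue.Queue()
    helper_threads = {}  # name -> thread, loosely mirroring self._helper_threads
    logged = []          # stands in for the worker's error log

    def stop(name):
        requests.put(STOP)
        helper_threads[name].join()  # raises RuntimeError if called from that thread
        del helper_threads[name]     # never reached in that case

    def worker():
        while True:
            item = requests.get()
            if item is STOP:
                return
            try:
                item()
            except RuntimeError as exc:
                # The error is caught and recorded, so the worker keeps going
                # until it picks up the STOP item.
                logged.append(str(exc))

    thread = threading.Thread(target=worker, name="callback requests worker")
    helper_threads[thread.name] = thread
    thread.start()

    # A "lease"-style request that runs on the worker and tries to stop the worker.
    requests.put(lambda: stop("callback requests worker"))
    thread.join()
    return logged, helper_threads, thread


if __name__ == "__main__":
    logged, helper_threads, thread = run_demo()
    print(logged)
    print(thread.is_alive(), list(helper_threads))
```

In this sketch the worker still exits, because the STOP item was queued before the failed join, but the cleanup after the join is skipped and the dict keeps a reference to the dead thread, which matches the behavior described above.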

@dhermes added the api: pubsub label Oct 31, 2017
@lukesneeringer self-assigned this Nov 3, 2017
@lukesneeringer
Contributor

@arthurdarcet Thanks for reporting. I clearly did not do nearly enough testing around flow control. I will see about getting a fix in as soon as possible.

@lukesneeringer added the priority: p1 label Nov 3, 2017
@arthurdarcet
Author

@lukesneeringer Thanks. We actually encountered another, much more serious issue:
Flow control was not working at all here: when we subscribed to a topic with ~5M messages on hold, the process would fill our RAM very quickly and ignore the specified flow control settings.

I haven't opened another issue for this, as I'm not exactly sure where we went wrong. We ended up switching to Go for this (very small) project, and we did not look too much into this issue.

@lukesneeringer
Contributor

(Status update: Still working on this and related flow control issues.)

@dhermes
Contributor

dhermes commented Nov 30, 2017

Thanks for filing @arthurdarcet. This issue is mostly handled by #4498 (i.e. a thread won't try to join itself).

However, I'd still like a reproducible case where the load exceeds the threshold, just so we can kick all the tires.

Check out https://pypi.org/project/google-cloud-pubsub/0.29.2/ though, it has the fix from #4498 in it.
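
As an illustration of the "a thread won't try to join itself" idea, here is a minimal sketch. It is not the actual change made in #4498; `safe_join` is a hypothetical helper name, and the only library calls used are `threading.current_thread()` and `Thread.join()` from the standard library.

```python
import threading


def safe_join(thread, timeout=None):
    """Join `thread` unless it is the calling thread.

    Returns True if a join was attempted, False if it was skipped because
    the caller is running on `thread` itself.
    """
    if thread is threading.current_thread():
        # Joining the current thread would raise RuntimeError, so skip it
        # and let the caller carry on with its cleanup.
        return False
    thread.join(timeout)
    return True
```

With a guard like this, a stop routine invoked from the worker thread can still finish its bookkeeping (for example, removing the thread from a registry) instead of aborting partway through.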

@dhermes
Contributor

dhermes commented Dec 11, 2017

@arthurdarcet Can you try out 0.29.4 and see if the flow control issues are resolved?

UPDATE: Sorry, I realize your team switched to the Go client, so please disregard.

@dhermes
Contributor

dhermes commented Dec 11, 2017

I am closing this since the "flow control breaks because a thread can't join() itself" problem has mostly been resolved by #4558 (though see the 0.29.3 and 0.29.4 release notes for related changes).

The "process would fill our RAM very fast" issue has been reported elsewhere, so there is no need to leave this issue open. (I'm still trying to reproduce that issue, or at least to do it without filling up a topic with 5M unhandled messages. I hope to be able to soon.)

@dhermes closed this as completed Dec 11, 2017

3 participants