
Assert hit in ttl_check_interval.rs #101

Closed
glademiller opened this issue Mar 9, 2020 · 7 comments
Comments

@glademiller

https://github.com/blackbeam/mysql_async/blob/master/src/conn/pool/ttl_check_inerval.rs#L70

left: 160,
right: 0', /usr/local/cargo/registry/src/github.jparrowsec.cn-1ecc6299db9ec823/mysql_async-0.22.1/src/conn/pool/ttl_check_inerval.rs:70:13

@blackbeam
Owner

@jonhoo, looks like this assert is part of #65. Do you have any thoughts?

@jonhoo
Contributor

jonhoo commented Mar 10, 2020

Hmm.. This suggests that there are available connections in the pool, and yet for some reason there are threads waiting for connections. That doesn't seem right at all.

@glademiller Under what circumstances did this error occur? Is it reproducible? Could you try to come up with a simple test case where the error presents itself?

@glademiller
Author

Any thoughts on what might be the cause? I have only seen this in an environment where the most I might be able to do is add some debug logs. My staging environment, with a lot less going on, hasn't hit the problem at all. In case it is important, I have a pool that I clone between an actix-web service and a background processing queue. If there is something I can put in the logs that would help with debugging, please let me know.

@glademiller
Author

Oh, another bit of context: this only started occurring after I upgraded to 0.22 and set con_ttl, the inactive connection TTL, and the ttl_check_interval.

@jonhoo
Contributor

jonhoo commented Mar 10, 2020

@glademiller Yeah, it makes sense you only see it in 0.22, since the assertion came with the rewrite of the pool logic in #92. I spent some time tracing through the logic of the code, and I think I know what's going on. The issue is specifically in this code:

exchange.available.push_back($conn.into());
if let Some(w) = exchange.waiting.pop_front() {
w.wake();
}

We push to available and wake a waiting task that will then pick the connection up, but it hasn't picked it up yet at that point. So if there are many futures waiting for a connection and one becomes free, there is a short window during which there is a connection in available and there are still futures in waiting. @blackbeam I think the assertion is simply wrong and should be removed. We could keep a count of "notified" and try to assert using that, but it's not worth the effort.
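
To make the window concrete, here is a minimal sketch of the race, not the actual mysql_async code: Exchange, Conn, and return_conn below are simplified, hypothetical stand-ins. Waking a future only schedules it to be polled later, so the connection stays in available until that poll happens, and a concurrent observer such as the TTL check task can briefly see both queues non-empty.

use std::collections::VecDeque;
use std::task::Waker;

// Simplified stand-ins; the real types live in mysql_async's pool module.
struct Conn;

struct Exchange {
    available: VecDeque<Conn>,
    waiting: VecDeque<Waker>,
}

fn return_conn(exchange: &mut Exchange, conn: Conn) {
    // Step 1: the connection is back in `available`.
    exchange.available.push_back(conn);

    // Step 2: one waiter is woken, but waking only schedules its future to be
    // polled again; the connection is not handed to it here. It pops from
    // `available` only when the executor next polls it.
    if let Some(w) = exchange.waiting.pop_front() {
        w.wake();
    }

    // Between step 2 and that next poll, anything else that inspects the
    // exchange (e.g. the TTL check task) can observe `available` non-empty
    // while other futures still sit in `waiting` -- exactly the state the
    // removed assertion treated as impossible.
}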

@blackbeam
Owner

Assert removed in cd214f1. @glademiller, is it possible to run this revision in your environment (as a git dependency)?
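
For reference, running an unreleased revision like that generally means pointing Cargo at the git repository rather than crates.io. A rough sketch of the Cargo.toml entry (the rev value here is the commit mentioned above; adjust it to whatever revision you actually want to test):

# Hypothetical Cargo.toml dependency entry, pinned to the commit above.
[dependencies]
mysql_async = { git = "https://github.com/blackbeam/mysql_async", rev = "cd214f1" }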

@blackbeam
Owner

I've modeled the Pool, Recycler, and TtlCheckInterval behavior in TLA+ and only found a minor issue with the exchange.exist counter, so I'm closing this for now.

There is no 100% guarantee that my model matches the implementation, so feel free to reopen if needed.
