Possible regression in 0.2.14 and further (hangs, stack overflow) #2422
Comments
Are you by any chance using the futures v0.1 crate?
Nope, only 0.3.
Well, there have been a few issues with hangs after this got introduced, which exposed a collection of buggy sub-schedulers. If your application hangs, it's likely due to such a buggy sub-scheduler somewhere.

As for the stack overflow, that sometimes happens when people try to make big stack arrays, e.g.:

```rust
let mut buf = [0; 4096];
stream.read(&mut buf).await?;
```

and stuff like this. This should be avoided in futures, because it makes the future object massive, which can cause the call that polls or moves the future to overflow the stack.

Of course, it could also just be an infinite recursive loop. Your backtrace would probably tell you in that case.
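For illustration, a minimal sketch of the heap-allocated alternative (the tokio `TcpStream` and the 4096-byte size here are just assumptions, not taken from this thread):

```rust
use tokio::io::AsyncReadExt;
use tokio::net::TcpStream;

// The 4 KiB buffer lives on the heap, so the future only has to store a
// Vec (pointer, length, capacity) instead of a 4096-byte array inline.
async fn read_some(stream: &mut TcpStream) -> std::io::Result<Vec<u8>> {
    let mut buf = vec![0u8; 4096];
    let n = stream.read(&mut buf).await?;
    buf.truncate(n);
    Ok(buf)
}
```

If a particular future is unavoidably large, wrapping it in `Box::pin(...)` at the call site keeps only a pointer inside the parent future.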
Do a snapshot of the process (thread stacks) when stuck at 100% CPU. That should show which fn it is stuck in.
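(For reference, one way to grab such a snapshot on Linux, assuming gdb is available and `<PID>` is the stuck process: `gdb --batch -p <PID> -ex "thread apply all bt"` should print a backtrace for every thread and then detach.)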
@Darksonn @carllerche Thanks for the advice! I'll check my stack-allocated stuff, take thread snapshots, and return with more facts in a couple of days.
I was wondering if you had any further details on this issue. If not, I will have to close it due to lack of details.
Yep, I guess let's close it. I've since gotten rid of all my "custom" futures with twisted logic and shifted all the hard work to tokio (via spawning). So my best guess for now is that I made some mistakes in the app's logic initially, and it just "happened" to work as expected. Thanks everyone for your attention, and sorry I was not able to provide further details.
Version
`0.2.14` and higher (up to `0.2.18` so far)

Platform
Linux ... 5.6.3-arch1-1 #1 SMP PREEMPT Wed, 08 Apr 2020 07:47:16 +0000 x86_64 GNU/Linux
Linux ... 5.6.5-1.el7.elrepo.x86_64 #1 SMP Thu Apr 16 14:02:22 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux
Description
Disclaimer
Unfortunately, I am currently unable to provide an MRE or even a code sample (NDA) that reproduces the issue, but I'm still filing this issue in case somebody else has run into the same problem.
Problem
I'm developing a reverse-proxy-like application, and after updating `tokio` from `0.2.13` to `0.2.18` I found that my app hangs while consuming 100% of a CPU core (out of many cores). As I mentioned before, I cannot disclose all the details, but in general the app listens on a port, accepts client connections, and forwards traffic to a number of internal servers.

Under the hood, I use `FuturesUnordered` and `Select`s from `futures-util` and do a lot of polls manually, in the order I found to be most suitable. I don't `spawn` anything and use the default tokio runtime (via the `#[tokio::main]` macro).
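To give an idea of the overall shape without the real code, here is a made-up, heavily simplified sketch. Everything in it is invented (addresses, buffer size, the request/response handling), and it uses `tokio::select!` only to keep the example short, whereas the real code composes `Select`s from `futures-util`:

```rust
use futures_util::stream::{FuturesUnordered, StreamExt};
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::{TcpListener, TcpStream};

// Invented per-connection future: shuttle one request/response at a time
// between the client and a single hard-coded internal server.
async fn handle(mut client: TcpStream) -> std::io::Result<()> {
    let mut upstream = TcpStream::connect("127.0.0.1:9000").await?;
    let mut buf = vec![0u8; 4096]; // heap buffer, so the future stays small
    loop {
        let n = client.read(&mut buf).await?;
        if n == 0 {
            return Ok(()); // client closed the connection
        }
        upstream.write_all(&buf[..n]).await?;
        let m = upstream.read(&mut buf).await?;
        if m == 0 {
            return Ok(()); // upstream closed the connection
        }
        client.write_all(&buf[..m]).await?;
    }
}

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let mut listener = TcpListener::bind("0.0.0.0:8080").await?;
    // Every per-connection future lives in this one set, polled by this
    // single task; nothing is ever handed to tokio::spawn.
    let mut in_flight = FuturesUnordered::new();

    loop {
        tokio::select! {
            // New client: push its future into the set instead of spawning it.
            accepted = listener.accept() => {
                let (socket, _addr) = accepted?;
                in_flight.push(handle(socket));
            }
            // Drive the existing connections; the refutable pattern means the
            // branch is simply skipped while the set is empty.
            Some(result) = in_flight.next() => {
                if let Err(e) = result {
                    eprintln!("connection error: {}", e);
                }
            }
        }
    }
}
```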
After upgrading to `tokio-0.2.18` I found that my app establishes about 100 connections to the internal servers and then hangs completely, consuming 100% of a CPU core. All attempts to establish a connection to the port it listens on fail with a timeout.

I then thought "okay, there was a major upgrade to tokio's scheduler in 0.2.14, so probably I shouldn't manage all the futures myself and should just spawn the tasks!", so I replaced the `FuturesUnordered` and `Select`s with spawns and yay! That seemed to solve the whole issue... until a lot of the "internal" servers went offline, the connections were scheduled to be re-established, and I got a stack-overflow error.
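As a point of comparison, here is a made-up sketch of a spawn-per-upstream layout with an iterative (loop-based) reconnect, which is one way to keep the retry path from recursing; the addresses, delay, and the elided proxy logic are all invented, not taken from this report:

```rust
use std::time::Duration;
use tokio::net::TcpStream;

// One spawned task per internal server; reconnecting is a plain loop,
// so a wave of disconnects only re-runs the loop body, it never recurses.
async fn keep_connected(addr: String) {
    loop {
        match TcpStream::connect(addr.as_str()).await {
            Ok(_stream) => {
                // ... proxy traffic over `_stream` until the peer goes away ...
            }
            Err(e) => eprintln!("connect to {} failed: {}", addr, e),
        }
        // Back off before the next attempt (tokio 0.2; `tokio::time::sleep` in 1.x).
        tokio::time::delay_for(Duration::from_secs(1)).await;
    }
}

#[tokio::main]
async fn main() {
    for addr in &["10.0.0.1:9000", "10.0.0.2:9000"] {
        tokio::spawn(keep_connected(addr.to_string()));
    }
    // Keep the runtime alive; a real app would accept client connections here.
    futures_util::future::pending::<()>().await;
}
```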
So I had to downgrade to `tokio-0.2.13`, where everything just works (tm).

My question is: how do I investigate the root cause of my issue? Where should I look first?
Eventually, I would like to provide an MRE, but so far it's just a cry for help :)
Thanks!