
our TCP listener is nonblocking so we *must* set the accepted stream to blocking explicitly #3154

Merged
merged 1 commit into mimblewimble:master from explicit_blocking on Nov 30, 2019

Conversation

antiochp
Member

@antiochp commented Nov 30, 2019

A nonblocking TCP listener will accept nonblocking TCP streams by default.
This is very surprising, to put it mildly.

We were creating nonblocking streams when accepting inbound connections from peers, but blocking streams when connecting to outbound peers. This inconsistency made the problem really hard to track down.

You can see this if you add debug logging in try_break! when matching on WouldBlock. On some connections we were seeing these scroll by at a rapid rate, but not all connections.
It turns out it was only on inbound connections.

So you can only reproduce the issue where you actually accept inbound connections: on a public node, or locally with a couple of nodes on localhost.

This probably explains all kinds of weird connection related problems.

One example: if we request a txhashset from a public node, and we are an inbound connection from that node's perspective, then the nonblocking stream will prevent that node from reliably sending the full txhashset attachment back uncorrupted.

@antiochp antiochp added this to the 3.0.0 milestone Nov 30, 2019
@antiochp antiochp requested a review from hashmap November 30, 2019 22:28
@DavidBurkett
Contributor

Amazing. I would've never thought that would happen and not fail catastrophically somewhere. Awesome find!

@antiochp
Member Author

I suspect this also explains part of the slow header sync. If you request headers from an outbound peer, they send you a reasonably large chunk of bytes via a write_all, which is susceptible to a WouldBlock internally because it is running on a nonblocking stream that we assume is blocking and safe to treat that way.
The end result is that the receiving end may not get a full Headers msg back in some scenarios and is forced to time the connection out, as it is blocking on the receiving end.

@antiochp
Member Author

I have successfully sync'd against a node running this code several times in a row without any issues.

@DavidBurkett
Contributor

Once we get this merged, I'll ask everyone I know of who runs public nodes to build from master. I'm aware of dozens of people who simply can't use Grin without this.

@hashmap
Contributor

hashmap commented Nov 30, 2019

@antiochp do you have a link to the docs or source code?

I did this in the beginning a while ago and then removed it out of curiosity; nothing changed. I also run a public node and don't see any issues; moreover, I could not reproduce the txhashset issue for 2 weeks. This would also cause high CPU load, yet my node runs at 0.3-1% CPU with at least a dozen inbound peers. Having said this, Gary experienced CPU spikes and fixed them by adding a sleep after WouldBlock. The only explanation I have is that my node has a very good connection or is extremely lucky.

@hashmap
Contributor

hashmap commented Nov 30, 2019

I'm fine with the change. After reading an example in the Rust docs I agree with your idea; however, I still don't understand how it works on my machine. I have 29 inbound peers now and, even with the TUI, I'm under 5% CPU; with nonblocking IO it was at least 60%.

@hashmap merged commit 7f7d51a into mimblewimble:master Nov 30, 2019
@antiochp deleted the explicit_blocking branch December 2, 2019 10:29
@antiochp
Member Author

antiochp commented Dec 2, 2019

> @antiochp do you have a link to the docs or source code?

I actually don't, no. This is all from experimenting with nodes locally (and a lot of debug logs).

But I do think the "principle of least surprise" may apply here: 99% of the time, if you have a nonblocking listener, you will want nonblocking streams accepted from it, and it would actually be surprising if you were then required to set each stream to nonblocking yourself.

Our usage is definitely an outlier - we have a nonblocking listener but we do want blocking streams accepted.

It would be nice if there were a simple way to read back the current blocking/nonblocking state of a stream, but there does not appear to be one.

@antiochp
Member Author

antiochp commented Dec 3, 2019

Hmm.

@hashmap has been testing this under various conditions and it appears this is only an issue on macOS.
So I'm not entirely sure how much effect this fix will have in the real world on mainnet, as presumably the majority of public nodes are on Linux.

- macOS: accepts nonblocking streams by default from a nonblocking listener
- Linux: accepts blocking streams by default from a nonblocking listener
- Windows: who knows?
