Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rework peer tracking logic to handle multiple connections #132

Merged
merged 5 commits into from
Dec 13, 2018

Conversation

vyzo
Copy link
Collaborator

@vyzo vyzo commented Dec 13, 2018

Multiple connections are simply not handled at all by the current code; see #130.

This is an attempt to rework the peer tracking logic so that multiple connections are handled gracefully.

@ghost ghost assigned vyzo Dec 13, 2018
@ghost ghost added the in progress label Dec 13, 2018
@vyzo vyzo requested a review from bigs December 13, 2018 15:00
@vyzo
Copy link
Collaborator Author

vyzo commented Dec 13, 2018

cc @cheatfate

@vyzo vyzo force-pushed the fix/peer-tracking branch from 234bef0 to 2621f89 Compare December 13, 2018 17:52
@vyzo vyzo force-pushed the fix/peer-tracking branch from be6d7e4 to fc7795c Compare December 13, 2018 20:52
@Stebalien
Copy link
Member

I believe this is still going to be pretty racy:

Hello Packet

An old handleSendingMessage can stick around after we replace an old connection. This old handler can read the hello packet and throw it into oblivion.

Fixes:

  1. handleSendingMessage should probably request a hello packet directly instead of having it sent on the general outgoing channel (that could be shared).
  2. Don't share the outgoing channel. That is, when we start a new peer handler, we should close the old outgoing channel and open a new one. This will kill off the old handleSendingMessage handler.

Just 2 may be sufficient but I'm not sure.


This will also get rid of that sleep...

@vyzo
Copy link
Collaborator Author

vyzo commented Dec 13, 2018

2 should be sufficient I think, will update.

if p.host.Network().Connectedness(pid) == inet.Connected {
// still connected, must be a duplicate connection being closed.
// we respawn the writer as we need to ensure there is a stream active
log.Warning("peer declared dead but still connected; respawning writer: ", pid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a chance we'll hit this frequently the connection may close before we realize we're disconnected. It may not be ab issue but we should watch for this.

p.rt.AddPeer(pid, s.Protocol())

case pid := <-p.newPeerError:
delete(p.peers, pid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a race here where:

  1. We have an old zombie connection that doesn't know it's dead yet.
  2. We get a new connection.

With the current logic, we may still try to open a stream using the old connection. We'll then never try to create one with the new connection.

Unfortunately, this is a pretty common situation when a connection dies and we replace it immediately.

Maybe we should drop this channel and send a notification along peerDead instead? If we still think we're connected, we'll continue to retry until we realize we're actually disconnected.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only used when the stream failed to open; we can't pass it through the peerDead channel because the peer may simply not support the pubsub protocols.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is if the stream fails to open because the connection was closed (if it was a duplicate connection).
Is there a way to discriminate whether the failure was an unsupported protocol?
In this case we can notify on peerError for unsupported protocol and peerDead for connection closed errors.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

named error in go-multistream.ErrNotSupported can be used for this per discussion on irc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@Stebalien
Copy link
Member

(why does connection state logic have to be this damn hard...)

@vyzo vyzo merged commit a12c523 into master Dec 13, 2018
@ghost ghost removed the in progress label Dec 13, 2018
@vyzo vyzo deleted the fix/peer-tracking branch December 13, 2018 22:47
@vyzo vyzo mentioned this pull request Dec 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants