Fix bug where peers connecting to one another simultaneously never establish a pubsub stream #105
When initializing a PubSub connection to another peer, we now inspect the peer ID of the peer that initiated the `Conn`. On each subsequent connection to the same peer, we compare (in integral space) the initiating peer ID of that connection against the initiating peer ID of the existing connection. If, and only if, the new initiator's peer ID is greater than the existing initiator's peer ID, we close our old stream and open a stream on the new connection. This unambiguous rule ensures that both peers keep the same stream. Previously, in a simultaneous connect scenario, each peer closed its own stream (the first to be notified) in favor of the other peer's stream, which the other peer would in turn close.
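A minimal sketch of what "greater than, in integral space" could look like, assuming a peer ID is available as a raw byte slice and is read as a big-endian unsigned integer. The function name `initiatorGreater` and the sample byte values are hypothetical and are not taken from the patch.

```go
package main

import (
	"fmt"
	"math/big"
)

// initiatorGreater reports whether the candidate initiator ID is strictly
// greater than the existing initiator ID when both byte slices are read as
// big-endian unsigned integers. (Hypothetical helper, not from the patch.)
func initiatorGreater(existing, candidate []byte) bool {
	e := new(big.Int).SetBytes(existing)
	c := new(big.Int).SetBytes(candidate)
	return c.Cmp(e) > 0
}

func main() {
	// Made-up ID bytes, used only for illustration.
	a := []byte{0x12, 0x20, 0x01}
	b := []byte{0x12, 0x20, 0xff}

	fmt.Println(initiatorGreater(a, b)) // true: switch to the new connection
	fmt.Println(initiatorGreater(b, a)) // false: keep the existing stream
	fmt.Println(initiatorGreater(a, a)) // false: equal IDs never trigger a switch
}
```

Because the comparison is strict, two connections initiated by the same peer never cause the stream to be replaced under this sketch.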
in particular, i could see storing an
this is very ugly...
pubsub.go
@@ -86,7 +87,7 @@ type PubSub struct {
 	// eval thunk in event loop
 	eval chan func()

-	peers map[peer.ID]chan *RPC
+	peers map[peer.ID]*rpcpair
we shouldn't need to change this type (and the floodsub/gossipsub code) at all.
Let's just keep the connection with the highest ID in key space.
tagged as blocked until we can decide on a better fix at swarm level
@Stebalien this is a kind of complicated issue. connection de-duplication only seems to happen at the
Seems it's an instance of this: libp2p/go-libp2p-swarm#79
@vyzo same deal but not quite the same, as far as I know. However, in this case, I think we can just use two streams, right? We could also disambiguate but IMO, two streams is the correct approach.
yeah that’s my plan. should we assume users will deal with reconnection?
On Fri, Sep 14, 2018 at 16:43, Steven Allen wrote:
> @vyzo same deal but not quite the same, as far as I know. However, in *this* case, I think we can just use two streams, right? We could *also* disambiguate but IMO, two streams is the correct approach.
Reconnection? In this case, I'd just have both sides use their own stream for sending, that should "just work".
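A rough model of the two-stream idea, assuming each peer writes only to its own outbound stream and reads only from the stream the other side opened. Buffered Go channels stand in for streams here; the `peer` type and `connect` helper are hypothetical and do not correspond to the libp2p API.

```go
package main

import "fmt"

// peer sends only on its own outbound stream and reads only from the
// inbound stream opened by the other side. (Toy model, not libp2p types.)
type peer struct {
	name string
	out  chan<- string
	in   <-chan string
}

func (p peer) send(msg string) { p.out <- p.name + ": " + msg }
func (p peer) recv() string    { return <-p.in }

// connect wires two peers with one unidirectional "stream" per direction,
// so neither side ever has to pick a winner between two streams.
func connect(a, b string) (peer, peer) {
	aToB := make(chan string, 1)
	bToA := make(chan string, 1)
	return peer{name: a, out: aToB, in: bToA}, peer{name: b, out: bToA, in: aToB}
}

func main() {
	a, b := connect("A", "B")
	a.send("hello")
	b.send("hi")
	fmt.Println(b.recv()) // A: hello
	fmt.Println(a.recv()) // B: hi
}
```

In this model no stream is ever closed in favor of another, so the order in which the two connections are noticed does not matter.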
closing this as it's plain wrong!
This patch adds a failing test and then provides a fix for a subtle bug in which two peers connecting to one another at the same time fail to establish an active pubsub stream. The bug is rooted in the fact that both peers tend to be notified that their OUTBOUND dial succeeded before they are notified of their peer's INBOUND dial. Under the old logic, both peers would then happily throw away the stream they established on their outbound connection in favor of the peer's inbound connection, leaving both peers holding on to bad streams.
This fix is a bit heavy-handed, but the logic was the most sound I could muster. My patch changes the behavior so that peers favor the connection initiated by the peer with the greatest peer ID (evaluated in integral space).
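The difference between the two behaviors can be illustrated with a toy model, assuming each peer sees its own outbound connection first and that peer IDs can be reduced to plain integers. The `conn` type and both selection functions are hypothetical simplifications, not code from this branch.

```go
package main

import "fmt"

// conn is a toy connection identified only by the ID of the peer that dialed it.
type conn struct{ initiator int }

// keepLatest models the old behavior: whichever connection is notified last wins.
func keepLatest(notified []conn) conn {
	return notified[len(notified)-1]
}

// keepGreatestInitiator models the patched behavior: a later connection
// replaces the current one only if its initiator ID is strictly greater.
func keepGreatestInitiator(notified []conn) conn {
	kept := notified[0]
	for _, c := range notified[1:] {
		if c.initiator > kept.initiator {
			kept = c
		}
	}
	return kept
}

func main() {
	fromA, fromB := conn{initiator: 1}, conn{initiator: 2}

	// Each peer is notified of its own outbound dial first.
	seenByA := []conn{fromA, fromB}
	seenByB := []conn{fromB, fromA}

	// Old rule: the peers pick different connections.
	fmt.Println(keepLatest(seenByA) == keepLatest(seenByB)) // false
	// New rule: both peers settle on the connection dialed by the greater ID.
	fmt.Println(keepGreatestInitiator(seenByA) == keepGreatestInitiator(seenByB)) // true
}
```

The point of the rule is that its result does not depend on notification order, so both sides reach the same answer independently.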
This branch closes #93.