signalling for hole punching #1168
Unfortunately, the changes in libp2p/go-libp2p-autonat#109 were not enough to prevent race conditions here. To set up the relay connection, we need to mark all addresses as public, and then for hole punching we need to mark these addresses as private again, all while the host is running. This is inherently racy.
Mostly coding things, but there are some design questions around when to initiate a hole-punch and when to bail.
if !forceDirect {
	if h.Network().Connectedness(pi.ID) == network.Connected {
		return nil
	}
}
It looks like we need some way to check for connections with specific properties, right? Or do we just want to always dial in this case anyway?
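One possible shape for such a check, as a minimal sketch rather than the PR's actual code: treat a connection as "direct" when its remote multiaddr has no `/p2p-circuit` component, and only skip the dial if at least one direct connection exists. The `conn` type and both helper names are hypothetical stand-ins so the example runs without go-libp2p; addresses are plain strings here instead of multiaddr values.

```go
package main

import (
	"fmt"
	"strings"
)

// conn is a hypothetical stand-in for a network connection.
// Only the remote multiaddr (as a string) matters for this sketch.
type conn struct {
	remoteAddr string
}

// isRelayAddr reports whether addr contains a /p2p-circuit component,
// which is how relayed (circuit) addresses are marked.
func isRelayAddr(addr string) bool {
	for _, part := range strings.Split(addr, "/") {
		if part == "p2p-circuit" {
			return true
		}
	}
	return false
}

// hasDirectConn reports whether at least one connection is not relayed.
func hasDirectConn(conns []conn) bool {
	for _, c := range conns {
		if !isRelayAddr(c.remoteAddr) {
			return true
		}
	}
	return false
}

func main() {
	relayed := []conn{{remoteAddr: "/ip4/203.0.113.7/tcp/4001/p2p/QmRelay/p2p-circuit"}}
	mixed := append(relayed, conn{remoteAddr: "/ip4/198.51.100.2/tcp/4001"})

	fmt.Println(hasDirectConn(relayed)) // false
	fmt.Println(hasDirectConn(mixed))   // true
}
```

With a helper like this, the early return could be keyed on "has a direct connection" instead of plain connectedness, which may remove the need for a separate force flag.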
@@ -208,6 +215,13 @@ func NewHost(ctx context.Context, n network.Network, opts *HostOpts) (*BasicHost
		return nil, fmt.Errorf("failed to create Identify service: %s", err)
	}

	if opts.EnableHolePunching {
Future: Dependency injection would make this a lot less invasive.
// Hole punch if it's an inbound proxy connection.
// If we already have a direct connection with the remote peer, this will be a no-op.
if dir == network.DirInbound && isRelayAddress(v.RemoteMultiaddr()) {
We should probably bail early if we're known to be dialable. This would largely mitigate a lot of DoS concerns.
Also, is there some discussion on picking a side to initiate? On one hand, having the receiver initiate gives them a chance to try a "normal" dial. On the other hand, it means the receiver may do a lot of work on behalf of a potentially malicious party.
> Also, is there some discussion on picking a side to initiate? On one hand, having the receiver initiate gives them a chance to try a "normal" dial. On the other hand, it means the receiver may do a lot of work on behalf of a potentially malicious party.
The node behind the NAT is the one dialing. The reason is that there's a good chance that the peer is not behind a NAT, and we can just dial it directly. Only if the peer is not dialable do we have to fall back to hole punching.
> We should probably bail early if we're known to be dialable. This would largely mitigate a lot of DoS concerns.
Then the remote address wouldn't be a relay address, would it?
To clarify, we're looking at the remote address being a relay address, AND the connection being an inbound connection.
If we know we're publicly dialable, we shouldn't be behind a relay in the first place.
> The node behind the NAT is the one dialing. The reason is that there's a good chance that the peer is not behind a NAT, and we can just dial it directly. Only if the peer is not dialable do we have to fall back to hole punching.
Makes sense.
> If we know we're publicly dialable, we shouldn't be behind a relay in the first place.
That doesn't stop someone from making a relayed connection. We'll accept those regardless (there are some valid use-cases for this, e.g., no common transport).
To know if we're publicly dialable, we need to check our "nat" status (subscribe to the reachability event).
That makes sense. Let's do that in a separate PR (this PR is targeting a feature-branch anyway).
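For that follow-up, the gating discussed in this thread could look roughly like the sketch below. It is not the PR's code: the `direction` and `reachability` types and the `shouldHolePunch` name are hypothetical, and the sketch assumes the latest reachability status has already been obtained (for example, from the reachability event mentioned above) and is simply passed in.

```go
package main

import "fmt"

// direction and reachability are hypothetical stand-ins for the
// corresponding go-libp2p network types, so the sketch is self-contained.
type direction int

const (
	dirInbound direction = iota
	dirOutbound
)

type reachability int

const (
	reachabilityUnknown reachability = iota
	reachabilityPublic
	reachabilityPrivate
)

// shouldHolePunch decides whether to start a hole punch for a new connection.
// It bails early when we believe we are publicly dialable, and otherwise
// requires an inbound connection that arrived over a relay.
func shouldHolePunch(dir direction, remoteIsRelay bool, r reachability) bool {
	if r == reachabilityPublic {
		// The remote can dial us directly; don't do work on its behalf.
		return false
	}
	return dir == dirInbound && remoteIsRelay
}

func main() {
	fmt.Println(shouldHolePunch(dirInbound, true, reachabilityPrivate))  // true
	fmt.Println(shouldHolePunch(dirInbound, true, reachabilityPublic))   // false
	fmt.Println(shouldHolePunch(dirOutbound, true, reachabilityPrivate)) // false
	fmt.Println(shouldHolePunch(dirInbound, false, reachabilityUnknown)) // false
}
```

Treating an unknown status like a private one is a choice made for this sketch only; whether that is the right default is one of the open design questions.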
// Close closes the Hole Punch Service.
func (hs *Service) Close() error {
	hs.closeMx.Lock()
	hs.closed = true
It would be nice to make this idempotent (not critical, but it's something I generally expect from close methods). I.e., if already closed, skip to the `Wait` call.
Really, everything here should already be idempotent, but the tracer's close method may not be.
This is idempotent, `Wait` will return immediately if the counter is already 0. Otherwise, both (concurrent) calls to `Close` will wait for the `Wait` to return, which technically is the more correct way to do this, if you want to interpret "`Close` returned" as "this service is now properly closed".
I agree. My point is that the tracer's close function could get called multiple times, and it may not be expecting that.
Of course, it should just "deal" with it, so we're probably fine.
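If we ever wanted to guarantee that the tracer is closed exactly once, one option is sketched below. Note that it swaps in `sync.Once` instead of the `closeMx`/`closed` pair used in this PR, and that the `service` and `tracer` types and their fields are hypothetical stand-ins, not the real hole punch service.

```go
package main

import (
	"fmt"
	"sync"
)

// tracer is a hypothetical stand-in that counts how often Close is called.
type tracer struct {
	closes int
}

func (t *tracer) Close() { t.closes++ }

// service is a hypothetical stand-in for the hole punch service.
type service struct {
	closeOnce sync.Once
	refCount  sync.WaitGroup // tracks in-flight background work
	tracer    *tracer
}

// Close is idempotent: the tracer is closed exactly once, while every call
// (including concurrent ones) still waits for background work to finish.
func (s *service) Close() error {
	s.closeOnce.Do(func() {
		s.tracer.Close()
	})
	s.refCount.Wait() // returns immediately once the counter is 0
	return nil
}

func main() {
	s := &service{tracer: &tracer{}}

	// Simulate one piece of background work that finishes on its own.
	s.refCount.Add(1)
	go func() {
		defer s.refCount.Done()
	}()

	_ = s.Close()
	_ = s.Close() // second call is a no-op for the tracer

	fmt.Println(s.tracer.closes) // 1
}
```

This keeps the property described above, where "`Close` returned" means the service is fully shut down, because the `Wait` call sits outside the `Once`.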
hpCtx := network.WithUseTransient(hs.ctx, "hole-punch")
sCtx := network.WithNoDial(hpCtx, "hole-punch")
super-nit: store this context and reuse it.
This is a copy of #1057, with a cleaned up commit history. I intend to continue working on this PR, and eventually close #1057.
Closes #1057.
Note: This PR targets the `hole-punching` branch, which is where we'll assemble all hole-punching related changes before merging into `master`.