Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

signalling for hole punching #1168

Merged
merged 17 commits into from
Sep 8, 2021
Merged

Conversation

marten-seemann
Copy link
Contributor

@marten-seemann marten-seemann commented Aug 21, 2021

This is a copy of #1057, with a cleaned up commit history. I intend to continue working on this PR, and eventually close #1057.
Closes #1057.

Note: This PR targets the hole-punching branch, which is where we'll assemble all hole-punching related changes before merging into master.

@marten-seemann marten-seemann force-pushed the hole-punching-signaling branch from 75ac61e to 5cf36a6 Compare August 24, 2021 14:43
@marten-seemann marten-seemann force-pushed the hole-punching-signaling branch 3 times, most recently from 7cc8b06 to 667bf78 Compare August 29, 2021 11:01
@marten-seemann marten-seemann marked this pull request as ready for review August 29, 2021 11:04
@marten-seemann marten-seemann force-pushed the hole-punching-signaling branch 2 times, most recently from 25c5b0a to 997a2a0 Compare August 30, 2021 09:51
@marten-seemann marten-seemann changed the base branch from master to hole-punching August 30, 2021 09:51
@marten-seemann marten-seemann force-pushed the hole-punching-signaling branch from 997a2a0 to cac4a42 Compare August 30, 2021 09:53
@marten-seemann marten-seemann force-pushed the hole-punching-signaling branch 4 times, most recently from ebe2411 to 69573b9 Compare August 31, 2021 10:25
@marten-seemann
Copy link
Contributor Author

Unfortunately, the changes in libp2p/go-libp2p-autonat#109 were not enough to prevent race conditions here, as for setting up the relay connection, we need to mark all addresses as public, and then for hole punching we need to mark these addresses as private again, while the host is running. This is inherently racy.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly coding things, but there are some design questions around when to initiate a hole-punch and when to bail.

go.mod Outdated Show resolved Hide resolved
if !forceDirect {
if h.Network().Connectedness(pi.ID) == network.Connected {
return nil
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we need some way to check for connections with specific properties, right? Or do we just want to always dial in this case anyways.

@@ -208,6 +215,13 @@ func NewHost(ctx context.Context, n network.Network, opts *HostOpts) (*BasicHost
return nil, fmt.Errorf("failed to create Identify service: %s", err)
}

if opts.EnableHolePunching {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future: Dependency injection would make this a lot less invasive.

p2p/protocol/holepunch/tracer.go Show resolved Hide resolved
p2p/protocol/holepunch/pb/holepunch.proto Show resolved Hide resolved
p2p/protocol/holepunch/coordination.go Outdated Show resolved Hide resolved
p2p/protocol/holepunch/coordination.go Outdated Show resolved Hide resolved
p2p/protocol/holepunch/coordination.go Outdated Show resolved Hide resolved
p2p/protocol/holepunch/coordination.go Outdated Show resolved Hide resolved

// Hole punch if it's an inbound proxy connection.
// If we already have a direct connection with the remote peer, this will be a no-op.
if dir == network.DirInbound && isRelayAddress(v.RemoteMultiaddr()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably bail early if we're known to be dialable. This would largely mitigate a lot of DoS concerns.

Also, is there some discussion on picking a side to initiate? On one hand, having the receiver initiate gives them a chance to try a "normal" dial. On the other hand, it means the receiver may do a lot of work on behalf of a potentially malicious party.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there some discussion on picking a side to initiate? On one hand, having the receiver initiate gives them a chance to try a "normal" dial. On the other hand, it means the receiver may do a lot of work on behalf of a potentially malicious party.

The node behind the NAT is the one dialing. The reason for that is that there's a good change that the peer is not behind a NAT, and we can just dial him directly. Only if the peer is not dialable, we have to fall back to hole punching.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably bail early if we're known to be dialable. This would largely mitigate a lot of DoS concerns.

Then the remote address wouldn't be a relay address, would it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, we're looking at the remote address being a relay address, AND the connection being an inbound connection.
If we know to be publicly dialable, we shouldn't be behind a relay in the first place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The node behind the NAT is the one dialing. The reason for that is that there's a good change that the peer is not behind a NAT, and we can just dial him directly. Only if the peer is not dialable, we have to fall back to hole punching.

Makes sense.

If we know to be publicly dialable, we shouldn't be behind a relay in the first place.

That doesn't stop someone from making a relayed connection. We'll accept those regardless (there are some valid use-cases for this, e.g., no common transport).

To know if we're publicly dialable, we need to check our "nat" status (subscribe to the reachability event).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Let's do that in a separate PR (this PR is targeting a feature-branch anyway).

@marten-seemann marten-seemann force-pushed the hole-punching-signaling branch 2 times, most recently from f4e9663 to ce0878c Compare September 3, 2021 16:24
@marten-seemann marten-seemann force-pushed the hole-punching-signaling branch from f7d9920 to 18045c9 Compare September 4, 2021 11:18
p2p/protocol/holepunch/coordination.go Outdated Show resolved Hide resolved
p2p/protocol/holepunch/coordination.go Outdated Show resolved Hide resolved
@marten-seemann marten-seemann force-pushed the hole-punching-signaling branch 2 times, most recently from 54861f8 to 0ed5f82 Compare September 7, 2021 18:10
@marten-seemann marten-seemann force-pushed the hole-punching-signaling branch from 0ed5f82 to 84900aa Compare September 7, 2021 18:16
// Close closes the Hole Punch Service.
func (hs *Service) Close() error {
hs.closeMx.Lock()
hs.closed = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to make this idempotent (not critical, but it's something I generally expect from close methods). I.e., if already closed, skip to the Wait call.

Really, everything here should already be idempotent, but the tracer's close method may not be.

Copy link
Contributor Author

@marten-seemann marten-seemann Sep 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is idempotent, Wait will return immediately if the counter is already 0. Otherwise, both (concurrent) calls to Close will wait for the Wait to return, which technically is the more correct way to do this, if you want to interpret "Close returned" as "this service is now properly closed".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. My point is that the tracer's close function could get called multiple times, and it may not be expecting that.

Of course, it should just "deal" with it, so we're probably fine.

Comment on lines +108 to +109
hpCtx := network.WithUseTransient(hs.ctx, "hole-punch")
sCtx := network.WithNoDial(hpCtx, "hole-punch")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super-nit: store this context and reuse it.

@marten-seemann marten-seemann merged commit 8206fcd into hole-punching Sep 8, 2021
@marten-seemann marten-seemann mentioned this pull request Sep 8, 2021
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants