Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

defaults: do TLS by default for encryption #2650

Merged
merged 2 commits into from
Jan 24, 2024

Conversation

Jorropo
Copy link
Contributor

@Jorropo Jorropo commented Nov 22, 2023

Tls has much better throughput, the handshake benchmark is fairly noisy, there is no significant performance difference, however it does allocate more.

goos: linux
goarch: amd64
cpu: AMD Ryzen 5 3600 6-Core Processor
BenchmarkNoise/throughput/32KiB-12 24984             46605 ns/op         703.10 MB/s          37 B/op          2 allocs/op
BenchmarkNoise/throughput/1MiB-12   1134           1459483 ns/op         718.46 MB/s         663 B/op         34 allocs/op
BenchmarkNoise/handshakes-12        1302           1054533 ns/op           32691 B/op        348 allocs/op
BenchmarkTls/throughput/32KiB-12   49006             24309 ns/op        1347.99 MB/s          50 B/op          2 allocs/op
BenchmarkTls/throughput/1MiB-12     1747            778498 ns/op        1346.92 MB/s        1603 B/op         64 allocs/op
BenchmarkTls/handshakes-12          1116           1045475 ns/op          105257 B/op       1478 allocs/op

@Jorropo
Copy link
Contributor Author

Jorropo commented Nov 22, 2023

Throughput is basically comparing go's hardware accelerated AES impl vs the software chacha one.

Jorropo added a commit to ipfs/kubo that referenced this pull request Nov 22, 2023
Jorropo added a commit to ipfs/kubo that referenced this pull request Nov 22, 2023
@marten-seemann
Copy link
Contributor

Any idea why TLS is allocating like crazy during the transfer?

@Jorropo
Copy link
Contributor Author

Jorropo commented Nov 22, 2023

So is noise, I think it could be net.Pipe's context switching (benchmark artifact).
AFAIK tls allocates once per refresh:

Copy link
Contributor

@marten-seemann marten-seemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, modulo the location of the benchmarks. On my M1 Mac, the difference is quite extreme:

goos: darwin
goarch: arm64
pkg: github.com/libp2p/go-libp2p/p2p/security/tls
BenchmarkTls/throughput/32KiB-10   	   70203	     15562 ns/op	2105.59 MB/s
BenchmarkTls/throughput/1MiB-10    	    2516	    495383 ns/op	2116.70 MB/s
BenchmarkTls/handshakes-10         	    2110	    549245 ns/op
PASS
ok  	github.com/libp2p/go-libp2p/p2p/security/tls	4.445s
goos: darwin
goarch: arm64
pkg: github.com/libp2p/go-libp2p/p2p/security/noise
BenchmarkNoise/throughput/32KiB-10                             	   17421	     71990 ns/op	 455.17 MB/s
BenchmarkNoise/throughput/1MiB-10                              	     663	   1819395 ns/op	 576.33 MB/s

p2p/security/noise/transport_test.go Outdated Show resolved Hide resolved
p2p/security/tls/transport_test.go Outdated Show resolved Hide resolved
@marten-seemann
Copy link
Contributor

@Jorropo Would you be able to push this PR across the finish line, or do you want us to take it over?

Tls has much better throughput, the handshake benchmark is fairly noisy, there is no significant performance difference, however it does allocate more.

```
goos: linux
goarch: amd64
cpu: AMD Ryzen 5 3600 6-Core Processor
BenchmarkNoise/throughput/32KiB-12 24984	     46605 ns/op	 703.10 MB/s	      37 B/op	       2 allocs/op
BenchmarkNoise/throughput/1MiB-12   1134	   1459483 ns/op	 718.46 MB/s	     663 B/op	      34 allocs/op
BenchmarkNoise/handshakes-12        1302	   1054533 ns/op	   32691 B/op	     348 allocs/op
BenchmarkTls/throughput/32KiB-12   49006	     24309 ns/op	1347.99 MB/s	      50 B/op	       2 allocs/op
BenchmarkTls/throughput/1MiB-12     1747	    778498 ns/op	1346.92 MB/s	    1603 B/op	      64 allocs/op
BenchmarkTls/handshakes-12          1116           1045475 ns/op	  105257 B/op	    1478 allocs/op
```
Copy link
Member

@sukunrt sukunrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks @Jorropo!
Should we remove the asserts from the benchmarks?

@Jorropo
Copy link
Contributor Author

Jorropo commented Dec 30, 2023

The assert is extremely cheap, it ensure the benchmark is actually benchmarking as intended.

@aschmahmann
Copy link
Collaborator

aschmahmann commented Dec 30, 2023

Note: If merging this I would make sure to call out this change in the release notes so people can decide if/when they want to make the change. IIUC this will add one roundtrip to all TCP/WS/WSS connections in between go-libp2p nodes in the short term (while nodes update) and may also add one roundtrip to all of those connection types to libp2p implementations that don't support TLS (is that all the others, or is my info out of date?).

If I recall the reason Noise was made the default over TLS was that TLS was only really supported in go-libp2p and the potential for extra roundtrips weren't deemed worthwhile as a default (despite TLS being better for go-libp2p<->go-libp2p connections).

@sukunrt
Copy link
Member

sukunrt commented Dec 31, 2023

this will add one roundtrip to all TCP/WS/WSS connections in between go-libp2p nodes in the short term (while nodes update)

This shouldn't happen, I'll confirm though.
The dialer(new go-libp2p version) should eagerly write /tls now and the server should select this preference, so this should only take 1 Round Trip. In the other case the dialer(old go-libp2p version) should eagerly write /noise and the server(new go-libp2p version) should select the client's preference.

may also add one roundtrip to all of those connection types to libp2p implementations that don't support TLS

I'll point these out in the release notes.

@sukunrt sukunrt mentioned this pull request Jan 1, 2024
19 tasks
@sukunrt sukunrt merged commit e33f4c4 into libp2p:master Jan 24, 2024
11 checks passed
@Dreamacro
Copy link
Contributor

In fact, TLS is no faster than Noise and encapsulates more overhead.

The reason for the benchmark being due to Noise is that modern CPU have AES-related instruction sets, and Noise is a non-negotiated cryptographic transport layer protocol that, in libp2p, uses Noise_XX_DH25519_ChachaPoly_SHA256.

The only problem is that it causes break change

Noise_XX_DH25519_ChachaPoly_SHA256

% go test -benchmem -run="^$" -bench . ./p2p/test/security                       
goos: darwin
goarch: arm64
pkg: github.com/libp2p/go-libp2p/p2p/test/security
BenchmarkNoise/throughput/32KiB-8          15506             77886 ns/op         420.72 MB/s          32 B/op          2 allocs/op
BenchmarkNoise/throughput/1MiB-8             655           1874466 ns/op         559.40 MB/s         565 B/op         34 allocs/op
BenchmarkNoise/handshakes-8                 2043            556362 ns/op           32663 B/op        348 allocs/op
BenchmarkTLS/throughput/32KiB-8            78087             14571 ns/op        2248.79 MB/s          49 B/op          2 allocs/op
BenchmarkTLS/throughput/1MiB-8              2554            462939 ns/op        2265.04 MB/s        1581 B/op         64 allocs/op
BenchmarkTLS/handshakes-8                   2187            554662 ns/op           98050 B/op       1408 allocs/op
PASS
ok      github.com/libp2p/go-libp2p/p2p/test/security   8.877s

Noise_XX_DH25519_AESGCM_BLAKE2b

% go test -benchmem -run="^$" -bench . ./p2p/test/security
goos: darwin
goarch: arm64
pkg: github.com/libp2p/go-libp2p/p2p/test/security
BenchmarkNoise/throughput/32KiB-8          75744             15124 ns/op        2166.63 MB/s          32 B/op          2 allocs/op
BenchmarkNoise/throughput/1MiB-8            2602            455661 ns/op        2301.22 MB/s         650 B/op         34 allocs/op
BenchmarkNoise/handshakes-8                 2157            564877 ns/op           62759 B/op        406 allocs/op
BenchmarkTLS/throughput/32KiB-8            82539             14516 ns/op        2257.42 MB/s          49 B/op          2 allocs/op
BenchmarkTLS/throughput/1MiB-8              2528            461344 ns/op        2272.87 MB/s        1581 B/op         64 allocs/op
BenchmarkTLS/handshakes-8                   2161            533336 ns/op           98000 B/op       1407 allocs/op
PASS
ok      github.com/libp2p/go-libp2p/p2p/test/security   8.803s

@Jorropo
Copy link
Contributor Author

Jorropo commented Mar 19, 2024

Libp2p's noise require ChaChaPoly sha256, that would require a new spec and break compat which is bad.
We also don't have a way to negociate multiple noise suites without round trips, TLS sends all the supported suites at once and figure it out.
The go tls implementation is smart, it will dynamically prefer AES over ChaCha (or not) depending on what the CPU supports.

The cost of encapsulation seems trivial over the cost of the cipher, your Noise_XX_DH25519_AESGCM_BLAKE2b benchmark is within benchmark of TLS.

Even if TLS was too slow, crypto/tls isn't very well optimized and could be replaced.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants