fix(iroh-net): Fix some flaky magicsock tests #2034

flub · 2024-02-20T12:28:57Z

Description

Fix flaky tests in iroh-net (hopefully). The main issue was that the
test code meshing the sockets together had a race condition and
would sometimes fail.

Other improvements:

- Logging improvements of various parts. These tests should now
  provide better info on failure.
- Add alternative formatting of NodeId to use the short format.
- Use NodeId in the public API, rather than PublicKey.
- Move multi-threaded logging setup to the testing utils crate now
  that it can work properly. More code will have to switch.

Notes & open questions

The local_endpoints_change API is really difficult to use correctly.
Even the way this testing code uses it now is wrong. Instead of the
two current APIs there should be a single API which returns a stream.
I'll do this as a separate followup PR though.
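
A rough sketch of what a single "stream" style API could look like. This is not the planned iroh-net design: the PR does not say what the two current APIs are or why they are hard to combine. The sketch assumes the usual problem with a "get current value" call plus a separate "wait for change" call, which is that an update can slip in between the two. All names here (`Endpoints`, `watch`, `update`, `Endpoint`) are hypothetical, and a std `mpsc` channel stands in for an async stream to keep the example dependency-free.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical endpoint type; a plain socket address is assumed here.
type Endpoint = std::net::SocketAddr;

/// Hypothetical holder of the current endpoint set plus its subscribers.
#[derive(Default)]
struct Endpoints {
    inner: Mutex<Inner>,
}

#[derive(Default)]
struct Inner {
    current: Vec<Endpoint>,
    subscribers: Vec<Sender<Vec<Endpoint>>>,
}

impl Endpoints {
    /// Single entry point: the receiver yields the current set first
    /// (if one is known) and then every later update, in order.
    /// Both steps happen under one lock, so no update can be missed.
    fn watch(&self) -> Receiver<Vec<Endpoint>> {
        let (tx, rx) = channel();
        let mut inner = self.inner.lock().unwrap();
        if !inner.current.is_empty() {
            // Cannot fail: `rx` is still alive in this scope.
            let _ = tx.send(inner.current.clone());
        }
        inner.subscribers.push(tx);
        rx
    }

    /// Record a new endpoint set and notify all live subscribers.
    fn update(&self, eps: Vec<Endpoint>) {
        let mut inner = self.inner.lock().unwrap();
        inner.current = eps.clone();
        // Drop subscribers whose receiver has gone away.
        inner.subscribers.retain(|tx| tx.send(eps.clone()).is_ok());
    }
}
```

A caller would only ever loop over the receiver, so there is no separate "read the current value" step to get out of sync with the change notifications.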

Change checklist

- [x] Self-review.
- [x] Documentation updates if relevant.
- [x] Tests if relevant.

dignifiedquire · 2024-02-21T13:10:39Z

iroh-net/src/key.rs

@@ -227,7 +227,11 @@ impl Debug for PublicKey {

 impl Display for PublicKey {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
-        write!(f, "{}", base32::fmt(self.as_bytes()))
+        if f.alternate() {


when/how is this triggered?

When you use a # formatter: {:#} (or {:#?} for Debug)

shouldn't it be the other way around then? # prints longer, more detailed messages

I could be rather tempted to switch this around and do the short format by default and the long format for alternate. That would reduce rather a lot of verbosity in our code.

the issue is that, even externally, we rely on to_string() round-tripping

yeah, it'd probably break things if this can't be round-tripped. so maybe we'll have to settle for this.
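
For reference, a minimal sketch of the trade-off discussed in this thread. It is not the iroh-net code: `Key` is a hypothetical stand-in for `PublicKey`, hex replaces base32 to avoid a dependency, and the 5-byte short form and the `FromStr` impl are assumptions made for illustration. The point is that plain `{}` (and therefore `to_string()`) stays parseable, while `{:#}` gives a short, lossy form for logs.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for a 32-byte public key; not the iroh-net type.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Key([u8; 32]);

/// Hex-encode bytes (assumption: the real code uses base32 instead).
fn hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

impl fmt::Display for Key {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if f.alternate() {
            // `{:#}`: short, lossy form for logs (first 5 bytes, assumed length).
            write!(f, "{}", hex(&self.0[..5]))
        } else {
            // `{}` and `to_string()`: full form, must stay parseable.
            write!(f, "{}", hex(&self.0))
        }
    }
}

impl FromStr for Key {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Only the full 64-character hex form is accepted.
        if s.len() != 64 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
            return Err(format!("expected 64 hex characters, got {s:?}"));
        }
        let mut out = [0u8; 32];
        for (i, byte) in out.iter_mut().enumerate() {
            *byte = u8::from_str_radix(&s[2 * i..2 * i + 2], 16)
                .map_err(|e| e.to_string())?;
        }
        Ok(Key(out))
    }
}
```

With this layout `key.to_string().parse::<Key>()` returns the original key, while the output of `format!("{key:#}")` cannot be parsed back. Swapping the two branches would shorten log lines by default but break that round trip, which is the concern raised above.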

flub added 8 commits February 20, 2024 13:28

Facelift the magicsock roundtrip test

ae1448c

This works for me, but used to be flaky. The way it is set up now should give us useful logs in case it fails.

Merge branch 'main' into flub/test-magicsock-roundtrip

7515f45

move setup_multithreaded_logging to iroh test

176b9ee

this is somewhat more useful now that we use nextest

some logging tweaking, i love tweaking logging

175e468

fix: make mesh_stacks reliable

cd26ca3

This is likely the bug that made all of these tests flaky. Let's see.

clippy

45a1561

move awaiting meshed-stack readiness to mesh_stacks

f6d540f

get network changes test back as well

93d1b31

flub changed the title ~~Facelift the magicsock roundtrip test~~ fix(iroh-net): Fix flaky magicsock tests Feb 21, 2024

flub marked this pull request as ready for review February 21, 2024 12:37

flub requested a review from dignifiedquire February 21, 2024 12:37

dignifiedquire reviewed Feb 21, 2024

flub added 2 commits February 21, 2024 15:20

give me better debug information for when this test hangs

891a6e2

Let's mark this flaky again to get the other fixes in

2f5e8f0

dignifiedquire approved these changes Feb 21, 2024

flub changed the title ~~fix(iroh-net): Fix flaky magicsock tests~~ fix(iroh-net): Fix some flaky magicsock tests Feb 21, 2024

flub enabled auto-merge February 21, 2024 15:43

flub added this pull request to the merge queue Feb 21, 2024

Merged via the queue into main with commit df57623 Feb 21, 2024
21 checks passed

flub deleted the flub/test-magicsock-roundtrip branch February 21, 2024 15:58

flub mentioned this pull request Feb 29, 2024

Flaky test: test_two_devices_roundtrip_quinn_magic #1966

Closed
