feat(mdns): allow setting an initial delay before sending queries to discover peers. #3323

stormshield-damiend · 2023-01-13T14:07:31Z

Description

feat(mdns): allow setting an initial delay before sending queries to discover peers (#3319)

Notes

In a similar way to identify behaviour, manage an initial delay before the first mdns query is sent on an interface.

In a docker environment, the first mdns query can be lost if the interface is not ready yet. As the interval query is by default rather large (5 minutes), using an initial delay before the first mdns query is sent would improve the speed of peers discovery.

Links to any relevant issues

Fixes #3319

Open Questions

I have kept the original behaviour, the initial delay is set by default to 0, perhaps this could be set to another default value like for example 500ms as done for identify.

Change checklist

I have performed a self-review of my own code
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
A changelog entry has been made in the appropriate crates

discover peers.

thomaseizinger

I'd appreciate if we don't add another configuration option for this. I consider #3319 to be a bug that we should fix.

This PR only adds a configuration that in its default value does not fix the bug.

Have you measured, what such an initial delay would have to be set to to be effective? I'd be happy to merge a PR that adds a fixed 500ms delay like identify as long as we don't expose it as a config option.

stormshield-damiend · 2023-01-16T09:01:19Z

On our test platform, 500ms does the job (this is the value we use) but this highly depend on the way your dockers are connected and init (in our case we manage the network connections between our dockers by hands, thus it take some time to become ready).

Identify has 500ms as default value and a method on Config to configure the initial_delay.

I can modify the PR to hardcode the 500ms and remove the configuration part, or i can also use 500ms as default value (to fix the issue for some case) and have the configuration token available to fix other cases where 500ms is not enough.

What do you prefer ?

thomaseizinger · 2023-01-16T10:35:22Z

It would be good to get @mxinden's opinion. My personal preference would be to not expose another configuration option. I also consider the initial delay in identify a hack if it was added for the same reason.

In an ideal world, I think we should fix the actual bug which IMO is because we unconditionally wait for the delay and don't care about no / failure responses.

stormshield-damiend · 2023-01-23T08:34:01Z

Sure, lets wait for mxinden's opinion

stormshield-damiend · 2023-02-20T10:27:57Z

did you get any update from @mxinden ?

thomaseizinger · 2023-02-21T01:00:15Z

I vaguely remember that we briefly discussed briefly and I think we are on the same page. Our components should do the right thing in their default configuration in ideally every situation and not just when they are configured in a certain way.

Not increasing the public API but making the implementation more resilient to a corner case like this is a low-friction change that I am happy to merge without further input as it doesn't have any downsides.

Let me know if you want to work on this!

stormshield-damiend · 2023-02-21T08:27:36Z

To summarize, are you fine with

set the default initial delay to 500ms
keep the api to change it if needed (as done in identify)

If so i can re-work the PR and submit an updated version.
If you prefer only the 1. (500ms without the public API) i am also fine with this. Just tell me.

thomaseizinger · 2023-02-22T03:36:29Z

Uhm no, that is not quite what I meant. I don't think an initial delay is the appropriate solution. It doesn't address the actual problem: not having received a response for a request that we sent out.

mDNS is multicast though and based on UDP, so a response is not necessarily guaranteed. Additionally, we could also just be in a network where there are no other mDNS devices.

Thus I think an appropriate solution would be:

Instead of sending out requests with a fixed interval build a state machine that:

Sends out a request, start a short timer (10s or something)
If no response within 10s, send another request
Repeat the above Y times or until receive a response
In either case, move to a longer timeout once we received a response

Essentially, what this does is search for devices at an increased rate until we either find some or we consider there to be no devices.

This should be a lot more resilient to any form of problems in the network.

Let me know what you think of this solution.

mxinden · 2023-02-24T08:08:20Z

Agreed.

That sounds reasonable to me.

Maybe also worth checking out whether go-libp2p has a solution for this issue.

https://github.com/libp2p/go-libp2p/blob/master/p2p/discovery/mdns/mdns.go

stormshield-damiend · 2023-02-24T09:23:35Z

Go implementation uses zeroconf library which does this for the mdns client part :

max query interval is hard coded to 60s
initial query interval is hard coded to 4s
automata send queries until a response is received. it increases the periodic delay in case of no response or error using a random multiplier between 1.5 and 2.5, the delay being capped at max interval (60s)

Do you want we use the same values for the rust state automata ?

stormshield-damiend · 2023-02-24T09:26:02Z

Ok we (stormshield) will work on this new solution, you can drop this PR we will submit a new one with the updated code. It may take some time as we have other work to do internally.

mergify · 2023-05-01T02:28:38Z

This pull request has merge conflicts. Could you please resolve them @stormshield-damiend? 🙏

thomaseizinger · 2023-05-24T12:31:25Z

Obsoleted by #3975.

Peer discovery with mDNS can be very slow, particularly if the first mDNS query is lost. This patch resolves it by adjusting the timer with an adaptive initial interval. We start with a very short timer (500 ms). If a peer is discovered before the end of the timer, then the timer is reset to the normal query interval value (300s), otherwise the timer's value is multiplied by 2 until it reaches the normal query interval value. Related: #3323. Resolves: #3319. Pull-Request: #3975.

feat(mdns): allow setting an initial delay before sending queries to

abaca78

discover peers.

thomaseizinger requested changes Jan 16, 2023

View reviewed changes

stormshield-pj50 mentioned this pull request May 22, 2023

feat(mdns): adaptive initial interval peer discovery #3975

Merged

4 tasks

thomaseizinger closed this May 24, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(mdns): allow setting an initial delay before sending queries to discover peers. #3323

feat(mdns): allow setting an initial delay before sending queries to discover peers. #3323

stormshield-damiend commented Jan 13, 2023 •

edited

Loading

thomaseizinger left a comment

stormshield-damiend commented Jan 16, 2023

thomaseizinger commented Jan 16, 2023

stormshield-damiend commented Jan 23, 2023

stormshield-damiend commented Feb 20, 2023

thomaseizinger commented Feb 21, 2023

stormshield-damiend commented Feb 21, 2023

thomaseizinger commented Feb 22, 2023

mxinden commented Feb 24, 2023

stormshield-damiend commented Feb 24, 2023 •

edited

Loading

stormshield-damiend commented Feb 24, 2023

mergify bot commented May 1, 2023

thomaseizinger commented May 24, 2023

feat(mdns): allow setting an initial delay before sending queries to discover peers. #3323

feat(mdns): allow setting an initial delay before sending queries to discover peers. #3323

Conversation

stormshield-damiend commented Jan 13, 2023 • edited Loading

Description

Notes

Links to any relevant issues

Open Questions

Change checklist

thomaseizinger left a comment

Choose a reason for hiding this comment

stormshield-damiend commented Jan 16, 2023

thomaseizinger commented Jan 16, 2023

stormshield-damiend commented Jan 23, 2023

stormshield-damiend commented Feb 20, 2023

thomaseizinger commented Feb 21, 2023

stormshield-damiend commented Feb 21, 2023

thomaseizinger commented Feb 22, 2023

mxinden commented Feb 24, 2023

stormshield-damiend commented Feb 24, 2023 • edited Loading

stormshield-damiend commented Feb 24, 2023

mergify bot commented May 1, 2023

thomaseizinger commented May 24, 2023

stormshield-damiend commented Jan 13, 2023 •

edited

Loading

stormshield-damiend commented Feb 24, 2023 •

edited

Loading