kad: improve FIND_NODE response definition #609
Probably some dumb questions. Wouldn't it make sense for the requestor to filter responses rather than peers not providing accurate responses? If I'm asking for the closest N nodes to a key, a node adding itself to the response (if it is one of the closest), or adding the requestor, is more truthful than not. Is find-node only used when looking for peers to add provider records to, or are there other uses? How are the responses currently processed for all find-node calls? Why wouldn't the requestor filter responses as necessary? They are already keeping a set of queried nodes. Can we guarantee that a prior find-node call that returned the current requestee was already processed? |
I think that in most implementations the requestor is already filtering the response, which is good. Not sending these two peers is about saving space on the wire. A response including either one of these peers will only contain useful information about
|
First of all, I think it's indeed not spec-compliant to only return a single record when the spec says to always return the In my mental model, a peer is always in its own 255th bucket so it must be part of the response set. This means I'm siding with @SgtPooki here when he says:
However, I think the most important consideration is if any of the options (including the peers or not) has an impact on the correct operation of the protocol. I would argue, in terms of routing soundness it doesn't matter, right? If it does then we would have a clear winner. Some references I have found from git blame:
|
Apologies if this is off topic or my understanding is completely incorrect.
If the requester is in the server's routing table, we will tell the peer about itself because it is in the closest slice.
|
@sukunrt these issues are specific to go-libp2p-kad-dht.
It is possible that we are connected to a node, but it isn't in the routing table (e.g. because there are better candidates). Adding |
This suggestion in combination with libp2p/go-libp2p-kad-dht#820 makes it really hard to bootstrap a DHT from scratch. Since initially no nodes have any peers in their routing tables, the PR above means This is fine if you're joining an already established DHT but makes creating one almost impossible, unless I'm missing something. I think maybe you include your own PeerInfo in the response to a request for your own PeerId. Granted it's not the most efficient thing to do, but arguably querying a peer for its own PeerId doesn't make much sense, unless you're doing it as an effective no-op as |
Good point @achingbrain! I think that the I just opened a PR to address this libp2p/go-libp2p-kad-dht#970
An example where this query makes sense could be a reputation system where peers get a grade from other network participants. The grade would be stored on the |
Yes, my point was that it doesn't make sense from the point of view of a "find closer peers"-type query (one where you want to find out more information about the network, find out who to dial next, etc.), not that there was no reason to do it. There are plenty of reasons to do it, as we both point out. 😄 |
Context
The Kademlia spec doesn't explicitly define how a node `N` should answer to the request `FIND_NODE(N)`. The `FIND_NODE` RPC is expected to return the (`k`) closest nodes to the requested key.
Implementations have diverse behaviors and expectations, which can lead to poor interoperability (see libp2p/go-libp2p-kad-dht#966).
None of these implementations actually follows the spec, so they should be adapted to return the `k` closest peers.
Proposal
`k` known closest peers to the requested key.
`FIND_NODE(N)` to `N` already have `N`'s peer record. Even though the peer record sent by `N` could include additional multiaddresses that the requestor isn't aware of yet, this is the role of `Identify`, not the DHT.
`FIND_NODE` response should never include the requester among the closer peers.
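
As a rough illustration of the proposal, the Go sketch below picks up to `k` peers from a routing table ordered by XOR distance to the requested key, and skips the requester. It is not taken from go-libp2p-kad-dht or any other implementation: `closestPeers`, `xorDistance`, and the use of plain strings as peer IDs are hypothetical names and simplifications chosen for this example, and hashing IDs with SHA-256 before XOR-ing is an assumption about the keyspace. The routing table is assumed not to contain the responding node itself, so the sketch takes no position on whether a node should include its own record.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"sort"
)

// xorDistance returns the bytewise XOR of the SHA-256 digests of a and b.
// Assumption: IDs are mapped into the keyspace by hashing them with SHA-256.
func xorDistance(a, b string) []byte {
	ha, hb := sha256.Sum256([]byte(a)), sha256.Sum256([]byte(b))
	d := make([]byte, len(ha))
	for i := range ha {
		d[i] = ha[i] ^ hb[i]
	}
	return d
}

// closestPeers (hypothetical helper) returns up to k peers from table,
// sorted by increasing XOR distance to target, never including requester.
func closestPeers(table []string, target, requester string, k int) []string {
	candidates := make([]string, 0, len(table))
	for _, p := range table {
		if p != requester { // the requester already knows its own record
			candidates = append(candidates, p)
		}
	}
	// Big-endian byte comparison matches numeric ordering of the distances.
	sort.Slice(candidates, func(i, j int) bool {
		di := xorDistance(candidates[i], target)
		dj := xorDistance(candidates[j], target)
		return bytes.Compare(di, dj) < 0
	})
	if len(candidates) > k {
		candidates = candidates[:k]
	}
	return candidates
}

func main() {
	// Hypothetical routing table that happens to contain the requester.
	table := []string{"peerA", "peerB", "peerC", "requester"}
	fmt.Println(closestPeers(table, "peerB", "requester", 2))
}
```

Filtering on the responder side, as here, saves space on the wire; a requester can still apply the same filter to whatever it receives, since not every implementation behaves the same way.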