Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

k-bucket support for proper kad bootstrapping #38

Merged

Conversation

aarshkshah1992
Copy link
Contributor

@aarshkshah1992 aarshkshah1992 commented Aug 1, 2019

@aarshkshah1992
Copy link
Contributor Author

@Stebalien This is the k-bucket work for correct dht bootstrapping. Please review.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other random comment: these functions should have comments describing what they do and any requirements (e.g., the slice returned by GetAllBuckets must not be modified or appended to).

func (rt *RoutingTable) GetAllBuckets() []*Bucket {
rt.tabLock.RLock()
defer rt.tabLock.RUnlock()
return rt.Buckets
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment explaining why this is safe (it's append-only).

if CommonPrefixLen(ConvertPeerID(peerID), rt.local) == targetCpl {
return peerID, err
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. This is very unfortunate. Given our network size, we should have at least 16 buckets). That's 2**16 hashes to find a peer in that last bucket.

I don't know how to fix this on our current DHT. Given how we've defined the protocol, we have to know the un-hashed key.

Honestly, I think the best solution is to pre-compute a table with a few known-good bootstrap keys in each of the first 20 buckets. If you can write a program to do this (ideally highly parallel), I can run it on a really beefy machine for a few days.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably missing context here, but a couple comments:

  1. Why do we need a public non-test function for computing a random peer ID?
  2. Even if we needed to generate random peer IDs we don't need to hash anything since the randPeerID function does not return the material that gets hashed into the PeerID. As a result, if I have n buckets and want bucket i that has 20 matching prefix bits with me all I should need to do is randomly generate a single number from 0 to 2^(n-20) and then just prepend the matching 20 bits. So very little computation (one random number generation) should be required.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aschmahmann this is for libp2p/go-libp2p-kad-dht#375 (I've updated the PR description).

2

Due to how our DHT works, the we need to send the unhashed key with the request. That means we need to know the unhashed key. Kademlia as described by the paper assumes that all keys will be pre-hashed. If we wanted to do that, we'd have to pre-hash the keys when we store records in the datastore (e.g., peer routing information).

We could modify the DHT to send both the unhashed and hash keys but that seems like a lot of extra work just to support bootstrapping.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Stebalien thx for the PR pointer. Fair enough about not wanting to do much changing to the dht message types if we don't have to. However, I'm not sure the solution of pre-generating keys helps. We can pre-generate keys for a single peer, but generating them for all peers is going to need require packaging a ton of keys (because the bucket distance is computed based on your local ID).

In any event it still doesn't seem like we need the hash function inside of randPeerID, just the one in ConvertPeerID.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah, now I feel like an idiot. This isn't going to work (we'd need to ship ~1m peer IDs).

In any event it still doesn't seem like we need the hash function inside of randPeerID, just the one in ConvertPeerID.

The issue is that we need to make a DHT request with this key. To make the DHT request, we need to know the non-hashed key.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out this didn't need a beefy machine and we really don't need 1m peer IDs. I've created a branch with a table mapping 20 bit hash prefixes to numbers that can be used to generate peer IDs that hash to these 20 bit hash prefixes: https://github.com/libp2p/go-libp2p-kbucket/tree/feat/bucket-prefix-gen

More concretely, to find a peer ID in a bucket N, we'll have to:

  1. Generate a random 20 bit number.
  2. Hash our peer ID (sha256)
  3. Replace the first N bits with the first N bits from our hashed peer ID.
  4. Lookup this random number in the table in the table to get SEED.
  5. Encode SEED into a 32 byte slice using BigEndian ordering.
  6. Use this slice as the hash digest in a sha256 multihash.
  7. This multihash is the peer ID.

When this peer ID is hashed, the first N bits should match the result of step (2).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downside is that this increases our binary size by 4MiB. But I'm not that concerned.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reduced it to 16 bits (16 buckets) to reduce the size to 256KiB.

Copy link
Contributor Author

@aarshkshah1992 aarshkshah1992 Aug 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Stebalien This was a really cool idea ! Have rebased on top of your branch & made all the changes. Let me know what you think.

PS: Made some minor changes to generate/main.go(line 31 & 58) & generated a new prefix-map so we can use it exactly as you've mentioned in the algorithm above.

Basically, instead of crafting the multi-hash like this :

[]byte{mh.Sha256, 32, 24 zero byte slices, 8 bytes of an Uint64}

We craft it as:

[]byte{mh.Sha256, 32, 8 bytes of an Uint64, 24 zero byte slices}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, apologies for the delay in getting this up. Things were a bit hectic at work.

table.go Outdated
if _, err := io.ReadFull(r, buf); err != nil {
return "", err
}
h, _ := mh.Sum(buf, mh.SHA2_256, -1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From above:

In any event it still doesn't seem like we need the hash function inside of randPeerID, just the one in ConvertPeerID.

The issue is that we need to make a DHT request with this key. To make the DHT request, we need to know the non-hashed key.

Sorry for not making this clearer. We still have to hash to turn peerID into a kad appropriate hash. However, in randPeerID there's no reason we need to hash random bytes when we could just craft a multihash with some random bytes in it (e.g. mh.Encode(buf, mh.ID) might work well and buf only needs to be 8 bytes long since 2^8=256)

Copy link
Contributor Author

@aarshkshah1992 aarshkshah1992 Aug 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aschmahmann Good point ! We've done something similar now.

@Stebalien has come up with a mapping m such that

Let k be a random uint64.
Let crmh(k) = A crafted sha2-256 multihash where the hash digest is k.
Then, m[first sixteen bits of sha2-256(crmh(k))] = k.

I've rebased on top of his branch and changed the code accordingly. Do let me know what you think !

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me, clever that we can avoid sending some bits on the wire by just using a bunch of storage (proportional to global network size).

@aarshkshah1992 aarshkshah1992 force-pushed the feature/correct-kad-bootstrapping branch from b1d55d1 to 776df76 Compare August 11, 2019 10:58
table.go Outdated
log.Debugf("failed to generate random peerID in bucket %d, error is %+v", bucketID, err)
continue
// generate random 16 bits
r := rand.New(rand.NewSource(time.Now().UnixNano()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just single call to rand after rand.New can be biased. The rand should probably be in *RoutingTable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kubuxu That sounds interesting. Even if I seed it with the current time in nanos ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, pseudo-random algorithms provide pseudo-random numbers if you use them for some time. There can be an initial bias if you use only the one number from the prng. It depends on the implementation of the prng itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless we want to be able to manipulate the random number (i.e. for testing) why don't we just use rand's built in global? Are we concerned about contention on that lock?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the built-in global.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the long delay. It's been a bit busy here.

bucket.go Outdated Show resolved Hide resolved
bucket.go Outdated
@@ -108,3 +127,5 @@ func (b *Bucket) Split(cpl int, target ID) *Bucket {
}
return newbuck
}

//go:generate go run ./generate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(should probably put this at the top, actually.

generate/main.go Outdated
@@ -55,7 +56,7 @@ func main() {

printf("package %s\n\n", pkg)
printf("// Code generated by generate/generate_map.go DO NOT EDIT\n")
printf("var keyPrefixMap = [...]uint32{")
printf("var keyPrefixMap = [...]uint64{")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint32 should be sufficient. Using uint64 will double the size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Stebalien Makes sense. Have also changed the map generation logic to use uint32 instead of uint64 to craft the hash digest.

table.go Outdated
log.Debugf("failed to generate random peerID in bucket %d, error is %+v", bucketID, err)
continue
// generate random 16 bits
r := rand.New(rand.NewSource(time.Now().UnixNano()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the built-in global.

generate/main.go Outdated
hasher := sha256.New()

for i := uint64(0); count < target; i++ {
binary.BigEndian.PutUint64(inp[2:], i)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why we had to do this. Putting this at the end should have worked just as well (but either way).

Copy link
Contributor Author

@aarshkshah1992 aarshkshah1992 Aug 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think of it, you are right. This will work just as well if I put the integer in the same position while crafting the multihash.

Just seemed liked a good idea at that time 😄

table.go Outdated
}

// GenRandPeerID generates a random peerID in bucket=bucketID
func (rt *RoutingTable) GenRandPeerID(bucketID int) (peer.ID, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, let's just panic if the user passes a negative bucket. That's a program error, not a runtime error.

Copy link
Contributor Author

@aarshkshah1992 aarshkshah1992 Aug 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Stebalien Thanks for the beautiful bit magic. Just had to make some changes to it. Have explained them in this comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also remove the error from the signature.

}

// test generate rand peer ID
for bucketID := 0; bucketID < nBuckets; bucketID++ {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's test this with with more buckets (40?) to check for overflow issues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Stebalien Generating 40 buckets takes a lot of time as a new bucket is created ONLY IF the last bucket is full. This slows down the tests. So testing with 21 for now as it's still an overflow & hits all the code-paths.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be slow... We shouldn't have to generate 40 buckets, just ask for peer IDs in buckets 1-40 (where everything after 16 would just be random IDs.

But this should be good enough.

Copy link
Contributor Author

@aarshkshah1992 aarshkshah1992 Aug 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Stebalien

Ah, I think I misunderstood your initial comment. So you meant simply passing in a bucketID of upto 40 to the GenRandPeerID method & not actually having 40 buckets. That makes more sense. Apologies.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. But really, anything over 16 is plenty.

Fix bootstrapping id generation logic
@aarshkshah1992
Copy link
Contributor Author

@Stebalien Have addressed your comments & made the changes. Please take a look. Thanks !

@aarshkshah1992 aarshkshah1992 force-pushed the feature/correct-kad-bootstrapping branch from 2e4fbe5 to 71292a3 Compare August 24, 2019 08:58
@aarshkshah1992 aarshkshah1992 force-pushed the feature/correct-kad-bootstrapping branch from 71292a3 to d5e5a48 Compare August 24, 2019 09:00
Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two small changes but otherwise LGTM. Thanks for catching the bit toggling issue!

table.go Outdated
@@ -70,7 +70,7 @@ func (rt *RoutingTable) GetAllBuckets() []*Bucket {
// GenRandPeerID generates a random peerID in bucket=bucketID
func (rt *RoutingTable) GenRandPeerID(bucketID int) (peer.ID, error) {
if bucketID < 0 {
return "", errors.New("bucketID must be non-negative")
panic(errors.New(fmt.Sprintf("bucketID %d is not non-negative", bucketID)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: error.New isn't necessary.

table.go Outdated
}

// GenRandPeerID generates a random peerID in bucket=bucketID
func (rt *RoutingTable) GenRandPeerID(bucketID int) (peer.ID, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also remove the error from the signature.

@aarshkshah1992
Copy link
Contributor Author

@Stebalien Have made all the changes. I think PR can be merged now. Thanks !

@Stebalien Stebalien merged commit 8b77351 into libp2p:master Aug 29, 2019
@raulk raulk changed the title k-bucket support for peoper kad bootstrapping k-bucket support for proper kad bootstrapping Nov 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants