Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipfs name resolve --stream #5404

Merged
merged 15 commits into from
Oct 18, 2018
Merged

ipfs name resolve --stream #5404

merged 15 commits into from
Oct 18, 2018

Conversation

magik6k
Copy link
Member

@magik6k magik6k commented Aug 27, 2018

Implements #5232

TODO:

  • Remove sync namesys code
  • Fix broken stuff
  • Wire up the command
  • Test the command

@magik6k magik6k requested a review from Kubuxu as a code owner August 27, 2018 23:10
@ghost ghost assigned magik6k Aug 27, 2018
@ghost ghost added the status/in-progress In progress label Aug 27, 2018
@magik6k magik6k force-pushed the feat/namestream branch 2 times, most recently from d25104d to 7485fe7 Compare August 28, 2018 14:31
return p, ttl, nil
func (r *IpnsResolver) searchValue(ctx context.Context, key string, opts ...ropts.Option) (<-chan []byte, error) {
if ir, ok := r.routing.(*dht.IpfsDHT); ok {
return ir.SearchValue(ctx, key, opts...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind:

  1. Fixing the interfaces.
  2. Taking over [Low prio] Add Watch() method to watch pubsub updates libp2p/go-libp2p-pubsub-router#3 (or start over), making it follow the new API.

Instead?

Copy link
Member Author

@magik6k magik6k Aug 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

| Taking over libp2p/go-libp2p-pubsub-router#3 (or start over), making it follow the new API.

I wanted to do that, but in a separate PR

Routing interfaces? Can do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, as-is, this'll fail if ipns-over-pubsub is enabled. We could possibly work around that but I'd rather not.

@magik6k magik6k mentioned this pull request Aug 30, 2018
4 tasks
@magik6k magik6k added the status/blocked Unable to be worked further until needs are met label Sep 24, 2018
@magik6k
Copy link
Member Author

magik6k commented Sep 24, 2018

Blocked on #5517

@magik6k magik6k removed the status/blocked Unable to be worked further until needs are met label Sep 24, 2018
@magik6k magik6k force-pushed the feat/namestream branch 4 times, most recently from 75175d4 to f030357 Compare September 24, 2018 14:20
@magik6k
Copy link
Member Author

magik6k commented Sep 25, 2018

Test fails will get fixed with libp2p/go-libp2p-routing-helpers#9

@magik6k magik6k added the status/blocked Unable to be worked further until needs are met label Sep 25, 2018
@magik6k
Copy link
Member Author

magik6k commented Sep 26, 2018

So there is a slight problem with SearchValue interface(doc)/implementations:

https://github.com/libp2p/go-libp2p-routing/blob/master/routing.go#L55

Implementations may halt the search after a period of time or may continue searching indefinitely.

Now that this PR changes namesys to use SearchValue under the hood this makes ipfs name resolve hang indefinitely when IPNS-Pubsub is enabled.

I think that we should change this to:
Implementations MUST stop the search as soon as a good value is found. A good value is a value which would be returned by GetValue.
It would make things more consistent, and if we ever want to get the 'run indefinitely' thing, we can add a routing option for that easily.

@Stebalien does that make sense?

@Stebalien
Copy link
Member

Implementations MUST stop the search as soon as a good value is found. A good value is a value which would be returned by GetValue.

My primary concern is that "good" isn't really well defined. We try to get some kind of consensus in the DHT but that only sort of works. We don't even try in pubsub, we just accept the first valid answer.

My thinking here was that users would be able to call ipfs name search QmId...., reading results of until they're happy. However, the issue with that is "happy". Callers really have no way to decide what a "good" value is either.

Really, I think I was conflating search and subscribe (which we should also have but that's a different story). You're probably right, the search should end at some point.

Questions:

  • What should pubsub do? Just do what we do for get?
  • What should we do when one search "ends"? Is there any way to signal "we got a good value" so we can stop searching other routers?

@magik6k
Copy link
Member Author

magik6k commented Sep 26, 2018

Really, I think I was conflating search and subscribe (which we should also have but that's a different story)

We can add a routing option for SearchValue which would make it into subscription as it's logic would be for the most part the same

What should pubsub do? Just do what we do for get?

Yep, I guess

What should we do when one search "ends"? Is there any way to signal "we got a good value" so we can stop searching other routers?

In case of routing-helpers.Parallel, If we query n routers and one of them closes its channel after sending results, we should close the output channel and cancel the other sub-requests (this is basically what happens in the GetValue case, we pick the first one).

The funny case here is when we get some value from router A, then some better value from router B, and then router A closes it's channel. I think we should still close the output channel and return the better value, but this feels a bit weird.

@Stebalien
Copy link
Member

In case of routing-helpers.Parallel, If we query n routers and one of them closes its channel after sending results, we should close the output channel and cancel the other sub-requests (this is basically what happens in the GetValue case, we pick the first one).

Yeah... I guess that's as-good or better than GetValue.

The funny case here is when we get some value from router A, then some better value from router B, and then router A closes it's channel. I think we should still close the output channel and return the better value, but this feels a bit weird.

This is weird because it indicates that A was wrong. That is, it returned a sub-optimal answer. A "smart" system would distrust A and then wait for B to finish. However, that may be too unpredictable. Worse, we could hang forever if the "winner" was pubsub.

I'd just go with your gut. We can experiment later.

We can add a routing option for SearchValue which would make it into subscription as it's logic would be for the most part the same

Sounds reasonable. We can even use options for termination strategy (we can add them as needed). One (not on by default) can be "continue forever".

License: MIT
Signed-off-by: Łukasz Magiera <[email protected]>
License: MIT
Signed-off-by: Łukasz Magiera <[email protected]>
License: MIT
Signed-off-by: Łukasz Magiera <[email protected]>
License: MIT
Signed-off-by: Łukasz Magiera <[email protected]>
License: MIT
Signed-off-by: Łukasz Magiera <[email protected]>
v, err := m.Resolve(ctx, name, opts...)
out <- namesys.Result{Path: v, Err: err}
close(out)
return nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You never return the out stream.

Copy link
Contributor

@hannahhoward hannahhoward left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM

I am curious why you changed several options parameters and return objects from pointers to structs.

I similarly do not feel qualified to comment on harder overall questions about the domain and approach for this PR, but it makes sense and seems good.

@magik6k magik6k requested a review from Kubuxu October 16, 2018 09:43
@magik6k
Copy link
Member Author

magik6k commented Oct 16, 2018

I am curious why you changed several options parameters and return objects from pointers to structs.

There isn't really a good reason to keep these as a pointer after applying options, this also potentially allows us to avoid allocations (though we realistically wouldn't feel this). CoreAPI is still using pointers for options currently, but is should use structs too.

@magik6k magik6k force-pushed the feat/namestream branch 3 times, most recently from b3cfb35 to fa3e9b9 Compare October 16, 2018 10:41

// Attach rest of the path
if len(segments) > 3 {
p, _ = path.FromSegments("", strings.TrimRight(p.String(), "/"), segments[3])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this also apply if res.err != nil? Shouldn't we have a if res.err != nil { out <- onceResult{err: res.err } above?

p, _ = path.FromSegments("", strings.TrimRight(p.String(), "/"), segments[3])
}

out <- onceResult{value: p, err: res.err}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably keep the ttl here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also select on the context.


resCh := res.resolveOnceAsync(ctx, key, options)
var best onceResult
go func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're repeating this logic quite a bit. Could we replace it with an andThen function? That is:

type mapFn func(ctx context.Context, value path.Path, ttl time.Duration) (path.Path, ttl.Duration, error)
func andThen(ctx context.Context, ch <-chan onceResult, cb mapFn) <-chan onceResult {
    // reads off `ch`, passes errors directly to the returned channel, applies the callback to
    // everything else.
}

This removes all the channel reading/writing/context logic. It may not help so don't spend too much time on it (or completely ignore this comment) but I get the feeling this'll DRY up quite a bit of the logic in this patch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic doesn't really de-duplicate that nicely, I did wrap channel sending though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, yeah, we probably need generics.

case res, ok := <-resCh:
if !ok {
if best != (onceResult{}) {
ns.cacheSet(key, best.value, best.ttl)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably just call this each time. That way, parallel requests never return old cached values. That should also make this simpler if we do go the andThen route.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make name resolve return outdated entries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the routers are returning records older than the cache? That's a bit annoying...

core/coreapi/name.go Show resolved Hide resolved
namesys/base.go Outdated
cancelSub()
}
subCtx, cancelSub = context.WithCancel(ctx)
defer cancelSub()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defering in a loop is generally a bad idea (will allocate for each). Let's just make sure to call cancelSub. Maybe a defer func() { if cancelSub != nil { cancelSub() } at the top, or something like that.

namesys/base.go Outdated
}
log.Debugf("resolved %s to %s", name, res.value.String())
if strings.HasPrefix(res.value.String(), "/ipfs/") {
outCh <- Result{Path: res.value}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Select on context.

namesys/base.go Outdated
}

if depth == 1 {
outCh <- Result{Path: res.value, Err: ErrResolveRecursion}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Select on context.

namesys/base.go Outdated
return
}
log.Debugf("resolved %s to %s", name, res.value.String())
if strings.HasPrefix(res.value.String(), "/ipfs/") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably carry the !prefix change from #5602 as it has removed https://github.com/ipfs/go-ipfs/pull/5404/files#diff-f43001ea38f40375075f7bccccfd9ab2L38.

Copy link
Contributor

@djdv djdv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor suggestion left over; other things were discussed and changed IRL

// Search is a version of Resolve which outputs paths as they are discovered,
// reducing the time to first entry
//
// Note that by default only the last path returned before the channel closes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pedantic / suggestion
Note: by default, all paths read from the channel are considered unsafe, except the latest (last path in channel read buffer).

@@ -7,12 +7,13 @@ import (
"strings"
"time"

ipath "gx/ipfs/QmdrpbDgeYH3VxkCciQCJY5LkDYdXtig6unDzQmMxFtWEw/go-path"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

License: MIT
Signed-off-by: Łukasz Magiera <[email protected]>
Copy link
Contributor

@schomatis schomatis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it also helps you learn about this subsystem.

@Stebalien I'm not sure I agree with that, reviewing a 600-line-diff of a PR with very few comments seems to have a cost-benefit ratio that makes it very inefficient to learn about any particular subsystem. I also fear that a thumbs-up on a PR that I don't really understand can generate more noise than signal.

That being said, I do agree that we need to help you with the review process to offload part of that work and free some of your time so you can dedicate it to other tasks.

The code seems good although I don't understand the general design.

License: MIT
Signed-off-by: Łukasz Magiera <[email protected]>
License: MIT
Signed-off-by: Łukasz Magiera <[email protected]>
License: MIT
Signed-off-by: Łukasz Magiera <[email protected]>
License: MIT
Signed-off-by: Łukasz Magiera <[email protected]>
Copy link
Contributor

@kevina kevina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits.

// Resolve attempts to resolve the newest version of the specified name and
// returns its path.
func (api *NameAPI) Resolve(ctx context.Context, name string, opts ...caopts.NameResolveOption) (coreiface.Path, error) {
func (api *NameAPI) Search(ctx context.Context, name string, opts ...caopts.NameResolveOption) (<-chan coreiface.IpnsResult, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document this function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are docs for this on the interface (like all other coreapi functions). We may want to point people there, but I'd this in a separate PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just do this now. Separate PRs for documentation tend to not happen.

@@ -28,8 +28,8 @@ test_expect_success 'check namesys pubsub state' '

# These commands are *expected* to fail. We haven't published anything yet.
test_expect_success 'subscribe nodes to the publisher topic' '
ipfsi 1 name resolve /ipns/$PEERID_0;
ipfsi 2 name resolve /ipns/$PEERID_0;
ipfsi 1 name resolve /ipns/$PEERID_0 --timeout=1s;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the timeout now needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We expect these to fail (and subscribe to the name topic), now that resolving is based on SearchValue, pubsub resolving blocks if it doesn't have any good results. This timeout is needed so that this test doesn't block for 2 minutes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite unfortunate. In the past, resolve used to mean "go until we stop actively searching". Now, it means "go until we stop passively searching".

On the other hand, I doubt this'll make a difference in practice. On a real network, we never get negative results from the DHT, we just get timeouts.

@kevina
Copy link
Contributor

kevina commented Oct 16, 2018

@Stebalien does this still need a review from me? It can likely go over it, but it will take me several hours as I am not that familiar with that code path.

@Stebalien
Copy link
Member

@schomatis

I'm not sure I agree with that, reviewing a 600-line-diff of a PR with very few comments seems to have a cost-benefit ratio that makes it very inefficient to learn about any particular subsystem. I also fear that a thumbs-up on a PR that I don't really understand can generate more noise than signal.

"needs comments", "what the hell is this doing", etc. are all reasonable first-pass reviews.

Is there a better way to learn a subsystem? At the end of the day, you can never really trust code comments anyways (although I agree we should have more). Simply reading a subsystem's code is a good start but reviewing changes to a subsystem forces you to really understand how it works (it forces you to understand how those changes might affect the rest of the subsystem).

namesys/base.go Show resolved Hide resolved
@@ -28,8 +28,8 @@ test_expect_success 'check namesys pubsub state' '

# These commands are *expected* to fail. We haven't published anything yet.
test_expect_success 'subscribe nodes to the publisher topic' '
ipfsi 1 name resolve /ipns/$PEERID_0;
ipfsi 2 name resolve /ipns/$PEERID_0;
ipfsi 1 name resolve /ipns/$PEERID_0 --timeout=1s;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite unfortunate. In the past, resolve used to mean "go until we stop actively searching". Now, it means "go until we stop passively searching".

On the other hand, I doubt this'll make a difference in practice. On a real network, we never get negative results from the DHT, we just get timeouts.

License: MIT
Signed-off-by: Łukasz Magiera <[email protected]>
Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's just merge this before we get any more conflicts. @magik6k can you quickly make a follow-up PR to handle the documentation requests?

@Stebalien Stebalien merged commit 7de7928 into master Oct 18, 2018
@ghost ghost removed the status/in-progress In progress label Oct 18, 2018
@Stebalien Stebalien deleted the feat/namestream branch October 18, 2018 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants