This repository has been archived by the owner on Jun 20, 2023. It is now read-only.

Add timeout to Provide call #8

Merged: 5 commits merged on Jun 30, 2019

Conversation

michaelavila
Contributor

@michaelavila michaelavila commented Jun 12, 2019

This just adds the ability to set a timeout that is used by go-bitswap when doing a Provide. Partially addresses #7.

This also makes the worker count configurable.

@michaelavila
Contributor Author

@postables, @hsanjuan do you mind taking a look and reviewing this PR?

@michaelavila michaelavila force-pushed the fix/add-timeout-to-provide branch from 00c56a1 to c2e647a on June 12, 2019 18:14
@hsanjuan hsanjuan self-requested a review June 12, 2019 19:18
Contributor

@hsanjuan hsanjuan left a comment

LGTM, but I would prefer this to be configurable. 3 minutes is a bit arbitrary (given that it's a timeout for a content routing implementation and we don't know what Provide involves in it).

I also don't know if setting a timeout here might have unwanted side effects. If it's just working around a DHT bug, what happens when that is fixed? Will 3 minutes still be an OK value? Or will disabling the timeout altogether be the right approach because the DHT will time out by itself? What I mean is that it should probably default to "no timeout" (current behaviour) and optionally be configurable to time out.

In short, probably exporting ProvideTimeout as a variable with a 0 default meaning "no timeout" is enough.

simple/provider.go (outdated)
@michaelavila michaelavila force-pushed the fix/add-timeout-to-provide branch from 915c96c to 94eb43f on June 12, 2019 19:44
@michaelavila michaelavila force-pushed the fix/add-timeout-to-provide branch from 94eb43f to c3bccce on June 12, 2019 19:46
@michaelavila
Contributor Author

@hsanjuan that's all fair and good feedback, thanks. I took that into consideration and made some changes. Let me know what you think.

go func() {
	for p.ctx.Err() == nil {
		select {
		case <-p.ctx.Done():
			return
		case c := <-p.queue.Dequeue():
			ctx, cancel := context.WithTimeout(p.ctx, p.timeout)
Contributor

Right now default timeout is 0 and the context will be cancelled immediately.

Contributor

I suggest changing the mockRouting in the tests to fail if the context is expired.

Contributor Author

Yeah, I'll add a few tests around timing out.

@michaelavila michaelavila force-pushed the fix/add-timeout-to-provide branch from 19db4ce to 5999bf8 on June 12, 2019 20:27
@michaelavila michaelavila force-pushed the fix/add-timeout-to-provide branch from 5999bf8 to 5f6a572 on June 12, 2019 20:29
@Stebalien
Member

I believe this will simply cause every single provide to time out. Unfortunately, a timed-out provide is almost always completely useless (i.e., doesn't put anything in the DHT).

@michaelavila michaelavila changed the title from "Add 3 minute timeout to Provide call" to "Add timeout to Provide call" Jun 12, 2019
@bonedaddy

I have done some testing of this PR. According to go.mod, the revision I'm using is v0.0.0-20190612202929-5f6a572aacdc.

Findings are recorded in the corresponding GitHub issue.

@michaelavila
Contributor Author

michaelavila commented Jun 13, 2019

@Stebalien I think you're right and we just have some unexpected behavior, though I also think the issue has been around for a long time, which is why the timeout exists for the same call in go-bitswap. My thought here is to apply this fix as a band-aid, simply because it works and folks are being impacted, while we spend some time trying to find the underlying problem. I worry that we are not going to find a solution quickly. Thoughts?

@Stebalien
Member

Discussed in the standup: apparently we ignore the context error in GetClosestPeers because we get it after we return a peer channel. Additionally, the underlying query will return both the closest peers and an error.

That means we'll be able to send provider records as long as none of these requests block for too long (and check the context).

As discussed, we should add a separate timeout (internally) on GetClosestPeers to work around this (for now).

Member

@Stebalien Stebalien left a comment

We now have libp2p/go-libp2p-kad-dht#351 so LGTM modulo the defer issue.

var cancel context.CancelFunc
if p.timeout > 0 {
	ctx, cancel = context.WithTimeout(p.ctx, p.timeout)
	defer cancel()
Member

Defer in a loop.

@Stebalien Stebalien self-requested a review June 30, 2019 08:30
@Stebalien
Member

I believe all the issues have now been fixed.

@Stebalien Stebalien merged commit 6346891 into master Jun 30, 2019
@Stebalien Stebalien deleted the fix/add-timeout-to-provide branch June 30, 2019 08:35
4 participants