-
Notifications
You must be signed in to change notification settings - Fork 24
fix: give one minute timeouts to function calls instead of block retrievals #44
Changes from 1 commit
d0512ce
c2dfede
461c266
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -72,6 +72,10 @@ func (r *Resolver) ResolveToLastNode(ctx context.Context, fpath path.Path) (cid. | |
// create a selector to traverse and match all path segments | ||
pathSelector := pathAllSelector(p[:len(p)-1]) | ||
|
||
// create a new cancellable session | ||
ctx, cancel := context.WithTimeout(ctx, time.Minute) | ||
defer cancel() | ||
|
||
// resolve node before last path segment | ||
nodes, lastCid, depth, err := r.resolveNodes(ctx, c, pathSelector) | ||
if err != nil { | ||
|
@@ -132,6 +136,10 @@ func (r *Resolver) ResolvePath(ctx context.Context, fpath path.Path) (ipld.Node, | |
// create a selector to traverse all path segments but only match the last | ||
pathSelector := pathLeafSelector(p) | ||
|
||
// create a new cancellable session | ||
ctx, cancel := context.WithTimeout(ctx, time.Minute) | ||
defer cancel() | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is probably still going to exhibit roughly the same behaviour as before, because it's still cancelling by the time the method returns, and the method's return is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There don't seem to be too many users of this. There's go-ipfs' https://github.com/ipfs/go-ipfs/blob/92854db7aed4424fad117ceb4e13f64a80ff348b/fuse/readonly/readonly_unix.go#L70 From the way it's used there it doesn't seem to be expecting to return a HAMT. @hannahhoward can you confirm that a HAMT pointing to a HAMT won't cause issues with the above code? The function also seems to be used https://github.com/filecoin-project/lotus/blob/4fc78bf72037ada0cb3edd0f551172428ddbc9e8/node/impl/full/chain.go#L593, which exhibits the concerns you were indicating. It doesn't deal with UnixFS HAMTs, but ... that's not a great feeling. Should be fine if we drop the timeouts per #44 (comment) though. |
||
nodes, c, _, err := r.resolveNodes(ctx, c, pathSelector) | ||
if err != nil { | ||
return nil, nil, err | ||
|
@@ -172,6 +180,10 @@ func (r *Resolver) ResolvePathComponents(ctx context.Context, fpath path.Path) ( | |
// create a selector to traverse and match all path segments | ||
pathSelector := pathAllSelector(p) | ||
|
||
// create a new cancellable session | ||
ctx, cancel := context.WithTimeout(ctx, time.Minute) | ||
defer cancel() | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is probably still going to exhibit roughly the same behaviour as before, because it's still cancelling by the time the method returns, and the method's return is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure if |
||
nodes, _, _, err := r.resolveNodes(ctx, c, pathSelector) | ||
if err != nil { | ||
evt.Append(logging.LoggableMap{"error": err.Error()}) | ||
|
@@ -218,10 +230,6 @@ func (r *Resolver) ResolveLinks(ctx context.Context, ndd ipld.Node, names []stri | |
// Finds nodes matching the selector starting with a cid. Returns the matched nodes, the cid of the block containing | ||
// the last node, and the depth of the last node within its block (root is depth 0). | ||
func (r *Resolver) resolveNodes(ctx context.Context, c cid.Cid, sel ipld.Node) ([]ipld.Node, cid.Cid, int, error) { | ||
// create a new cancellable session | ||
ctx, cancel := context.WithTimeout(ctx, time.Minute) | ||
defer cancel() | ||
|
||
session := r.FetcherFactory.NewSession(ctx) | ||
|
||
// traverse selector | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense, because the method's return is
(cid.Cid, []string, error)
-- there's noNode
which might be a HAMT or Unixfs node or other ADL that people might ask things of later and be surprised if it says "i'm cancelled", so it's good.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also seems to be the one that's used by go-ipfs and caused the reported error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps for all the other ones aside from this we just remove the timeout.
@Stebalien you were thinking that leaving these timeouts in might make debugging changes here easier (#36 (comment)). Any objection to leaving this one in and removing the rest?
The change here means we have 1min to resolve the function instead of 1min per block, but that should be way more than enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't remember why that had anything to do with tracking down bugs... (might have been WRT keeping changes minimal?).
I'm generally against timeouts in places like this where there's no reasonable "we should have received a response by X time" so I'd rather remove all of the hard-coded timeouts like this eventually.