-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose task queue diagnostics #302
Conversation
OK, so all things being equal and stable, with this I should be able to get a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems fine to me, 3 comments though:
- The reach-around for
toRequestId()
is unfortunate but I suppose impossible to deal with without exposing a larger surface area of those internal APIs or a major refactor. Maybe a warning that there's work to be done in the clarity of that API. - We don't have any duplicate ID checking at either collection or within
QueueDiagnostics()
(e.g. checking whether the ID exists in thematched*
maps). I know there's a bit scattered around elsewhere during actual execution, is that enough to put aside concerns about possible duplicates? PeerStats()
doesn't do any synchronization so isn't it theoretically possible to catch mismatching states between the requests and the worker queue due to timing differences in execution of the collection at those points? How much does that undermine the utility of thisQueueDiagnostics()
?
@rvagg I think you make some good points and I'm gonna finagle this a bit. |
@rvagg I think I've found a way to solve most of our problems. unless you object, i'm going to merge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌 will be interesting to see if this finds anything in production
feat: add WorkerTaskQueue#WaitForNoActiveTasks() for tests (#284) * feat: add WorkerTaskQueue#WaitForNoActiveTasks() for tests * fixup! feat: add WorkerTaskQueue#WaitForNoActiveTasks() for tests fix(responsemanager): fix flaky tests fix(responsemanager): make fix more global feat: add basic OT tracing for incoming requests Closes: #271 docs(tests): document tracing test helper utilities fix(test): increase 1s timeouts to 2s for slow CI (#289) * fix(test): increase 1s timeouts to 2s for slow CI * fixup! fix(test): increase 1s timeouts to 2s for slow CI testutil/chaintypes: simplify maintenance of codegen (#294) "go generate" now updates the generated code for us. The separate directory for a main package was unnecessary; a build-tag-ignored file is enough. Using gofmt on the resulting source is now unnecessary too, as upstream has been using go/format on its output for some time. Finally, re-generate the output source code, as the last time that was done we were on an older ipld-prime. ipldutil: use chooser APIs from dagpb and basicnode (#292) Saves us a bit of extra code, since they were added in summer. Also avoid making defaultVisitor a variable, which makes it clearer that it's never a nil func. While here, replace node/basic with node/basicnode, as the former has been deprecated in favor of the latter. Co-authored-by: Hannah Howard <[email protected]> fix: use sync.Cond to handle no-task blocking wait (#299) Ref: #284 Peer Stats function (#298) * feat(graphsync): add impl method for peer stats add method that gets current request states by request ID for a given peer * fix(requestmanager): fix tested method Add a bit of logging (#301) * chore(responsemanager): add a bit of logging * fix(responsemanager): remove code change chore: short-circuit unnecessary message processing Expose task queue diagnostics (#302) * feat(impl): expose task queue diagnostics * refactor(peerstate): put peerstate in its own module * refactor(peerstate): make diagnostics return array
Goals
Provide information about the Graphsync task queues in terms of what tasks are active and pending for a given peer
Implementation