Create a live sanity checker #14

mxgrey · 2024-08-03T12:45:21Z

Before proceeding, is there an existing issue or discussion for this?

I have done a search for similar issues and discussions.

Description

bevy_impulse is completely event-driven, which is good for efficiency, but it introduces a non-zero risk that a subtle, uncaught bug somewhere in the implementation (perhaps an extremely rare race condition) could cause execution of a workflow to get permanently stuck.

While we strive to not have any bugs at all in the implementation, the reality of the risk should not be ignored. We should implement a system that can periodically audit the ongoing workflows to verify that all activities are running as expected and that no workflows have come to an unexplained stop. If anything is found to be out of order, the affected workflow should be cancelled and the situation should be logged in UnhandledErrors in as much detail as possible.

Things to examine:

Can all ongoing scoped sessions reach their terminal nodes?
Are all buffers of finished sessions cleared out?
Are all inputs of finished sessions cleared out?
Are all finished impulses despawned?
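
To make the session-level checks above concrete, here is a minimal sketch of what an audit pass could look like. It runs against a hypothetical `SessionSnapshot` struct rather than real bevy_impulse internals: every type, field, and function name below is an assumption made for illustration, and the reachability test is a plain breadth-first search, which is likely simpler than what a real workflow needs. The impulse despawn check is left out because it depends on entity state that this sketch does not model.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical snapshot of one scoped session (NOT a bevy_impulse type).
#[derive(Debug, Default)]
pub struct SessionSnapshot {
    /// Whether the session has already delivered its final output.
    pub finished: bool,
    /// Items still held in buffers on behalf of this session.
    pub buffered_items: usize,
    /// Inputs still queued for this session.
    pub pending_inputs: usize,
    /// Nodes that currently hold work for this session.
    pub active_nodes: Vec<usize>,
    /// Directed edges of the workflow graph: node -> downstream nodes.
    pub edges: HashMap<usize, Vec<usize>>,
    /// The terminal node of the scope.
    pub terminal: usize,
}

/// Problems that the audit can report (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum Finding {
    TerminalUnreachable,
    BuffersNotCleared(usize),
    InputsNotCleared(usize),
}

/// Breadth-first search: is `goal` reachable from `start`?
fn can_reach(edges: &HashMap<usize, Vec<usize>>, start: usize, goal: usize) -> bool {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::from([start]);
    while let Some(node) = queue.pop_front() {
        if node == goal {
            return true;
        }
        if !seen.insert(node) {
            continue; // already visited
        }
        if let Some(next) = edges.get(&node) {
            queue.extend(next.iter().copied());
        }
    }
    false
}

/// Audit one session and return everything that looks out of order.
pub fn audit(session: &SessionSnapshot) -> Vec<Finding> {
    let mut findings = Vec::new();
    if session.finished {
        // A finished session should leave nothing behind.
        if session.buffered_items > 0 {
            findings.push(Finding::BuffersNotCleared(session.buffered_items));
        }
        if session.pending_inputs > 0 {
            findings.push(Finding::InputsNotCleared(session.pending_inputs));
        }
    } else {
        // An ongoing session needs at least one active node that can
        // still reach the terminal node; otherwise it is stuck.
        let reachable = session
            .active_nodes
            .iter()
            .any(|&n| can_reach(&session.edges, n, session.terminal));
        if !reachable {
            findings.push(Finding::TerminalUnreachable);
        }
    }
    findings
}
```

In a real implementation, a non-empty result would presumably trigger cancellation of the affected workflow and an entry in UnhandledErrors, as described above.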

This tool should be applied to every test in the library. We should also make it something that users can configure to run periodically to prevent problematic halts in deployments, although it should not be run very frequently.

mxgrey added this to PMC Board Aug 3, 2024

github-project-automation bot moved this to Inbox in PMC Board Aug 3, 2024
