Replies: 5 comments 28 replies
-
I am not sure GC is a particularly informative consideration. For example, it is pretty easy to make all the fiber instructions operate on a table of fibers without |
Beta Was this translation helpful? Give feedback.
-
I like the idea of making it a table. It's pretty easy for such non-GC producers to just track it with a free list. That does come with a caveat, though: there's not really any way to shrink tables after. |
Beta Was this translation helpful? Give feedback.
-
The proposal of using a wrapper to the underlying stack is exactly how OCaml implements suspended continuations, though the motivation for this choice is different. OCaml uses it to ensure "at most once" resumption of continuations. Allocation & GC are not the bottleneck in our use cases. Does the table of fibers approach somehow avoid the problem of unreachable cycles between suspended continuations? I assume that continuations are merely indices into the table? If that's the case, it would not be possible to know whether there are cycles, would it? In addition, some other mechanism would be needed to ensure the "at most once"/one-shot resumption property for continuations and avoid the analogue of "use-after-free" bug. One approach to avoid the "use-after-free" bug might be to represent continuations as a pair of an index into the table and a counter, the counter is incremented every time the continuation is resumed. Before resuming a continuation, one would need to check that the counter value in the continuation reference matches the current counter value at that index in the table. With the wrapper-based approach, "user-after-free" bug is avoided at the cost of an allocation per suspension. |
Beta Was this translation helpful? Give feedback.
-
Another idea came to mind. We already know that to support multithreaded work-stealing, we will (eventually) need a notion of "shareable" stacks that contain only "shareable" references. To be more precise, it's okay for an active stack to (temporarily) have non-shareable references, but they need to be guaranteed to be off the stack by the time any suspension occurs. There's various ways to achieve this (so I won't go into illustrating one), but it occurs to me that non-wasmgc engines could reuse a slight variant of the same device to require a suspended stack to have no references. That is, such an active stack could temporarily use references (e.g. fetching a fiberref from a table in order to switch to it) so long as they're not stored on the stack (e.g. not stored in a local) at any point a suspension could occur. This would prevent cycles while not requiring all stack instructions to operate on a table. It would also be helpful for even WasmGC engines, as it would allow them to skip scanning such stacks except during "deep" phases cleaning up instance objects and code objects. |
Beta Was this translation helpful? Give feedback.
-
I feel the overhead concerns are a little bit overblown regarding GC-style cycle collection for stacks. Realistically speaking, you're more likely to see hundreds at most for embedded runtimes, and at that scale, a mark and sweep algorithm is actually realistic. If they pack the mark/sweep bits separately, even 1000 of those is literally just 250 bytes (1 bit in-use + 1 bit allocated) of overhead compared to 4000 bytes just for the saved stack pointers alone. I don't see this scale presenting very many perf problems personally. For larger embedded systems, I expect there to be enough headroom that most runtimes would be willing to support WasmGC anyways, so I doubt it'd be too much of an issue, but even 10k (also an exceptionally high number) isn't too bad to enumerate IMHO. |
Beta Was this translation helpful? Give feedback.
-
A few other folks and I spent some time refining our understanding of the design space here after the in-person CG meeting, so I wanted to share a summary of what we worked through. Additional ideas and observations welcome!
Avoiding cycle detection in engines requires garbage in generator loops
To make it impossible to create unreachable cycles between suspended continuations, continuation references can be internally represented as pointers to single-use wrapper objects containing pointers to the underlying stack resources. The pointers in the wrapper objects are cleared whenever they are used to resume a continuation and a new wrapper object is allocated every time a continuation suspends. Setting up an unreachable cycle between continuations would involve resuming one of them to pass it a reference to the other, but that same resumption would break the cycle by clearing the reference that the other continuation holds. Allocating and freeing these wrapper objects may be a significant cost for uses such as tight generator loops.
Garbage-free generator loops require cycle detection in engines
To avoid generating garbage in tight generator loops, the internal representation of continuation references must contain pointers to the underlying stack resources rather than to single-use wrapper objects. These pointers remaining valid across multiple resumes makes it possible to create unreachable cycles between resumable continuations that simple reference counting would be unable to free.
Beta Was this translation helpful? Give feedback.
All reactions