-
Notifications
You must be signed in to change notification settings - Fork 695
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature suggestion: strict "deterministic mode" #582
Comments
If a wasm producer knows it wants determinism, it is possible to insert extra code to canonicalize NaNs, to limit shared memory use to conservative patterns that can be proven safe, and to avoid depending on new features that aren't implemented everywhere. Are such things sufficient for your use cases, or are you looking for a mode where it's outright impossible to have nondeterminism? It isn't an accident that there isn't much in Nondeterminism.md :-). There will similarly be pressure on future features to resist nondeterminism as well. That said, most likely there will be some future features that will really want it, though we intend to add such features to Nondeterminism.md to help people be aware of them. |
Based on my experience with both implementing and evolving JavaScript, I would strongly suggest to avoid modes like the plague. |
I agree with @rossberg-chromium. What we have now is basically deterministic and any modes in the future will need to be strongly (pun intended at @rossberg-chromium :-)) motivated. |
yeah my usecase nondeterminism needs to be impossible within the wasm environment itself... so thing like resources exhaustion I'm not worried about, but I do need to deal with things like nans, approximation ops, ect @rossberg-chromium i got the idea of "modes" from reading the Subnormal section but i'm not to attached to the idea... as long as I can squeeze out all nondeterminism by transforming the ast :) |
It looks like there's one source of non-determinism that's hard to avoid right now: call stack exhaustion. Apart from that it should be possible to use a deterministic subset of WebAssembly without any distinct "deterministic mode". For call stack exhaustion, it might be sufficient to guarantee that the call stack can contain at least N calls of |
The current approach would basically put the "mode" in the toolchain, instead of in the wasm VM. e.g. if / when we add "nondeterministic" opcodes then it'll be LLVM that uses them, V8 will just see a new opcode. @rossberg-chromium is that acceptable to you? |
@AndrewScheidecker yep! i was just going to cap the call stack to something pretty low... like a depth of 1024. |
but being able to cap the stack someway within the wasm spec would be even better |
Which stack? The hidden shadow one which backs locals, or the user-controlled one that's in the heap? The later is entirely toolchain controlled, but the former is highly dependent on the wasm VM and ISA (more spills, bigger frames, bigger shadow stack). I think allowing users to limit the shadow stack size would lead to more non-determinism. |
I wasn't suggesting to provide a deterministic limit on the shadow stack, but to provide a deterministic minimum for the shadow stack, provided you don't use function parameters or return values. I wasn't thinking about locals, but of course it's possible to put those in a user-controlled stack on the heap along with parameters and return values. Then if an app is willing to sacrifice performance by not using parameters/locals/return-values, and can limit its own call stack depth, that would provide a subset of WebAssembly that has deterministic call stack exhaustion. EDIT: I was also not thinking about spills of intermediate values. Never mind, I guess this wouldn't work! |
We also can't guarantee that the device running a WebAssembly program isn't going to be powered off at any moment. Or a browser tab could be closed, or the OOM killer asynchronously takes down the process, or other things happen. It's not clear what a guaranteed minimum callstack would achieve. |
@jfbastien, yes sure, that's fine. Compilers can of course generate whatever code they see fit. @wanderer, as far as the Wasm language is concerned, there is no such thing as a stack. That's an implementation detail. And I agree with @sunfishcode that it's hard to provide any useful guarantees about a resource like that. |
@rossberg-chromium ok sounds good, lets not worry about stacks! l think there are 2 type of nondeteremism external things like resource exhaustion and VMs being shutdown ; and internal nondeteremism things like NaN always coming out the same, ect. stack limits would fall under external nondeteremism. So i guess right now I just want to focus on internal nondeteremism... with the threat model of untrusted code trying to create a nondeterministic event in the VM.
@sunfishcode if you had malicious code that was trying to nondeteremism wouldn't it be able to anticipate this? |
Can you describe your setting in more detail? Any intetesting program will utilize APIs and bring in outside information anyway, so it's not immediately obvious what the threat model is. Also, is this in a Web context? Can the malicious code also execute arbitrary JS code? Would a tool to preprocess or scan code before running it be feasible to consider? |
sure @sunfishcode
No, this is in the context of a distributed computation. the consensus is provided by something like paxos or raft. Validators run the same code in the VM and they all need to come to consensus on the result... so if someone was malicious they would try to break consensus by introducing nondetermistic code.
Yep Definitely!, code here is not ephemeral, the validators would run the same piece of code many times, so having a one time preprocess is fine |
For NaN bits, it'd be straightforward for a preprocessor to inject NaN canonicalization code after every floating-point computation, and that would suffice. For threads with shared memory, it's not quite so easy. I imagine you could design a specific idiom for using shared memory in a conservative and provably-safe way, and then validate that all code using shared array follows that idiom. Or if you don't need threads, the simplest thing would be to just disallow threads altogether. |
Alternatively, if you control the engine, you could give that engine a deterministic execution mode (e.g., for threads, it could use a single OS thread and green threading to switch contexts in a deterministic way). |
i'm happy with that
@lukewagner yeah I think that would work |
The main questions here is answered; please re-open or open a new issue if you have further questions! |
What do you think about a strict deterministic mode? The point would be to disable any feature that was non-deterministic or define deterministic behavior for each possible non-deterministic feature.
Looking at Nondeterminism.md it doesn't seem there is much... but as features are added I think there will be more opportunity for nondeterminism to creep in.... and there are several usecases that require absolute determinism.
related disscussions
The text was updated successfully, but these errors were encountered: