Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature suggestion: strict "deterministic mode" #582

Closed
wanderer opened this issue Mar 2, 2016 · 19 comments
Closed

Feature suggestion: strict "deterministic mode" #582

wanderer opened this issue Mar 2, 2016 · 19 comments

Comments

@wanderer
Copy link
Contributor

wanderer commented Mar 2, 2016

What do you think about a strict deterministic mode? The point would be to disable any feature that was non-deterministic or define deterministic behavior for each possible non-deterministic feature.

Looking at Nondeterminism.md it doesn't seem there is much... but as features are added I think there will be more opportunity for nondeterminism to creep in.... and there are several usecases that require absolute determinism.

related disscussions

@sunfishcode
Copy link
Member

If a wasm producer knows it wants determinism, it is possible to insert extra code to canonicalize NaNs, to limit shared memory use to conservative patterns that can be proven safe, and to avoid depending on new features that aren't implemented everywhere. Are such things sufficient for your use cases, or are you looking for a mode where it's outright impossible to have nondeterminism?

It isn't an accident that there isn't much in Nondeterminism.md :-). There will similarly be pressure on future features to resist nondeterminism as well. That said, most likely there will be some future features that will really want it, though we intend to add such features to Nondeterminism.md to help people be aware of them.

@rossberg
Copy link
Member

rossberg commented Mar 2, 2016

Based on my experience with both implementing and evolving JavaScript, I would strongly suggest to avoid modes like the plague.

@titzer
Copy link

titzer commented Mar 2, 2016

I agree with @rossberg-chromium. What we have now is basically deterministic and any modes in the future will need to be strongly (pun intended at @rossberg-chromium :-)) motivated.

@wanderer
Copy link
Contributor Author

wanderer commented Mar 2, 2016

outright impossible to have nondeterminism

yeah my usecase nondeterminism needs to be impossible within the wasm environment itself... so thing like resources exhaustion I'm not worried about, but I do need to deal with things like nans, approximation ops, ect

@rossberg-chromium i got the idea of "modes" from reading the Subnormal section but i'm not to attached to the idea... as long as I can squeeze out all nondeterminism by transforming the ast :)

@AndrewScheidecker
Copy link

It looks like there's one source of non-determinism that's hard to avoid right now: call stack exhaustion. Apart from that it should be possible to use a deterministic subset of WebAssembly without any distinct "deterministic mode".

For call stack exhaustion, it might be sufficient to guarantee that the call stack can contain at least N calls of () -> () functions. An app that requires deterministic call stack exhaustion can put arguments and return values on a stack in instance memory, and ensure the call stack depth never exceeds N.

@jfbastien
Copy link
Member

The current approach would basically put the "mode" in the toolchain, instead of in the wasm VM. e.g. if / when we add "nondeterministic" opcodes then it'll be LLVM that uses them, V8 will just see a new opcode. @rossberg-chromium is that acceptable to you?

@wanderer
Copy link
Contributor Author

wanderer commented Mar 2, 2016

@AndrewScheidecker yep! i was just going to cap the call stack to something pretty low... like a depth of 1024.

@wanderer
Copy link
Contributor Author

wanderer commented Mar 2, 2016

but being able to cap the stack someway within the wasm spec would be even better

@jfbastien
Copy link
Member

Which stack? The hidden shadow one which backs locals, or the user-controlled one that's in the heap? The later is entirely toolchain controlled, but the former is highly dependent on the wasm VM and ISA (more spills, bigger frames, bigger shadow stack). I think allowing users to limit the shadow stack size would lead to more non-determinism.

@AndrewScheidecker
Copy link

I wasn't suggesting to provide a deterministic limit on the shadow stack, but to provide a deterministic minimum for the shadow stack, provided you don't use function parameters or return values. I wasn't thinking about locals, but of course it's possible to put those in a user-controlled stack on the heap along with parameters and return values.

Then if an app is willing to sacrifice performance by not using parameters/locals/return-values, and can limit its own call stack depth, that would provide a subset of WebAssembly that has deterministic call stack exhaustion.

EDIT: I was also not thinking about spills of intermediate values. Never mind, I guess this wouldn't work!

@sunfishcode
Copy link
Member

We also can't guarantee that the device running a WebAssembly program isn't going to be powered off at any moment. Or a browser tab could be closed, or the OOM killer asynchronously takes down the process, or other things happen. It's not clear what a guaranteed minimum callstack would achieve.

@rossberg
Copy link
Member

rossberg commented Mar 2, 2016

@jfbastien, yes sure, that's fine. Compilers can of course generate whatever code they see fit.

@wanderer, as far as the Wasm language is concerned, there is no such thing as a stack. That's an implementation detail. And I agree with @sunfishcode that it's hard to provide any useful guarantees about a resource like that.

@wanderer
Copy link
Contributor Author

wanderer commented Mar 2, 2016

@rossberg-chromium ok sounds good, lets not worry about stacks! l think there are 2 type of nondeteremism external things like resource exhaustion and VMs being shutdown ; and internal nondeteremism things like NaN always coming out the same, ect. stack limits would fall under external nondeteremism.

So i guess right now I just want to focus on internal nondeteremism... with the threat model of untrusted code trying to create a nondeterministic event in the VM.

it is possible to insert extra code to canonicalize NaNs

@sunfishcode if you had malicious code that was trying to nondeteremism wouldn't it be able to anticipate this?

@sunfishcode
Copy link
Member

Can you describe your setting in more detail? Any intetesting program will utilize APIs and bring in outside information anyway, so it's not immediately obvious what the threat model is.

Also, is this in a Web context? Can the malicious code also execute arbitrary JS code? Would a tool to preprocess or scan code before running it be feasible to consider?

@wanderer
Copy link
Contributor Author

wanderer commented Mar 2, 2016

sure @sunfishcode

Also, is this in a Web context?

No, this is in the context of a distributed computation. the consensus is provided by something like paxos or raft. Validators run the same code in the VM and they all need to come to consensus on the result... so if someone was malicious they would try to break consensus by introducing nondetermistic code.

Would a tool to preprocess or scan code before running it be feasible to consider?

Yep Definitely!, code here is not ephemeral, the validators would run the same piece of code many times, so having a one time preprocess is fine

@sunfishcode
Copy link
Member

For NaN bits, it'd be straightforward for a preprocessor to inject NaN canonicalization code after every floating-point computation, and that would suffice.

For threads with shared memory, it's not quite so easy. I imagine you could design a specific idiom for using shared memory in a conservative and provably-safe way, and then validate that all code using shared array follows that idiom. Or if you don't need threads, the simplest thing would be to just disallow threads altogether.

@lukewagner
Copy link
Member

Alternatively, if you control the engine, you could give that engine a deterministic execution mode (e.g., for threads, it could use a single OS thread and green threading to switch contexts in a deterministic way).

@wanderer
Copy link
Contributor Author

wanderer commented Mar 2, 2016

For NaN bits, it'd be straightforward for a preprocessor to inject NaN canonicalization code after every floating-point computation, and that would suffice.

i'm happy with that

if you control the engine, you could give that engine a deterministic execution mode

@lukewagner yeah I think that would work

@sunfishcode
Copy link
Member

The main questions here is answered; please re-open or open a new issue if you have further questions!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants