The majority of the overhead of interpreters, and of the wasmi interpreter in particular, is the overhead of instruction dispatch.
Therefore there are 3 main ways to improve the efficiency of an interpreter:
Improve the performance of the dispatch routines, i.e. reduce their overhead.
Reduce the number of executed instructions, e.g. by combining common instruction sequences into super instructions.
Help the CPU branch predictor to correctly predict the next branch. Instruction dispatch usually consists of at least one indirect branch, and we can help the CPU achieve better branch prediction by providing it with more information. For example, having only a single branch when using a single match statement for the dispatch routine is less efficient than having a branch per instruction (match arm), since the branch predictor can then take the position of the branch into account for its prediction. Some benchmarks indicate 50%-100% performance gains. (A sketch of such a dispatch loop follows this list.)
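For illustration, here is a minimal match-based dispatch loop in Rust. The instruction set, types, and function names are made up for this sketch and are not wasmi's actual bytecode or execution engine; the point is only to show the single indirect branch that every executed instruction funnels through.

```rust
/// A tiny, hypothetical instruction set (not wasmi's actual bytecode).
#[derive(Clone, Copy)]
enum Inst {
    Push(i64),
    Add,
    Return,
}

/// Classic switch/match-based dispatch loop.
fn execute(code: &[Inst]) -> Option<i64> {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0;
    loop {
        // Every executed instruction funnels through this single `match`,
        // i.e. a single indirect branch, which is hard for the CPU's
        // branch predictor and is the dispatch overhead discussed above.
        match code[pc] {
            Inst::Push(value) => stack.push(value),
            Inst::Add => {
                let rhs = stack.pop()?;
                let lhs = stack.pop()?;
                stack.push(lhs.wrapping_add(rhs));
            }
            Inst::Return => return stack.pop(),
        }
        pc += 1;
    }
}
```

Replicating the dispatch at the end of every handler, so that each instruction gets its own branch, gives the predictor per-instruction context, which is the effect described in the last point above.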
Work Items
Fuse common instruction sequences into super instructions for wasmi bytecode during Wasm module compilation.
We decided to not follow this route anymore. Instead we concentrate on getting the register machine approach working.
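For reference, a fusion pass of the kind proposed above could look roughly like the following peephole sketch. The `Inst` variants and function names are hypothetical and not wasmi's actual bytecode; the sketch only illustrates collapsing a frequent two-instruction sequence into a single super instruction at compile time.

```rust
/// Hypothetical bytecode with one fused super instruction.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Inst {
    LocalGet(u32),
    I32Add,
    /// Fused form of `LocalGet` immediately followed by `I32Add`.
    LocalGetI32Add(u32),
    // ... remaining instructions elided
}

/// Peephole pass that fuses the common `LocalGet; I32Add` sequence.
fn fuse_super_instructions(input: &[Inst]) -> Vec<Inst> {
    let mut output = Vec::with_capacity(input.len());
    let mut iter = input.iter().copied().peekable();
    while let Some(inst) = iter.next() {
        match (inst, iter.peek().copied()) {
            (Inst::LocalGet(local), Some(Inst::I32Add)) => {
                iter.next(); // consume the fused `I32Add`
                output.push(Inst::LocalGetI32Add(local));
            }
            _ => output.push(inst),
        }
    }
    output
}

fn main() {
    let code = [Inst::LocalGet(0), Inst::I32Add, Inst::I32Add];
    assert_eq!(
        fuse_super_instructions(&code),
        vec![Inst::LocalGetI32Add(0), Inst::I32Add],
    );
}
```

Each fused instruction saves one dispatch per execution, which is why this only pays off for sequences that occur frequently in real Wasm modules.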
LLVM is able to optimize switch-based dispatch into a form from which branch predictors benefit more, at the cost of increased binary size. To our despair, LLVM usually opts out of this. It might be possible to find ways to make LLVM optimize into that form from within Rust.
LLVM already supports guaranteed tail calls. As soon as Rust provides them too, we should definitely experiment with dispatch based on tail calls, similar to the Wasm3 interpreter.
As stated, both Rust and WebAssembly currently lack tail call support. We can reopen this issue or create a new one once this has changed.
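To make the tail-call idea more concrete, below is a rough sketch of Wasm3-style threaded dispatch, where every handler ends by calling the next handler through a function pointer stored alongside the instruction. All names and types are made up for this sketch, and since Rust has no guaranteed tail calls yet, the trailing calls are only turned into jumps if LLVM happens to apply sibling-call optimization, which is exactly the missing guarantee this issue is waiting for.

```rust
/// Handler signature: each handler receives the code, the current
/// program counter, and an accumulator, and returns the final result.
type Handler = fn(code: &[Op], pc: usize, acc: i64) -> i64;

/// Hypothetical "threaded" instruction: a handler plus an immediate.
#[derive(Clone, Copy)]
struct Op {
    handler: Handler,
    immediate: i64,
}

fn op_add(code: &[Op], pc: usize, acc: i64) -> i64 {
    let acc = acc.wrapping_add(code[pc].immediate);
    // Dispatch to the next handler in tail position. Without guaranteed
    // tail calls this is plain recursion and may grow the native stack.
    (code[pc + 1].handler)(code, pc + 1, acc)
}

fn op_halt(_code: &[Op], _pc: usize, acc: i64) -> i64 {
    acc
}

fn main() {
    let code = [
        Op { handler: op_add, immediate: 20 },
        Op { handler: op_add, immediate: 22 },
        Op { handler: op_halt, immediate: 0 },
    ];
    let result = (code[0].handler)(&code, 0, 0);
    assert_eq!(result, 42);
}
```

With language-level guaranteed tail calls, the same shape would be safe for arbitrarily long instruction streams because the native stack would not grow, and each handler would end in its own predictable indirect branch.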