WebAssembly Standalone

Alon Zakai edited this page Dec 30, 2016 · 40 revisions

By default, emcc emits a combination of JS files and WebAssembly: the JS loads the WebAssembly, which contains the compiled code. There is also progress towards an option to emit standalone WebAssembly files, which this page describes.

Status: This is all a work in progress! Things will change and break.

Overview

This approach to creating a standalone WebAssembly module is based on Emscripten's side module concept. A side module makes sense here because it is a form of shared library and does not automatically link in system libraries: it is a self-contained compilation output.

Usage

Currently you need the incoming branches of emscripten. For binaryen, master is fine.

Build with

emcc [params for your input files, optimization level, etc.] -s WASM=1 -s SIDE_MODULE=1 -o target.wasm

To use a side module, see loadWebAssemblyModule in src/runtime.js for some example loading code. An explanation of how that works is next.

Details

  • A shared library is a wasm file with a "dylink" user section, which must be the first section. The section contains two fields: an unsigned LEB for the space needed for the module's memory segments, then another for the space needed for the table segments. The loader needs these sizes because it must make room for the memory segments and table segments in the memory and table before it loads the module, and the sizes may be larger than the module's segments (e.g. if the module wants extra room to manage a stack).
  • The module should import env.memoryBase and use that as where to place memory segments, and env.tableBase for table segments. The JS loader ensures there is room there.
  • env.memory is the imported memory, env.table the imported table.
  • If the module has any code it needs to run for initialization (relocations, global constructors, etc.) then it can export a function named __post_instantiate. The loader calls that after creating the module. (Note: we can't use the WebAssembly start method due to reentrancy issues.)
  • Exporting functions is straightforward: the functions themselves are simply exported, and the loader records each one under a symbol name based on the export name.
  • Exported globals must be relocated by the loader. The module cannot add the relocation itself, because it exports the values immediately, before it can run any code. For example, if global foo is at relative address 8 in the module's memory, then the module should export 8. The loader, which passed in the relocation offset (as memoryBase), then adds it to that value, giving the final absolute address.
  • Note that there is no special handling of a C stack. A module can have one internally if it wants one (it needs to ask for the memory for it, then handle it however it wants).
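
The two pieces of loader-side arithmetic described above (reading the two LEB fields of the "dylink" section, and relocating exported globals) can be sketched as follows. This is a minimal illustration, not the code in src/runtime.js: the names readUnsignedLEB, parseDylinkPayload and relocateExports are hypothetical, and the sketch assumes that exported globals reach the loader as plain numbers and that the caller has already extracted the section's payload bytes.

```javascript
// Decode one unsigned LEB128 value from `bytes`, starting at `offset`.
// Returns the value and the index of the first byte after it.
function readUnsignedLEB(bytes, offset) {
  let value = 0;
  let scale = 1; // 2^(7 * number of bytes consumed so far)
  let pos = offset;
  for (;;) {
    if (pos >= bytes.length) throw new Error('truncated LEB128 value');
    const byte = bytes[pos++];
    // Multiply instead of shifting so large values do not wrap at 32 bits.
    value += (byte & 0x7f) * scale;
    if ((byte & 0x80) === 0) return { value, next: pos };
    scale *= 128;
  }
}

// Read the two fields of the "dylink" payload in the order given above:
// space for memory segments first, then space for table segments.
// (Hypothetical helper; `bytes` is the section payload only.)
function parseDylinkPayload(bytes) {
  const memory = readUnsignedLEB(bytes, 0);
  const table = readUnsignedLEB(bytes, memory.next);
  return { memorySize: memory.value, tableSize: table.value };
}

// Turn a module's exports into absolute symbols. Functions pass through
// unchanged; numeric exports are treated as module-relative addresses and
// get memoryBase added. Assumption: exported globals are plain numbers.
function relocateExports(exports, memoryBase) {
  const symbols = {};
  for (const name of Object.keys(exports)) {
    const value = exports[name];
    symbols[name] = typeof value === 'number' ? value + memoryBase : value;
  }
  return symbols;
}
```

With the example from the list, a global foo exported as 8 from a module loaded at memoryBase 1024 ends up at absolute address 1032. In the current WebAssembly JavaScript API, WebAssembly.Module.customSections(module, 'dylink') returns the contents of matching sections as ArrayBuffers, which is one possible way to obtain the payload; whether that matches what the loader did at the time is not confirmed here.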

Example

If you build

#include <stdio.h>

int main() {
  printf("hello, world!\n");
  return 0;
}

with emcc -s WASM=1 -s SIDE_MODULE=1 -Os -g (-Os optimizes for size; -g keeps function names in the binary), then the wasm output is

(module
  (type $FUNCSIG$ii (func (param i32) (result i32)))
  (import "env" "gb" (global $gb$asm2wasm$import i32))
  (import "env" "_puts" (func $_puts (param i32) (result i32)))
  (import "env" "memory" (memory $0 256))
  (import "env" "table" (table 0 anyfunc))
  (import "env" "memoryBase" (global $memoryBase i32))
  (import "env" "tableBase" (global $tableBase i32))
  (data (get_global $memoryBase) "hello, world!")
  (global $gb (mut i32) (get_global $gb$asm2wasm$import))
  (export "__post_instantiate" (func $__post_instantiate))
  (export "_main" (func $_main))
  (export "runPostSets" (func $runPostSets))
  (func $_main (result i32)
    (drop
      (call $_puts
        (i32.add
          (get_global $gb)
          (i32.const 0)
        )
      )
    )
    (i32.const 0)
  )
  (func $runPostSets
    (nop)
  )
  (func $__post_instantiate
    (call $runPostSets)
  )
)

Notes:

  • puts from libc is not linked in. You must provide it as an import.
  • The module imports the memory and table, and the offsets for them. (gb is also imported due to internal details in fastcomp; we should probably remove it.)
  • The string "hello, world!" is in a data segment, which is written at the memoryBase offset.
  • The main method calls puts as expected. Note how it offsets the address of the string.
  • __post_instantiate is exported, but in this simple module it does nothing useful.
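
To make the notes concrete, here is a minimal sketch of the "env" imports this module asks for, written against the WebAssembly JavaScript API as it exists today. It is not the loader from src/runtime.js. The names makeEnv and runExample and the print callback are hypothetical, the base address 1024 is arbitrary, and setting gb equal to memoryBase is an inference from the listing (the string is stored at memoryBase and addressed as gb + 0).

```javascript
// Hypothetical helper: build the "env" object the example module imports.
function makeEnv(memoryBase, tableBase, print) {
  // 256 initial pages matches (import "env" "memory" (memory $0 256)).
  const memory = new WebAssembly.Memory({ initial: 256 });
  const table = new WebAssembly.Table({ initial: 0, element: 'anyfunc' });
  return {
    memory,
    table,
    memoryBase,
    tableBase,
    gb: memoryBase, // assumption: gb is the same base the data segment uses
    // Stand-in for libc puts: read a NUL-terminated ASCII string from memory.
    _puts(ptr) {
      const heap = new Uint8Array(memory.buffer);
      let text = '';
      for (let i = ptr; heap[i] !== 0; i++) text += String.fromCharCode(heap[i]);
      print(text);
      return 0; // puts returns a non-negative value on success
    },
  };
}

// Hypothetical driver: instantiate, run initialization, then call main.
// `wasmBytes` is the compiled module as an ArrayBuffer or typed array.
async function runExample(wasmBytes) {
  const env = makeEnv(1024, 0, (line) => console.log(line));
  const { instance } = await WebAssembly.instantiate(wasmBytes, { env });
  instance.exports.__post_instantiate();
  return instance.exports._main();
}
```

The order in runExample follows the description earlier on this page: the loader supplies memory, table and the two base offsets, instantiates the module, and only then calls __post_instantiate. A binary produced by the toolchain of that period may not be accepted by current engines, so treat runExample as an outline of the sequence rather than something guaranteed to load that exact file.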