A wasm virtual machine will be used for validity predicate (VP) and transaction code.
The VM should provide:
- an interface for compiling from higher-level languages to wasm (initially only Rust)
- a wasm compiler, unless we use an interpreted runtime
- injectable environments for higher-level languages for VPs and transactions
- pre-process wasm modules
  - check & sanitize modules
  - inject gas metering
  - inject stack height metering
- a runner for VPs and transaction code
- encode/decode wasm for transfer & storage
- manage runtime memory
- wasm development helpers
- helpers to estimate gas usage
- VM and environment versioning
Resources:
- WebAssembly Specifications
- wasmer examples
- The WebAssembly Binary Toolkit: a bunch of useful wasm tools (e.g. wasm2wat to convert from wasm binary to the human-readable wat format)
- Rust wasm WG and wasm book (some sections are JS specific)
- A practical guide to WebAssembly memory (modulo JS-specific details)
- Learn X in Y minutes Where X=WebAssembly
The wasm environment will most likely be libraries that provide APIs for the wasm modules.
The environment APIs common to VPs and transactions:
- math & crypto
- logging
- panics/aborts
- gas metering
- storage read-only API
- context API (chain metadata such as block height)
The accounts sub-space storage is described under accounts' dynamic storage sub-space.
Because VPs are stateless, everything that is exposed in the VPs environment should be read-only:
- storage API to the account sub-space and the storage write log
- transaction API
Transactions, in contrast, can modify state:
- storage write access for all public state via the storage write log
Some exceptions as to what can be written are given under transaction execution.
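The split between the common read-only environment and the write access given to transactions can be sketched with two Rust traits. This is only an illustration: the names CommonEnv, TxEnv, MockEnv and example_vp are hypothetical, and the mock writes straight into a map, whereas the design above routes writes through the storage write log.

```rust
use std::collections::HashMap;

/// Read-only API shared by VPs and transactions (hypothetical names).
pub trait CommonEnv {
    /// Read a value from storage; `None` if the key is absent.
    fn read(&self, key: &str) -> Option<Vec<u8>>;
    /// Context API: chain metadata such as the block height.
    fn block_height(&self) -> u64;
}

/// Transactions additionally get write access.
pub trait TxEnv: CommonEnv {
    fn write(&mut self, key: &str, value: Vec<u8>);
    fn delete(&mut self, key: &str);
}

/// In-memory stand-in for the host side, for illustration only.
/// It mutates a map directly instead of going through a write log.
#[derive(Default)]
pub struct MockEnv {
    pub storage: HashMap<String, Vec<u8>>,
    pub height: u64,
}

impl CommonEnv for MockEnv {
    fn read(&self, key: &str) -> Option<Vec<u8>> {
        self.storage.get(key).cloned()
    }

    fn block_height(&self) -> u64 {
        self.height
    }
}

impl TxEnv for MockEnv {
    fn write(&mut self, key: &str, value: Vec<u8>) {
        self.storage.insert(key.to_owned(), value);
    }

    fn delete(&mut self, key: &str) {
        self.storage.remove(key);
    }
}

/// A VP only receives a shared reference to a `CommonEnv`,
/// so it has no way to call `write` or `delete`.
pub fn example_vp(env: &impl CommonEnv, key: &str) -> bool {
    env.read(key).is_some()
}
```

With this shape the read-only requirement for VPs is enforced by the type system: code that only has `&impl CommonEnv` cannot mutate state.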
Wasm memory allows data to be shared bi-directionally between the host (Rust shell) and the guest (wasm) through a wasm linear memory instance.
Because wasm currently only supports basic types, we need to choose how to represent more sophisticated data in memory.
The options for passing data through the memory are:
- using "C" structures (probably too invasive because everything in memory would have to use C repr)
- (de)serializing the data with some encoding (JSON, binary, ...?)
- currently very unstable: WebIDL / Interface Types / Reference types
The choice should allow for easy usage in wasm for users (e.g. in Rust a bindgen macro on data structures, similar to wasm_bindgen used for JS <-> wasm).
Related wasmer issue.
We're currently using borsh for storage serialization, which is also a good option for wasm memory:
- it's easy for users (can be derived)
- because borsh encoding is safe and consistent, the encoded bytes can also be used for Merkle tree hashing
- good performance, although it's not clear at this point whether the difference would be negligible anyway
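To make the "safe and consistent" property concrete, here is a hand-written encoder and decoder that follows the layout the borsh specification describes: integers as little-endian bytes, strings as a u32 length prefix followed by UTF-8 bytes, struct fields in declaration order. It is a minimal sketch that does not use the borsh crate; the Transfer type is hypothetical. With the crate, the derive macros would replace these hand-written methods.

```rust
use std::convert::TryInto;

/// Hypothetical payload, used only for illustration.
#[derive(Debug, PartialEq)]
pub struct Transfer {
    pub amount: u64,
    pub target: String,
}

impl Transfer {
    /// Encode the fields in declaration order.
    pub fn encode(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(8 + 4 + self.target.len());
        // u64 as 8 little-endian bytes
        out.extend_from_slice(&self.amount.to_le_bytes());
        // string as u32 length prefix + raw UTF-8 bytes
        out.extend_from_slice(&(self.target.len() as u32).to_le_bytes());
        out.extend_from_slice(self.target.as_bytes());
        out
    }

    /// Decode, returning `None` on truncated, oversized or non-UTF-8 input.
    pub fn decode(bytes: &[u8]) -> Option<Transfer> {
        let amount = u64::from_le_bytes(bytes.get(0..8)?.try_into().ok()?);
        let len = u32::from_le_bytes(bytes.get(8..12)?.try_into().ok()?) as usize;
        let end = 12usize.checked_add(len)?;
        let target = String::from_utf8(bytes.get(12..end)?.to_vec()).ok()?;
        // Reject trailing bytes so that every value has exactly one encoding.
        if bytes.len() != end {
            return None;
        }
        Some(Transfer { amount, target })
    }
}
```

Because each value maps to exactly one byte string, the same bytes can be hashed for the Merkle tree without a separate canonicalization step.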
The data being passed between the host and the guest in the order of the execution:
- For transactions:
  - host-to-guest: pass tx.data to tx.code call
  - guest-to-host: parameters of environment function calls, including storage modifications (pending the storage API)
  - host-to-guest: return results for host calls
- For validity predicates:
  - host-to-guest: pass tx.data, prior and posterior account storage sub-space state and/or storage modifications (i.e. a write log) for the account
  - guest-to-host: parameters of environment function calls
  - host-to-guest: return results for host calls
  - guest-to-host: the VP result (bool) can be passed directly from the call
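The host-to-guest step can be modelled without a wasm runtime. In the sketch below a byte vector stands in for the linear memory instance: the host copies serialized bytes into it and hands the guest only a pointer and a length, since those are basic types wasm can pass directly. LinearMemory and validate_tx are hypothetical names; a real runtime such as wasmer exposes its own memory type.

```rust
use std::convert::TryFrom;

/// Stand-in for a wasm linear memory instance (hypothetical).
pub struct LinearMemory {
    bytes: Vec<u8>,
}

impl LinearMemory {
    pub fn new(size: usize) -> Self {
        LinearMemory { bytes: vec![0; size] }
    }

    /// Host side: copy `data` into memory at `ptr`.
    /// Returns the (ptr, len) pair to pass to the guest, or `None`
    /// if the data does not fit.
    pub fn write(&mut self, ptr: u32, data: &[u8]) -> Option<(u32, u32)> {
        let len = u32::try_from(data.len()).ok()?;
        let start = ptr as usize;
        let end = start.checked_add(data.len())?;
        self.bytes.get_mut(start..end)?.copy_from_slice(data);
        Some((ptr, len))
    }

    /// Guest side: view `len` bytes starting at `ptr`, bounds-checked.
    pub fn read(&self, ptr: u32, len: u32) -> Option<&[u8]> {
        let start = ptr as usize;
        let end = start.checked_add(len as usize)?;
        self.bytes.get(start..end)
    }
}

/// Guest-side VP entry point in this model: it receives only integers
/// and returns the result (bool) directly from the call.
/// The check itself is a placeholder: accept any non-empty tx data.
pub fn validate_tx(memory: &LinearMemory, data_ptr: u32, data_len: u32) -> bool {
    match memory.read(data_ptr, data_len) {
        Some(data) => !data.is_empty(),
        None => false,
    }
}
```

Both directions use the same bounds-checked access, so an out-of-range pointer from either side results in `None` rather than a read outside the memory.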
The storage write log gathers any storage updates (writes/deletes) performed by transactions. For each transaction, the write log changes must be accepted by all the validity predicates that were triggered by these changes.
A validity predicate can read its prior state directly from storage, as the transaction does not change storage directly. For the posterior state, we first look up the key in the write log to find a new value if the key has been modified or deleted. If the key is not present in the write log, the value has not changed and we can read it from storage.
The write log of each transaction included in a block and accepted by VPs is accumulated into the block write log. Once the block is committed, we apply the storage changes from the block write log to the persistent storage.
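The lookup and commit rules above can be sketched as follows. The type and method names (StorageModification, WriteLog, read_post, commit_tx, commit_block) are assumptions made for this example, and storage is modelled as a plain map. For brevity, reads ignore changes from earlier transactions in the same block that are still in the block write log.

```rust
use std::collections::HashMap;

/// A single pending change to a storage key.
#[derive(Clone, Debug, PartialEq)]
pub enum StorageModification {
    Write(Vec<u8>),
    Delete,
}

/// Persistent storage, modelled as a plain map for this sketch.
pub type Storage = HashMap<String, Vec<u8>>;

#[derive(Default)]
pub struct WriteLog {
    /// Changes made by the transaction currently being executed.
    tx_log: HashMap<String, StorageModification>,
    /// Changes of all transactions accepted so far in this block.
    block_log: HashMap<String, StorageModification>,
}

impl WriteLog {
    pub fn write(&mut self, key: &str, value: Vec<u8>) {
        self.tx_log
            .insert(key.to_owned(), StorageModification::Write(value));
    }

    pub fn delete(&mut self, key: &str) {
        self.tx_log.insert(key.to_owned(), StorageModification::Delete);
    }

    /// Posterior state: look in the write log first, fall back to storage.
    pub fn read_post(&self, storage: &Storage, key: &str) -> Option<Vec<u8>> {
        match self.tx_log.get(key) {
            Some(StorageModification::Write(value)) => Some(value.clone()),
            Some(StorageModification::Delete) => None,
            None => storage.get(key).cloned(),
        }
    }

    /// All triggered VPs accepted: move the tx changes into the block log.
    pub fn commit_tx(&mut self) {
        let changes = std::mem::take(&mut self.tx_log);
        self.block_log.extend(changes);
    }

    /// Some VP rejected: discard the tx changes.
    pub fn drop_tx(&mut self) {
        self.tx_log.clear();
    }

    /// Block committed: apply the accumulated changes to storage.
    pub fn commit_block(&mut self, storage: &mut Storage) {
        for (key, change) in self.block_log.drain() {
            match change {
                StorageModification::Write(value) => {
                    storage.insert(key, value);
                }
                StorageModification::Delete => {
                    storage.remove(&key);
                }
            }
        }
    }
}
```

The prior state is read with a plain `storage.get(key)`, which stays valid until `commit_block` runs.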
The two main options for implementing gas metering within wasm using wasmer are:
- wasmer gas middleware
- pwasm-utils
Both of these allow us to assign a gas cost for each wasm operation.
The wasmer gas middleware is more recent, so probably riskier. It injects the gas metering code into the wasm code, which is more efficient than host calls to a gas meter.
pwasm-utils divides the wasm code into metered blocks. It performs a host call with the gas cost of each block before the block is executed. The gas metering injection is linear in the code size.
pwasm-utils seems like a safer option to begin with (and we'll probably need to use it for stack height metering too). We can look into switching to the wasmer middleware at a later point.
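The metered-block approach can be illustrated with a small model: each block's cost is summed once at injection time, and a single call to the host gas meter is made before the block runs. This is a sketch of the idea only, not the pwasm-utils API; GasMeter, MeteredBlock and run are hypothetical names, and the per-operation costs are arbitrary.

```rust
#[derive(Debug, PartialEq)]
pub struct OutOfGas;

/// Host-side gas meter.
pub struct GasMeter {
    limit: u64,
    used: u64,
}

impl GasMeter {
    pub fn new(limit: u64) -> Self {
        GasMeter { limit, used: 0 }
    }

    /// The host function that injected code calls at the start of a block.
    /// The charge is rejected as a whole if it would exceed the limit.
    pub fn charge(&mut self, cost: u64) -> Result<(), OutOfGas> {
        let total = self.used.checked_add(cost).ok_or(OutOfGas)?;
        if total > self.limit {
            return Err(OutOfGas);
        }
        self.used = total;
        Ok(())
    }

    pub fn used(&self) -> u64 {
        self.used
    }
}

/// A straight-line sequence of operations, each with an assigned gas cost.
pub struct MeteredBlock {
    pub op_costs: Vec<u64>,
}

impl MeteredBlock {
    /// Total cost of the block, computed once when metering is injected.
    pub fn cost(&self) -> u64 {
        self.op_costs.iter().sum()
    }
}

/// Execute blocks in order, charging each block before it runs.
pub fn run(blocks: &[MeteredBlock], meter: &mut GasMeter) -> Result<(), OutOfGas> {
    for block in blocks {
        meter.charge(block.cost())?;
        // ... the block's instructions would execute here ...
    }
    Ok(())
}
```

Charging per block rather than per instruction keeps the number of host calls proportional to the number of blocks executed.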
For safety, we need to limit the stack height in wasm code. Similarly to gas metering, the options are the wasmer middleware or pwasm-utils.
For now, we have to use pwasm-utils, because wasmer's stack limiter is currently non-deterministic (platform specific). This is to be fixed in this PR: wasmerio/wasmer#1037.
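One way to make a stack limit deterministic is to count in logical units instead of native stack bytes: a counter is raised by the callee's frame cost before each call, the call is refused if the counter would pass the limit, and the counter is lowered again on return. The sketch below models that idea in plain Rust. It is an assumption about the general instrumentation approach, not a reproduction of the pwasm-utils implementation; StackLimiter and recurse are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
pub struct StackOverflow;

/// Tracks the logical stack height in platform-independent units.
pub struct StackLimiter {
    limit: u32,
    height: u32,
}

impl StackLimiter {
    pub fn new(limit: u32) -> Self {
        StackLimiter { limit, height: 0 }
    }

    pub fn height(&self) -> u32 {
        self.height
    }

    /// Wrap a call: raise the height by `frame_cost` before the callee
    /// runs, refuse the call if the limit would be exceeded, and lower
    /// the height again once the callee returns.
    pub fn call<T>(
        &mut self,
        frame_cost: u32,
        f: impl FnOnce(&mut StackLimiter) -> Result<T, StackOverflow>,
    ) -> Result<T, StackOverflow> {
        let new_height = self.height.checked_add(frame_cost).ok_or(StackOverflow)?;
        if new_height > self.limit {
            return Err(StackOverflow);
        }
        self.height = new_height;
        let result = f(&mut *self);
        self.height -= frame_cost;
        result
    }
}

/// A recursive "guest" function in this model; each level costs 1 unit.
/// Returns the depth reached, or an error if the limit is hit.
pub fn recurse(limiter: &mut StackLimiter, depth: u32) -> Result<u32, StackOverflow> {
    if depth == 0 {
        return Ok(0);
    }
    limiter.call(1, |l| recurse(l, depth - 1)).map(|d| d + 1)
}
```

Because the limit is expressed in counted units, the same module fails at the same call depth on every platform.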