Skip to content

Commit

Permalink
Merge pull request #10 from dhil/wasmfx-merge
Browse files Browse the repository at this point in the history
Merge with WebAssembly/gc
  • Loading branch information
dhil authored Oct 20, 2023
2 parents 4f8d8c7 + d13dc88 commit a8ffd1f
Show file tree
Hide file tree
Showing 108 changed files with 14,233 additions and 1,505 deletions.
142 changes: 99 additions & 43 deletions document/core/appendix/algorithm.rst
Original file line number Diff line number Diff line change
Expand Up @@ -16,21 +16,24 @@ Consequently, it can be integrated directly into a decoder.
The algorithm is expressed in typed pseudo code whose semantics is intended to be self-explanatory.


.. index:: value type, stack, label, frame, instruction
.. index:: value type, reference type, vector type, number type, packed type, field type, structure type, array type, function type, composite type, sub type, recursive type, defined type, stack, label, frame, instruction

Data Structures
~~~~~~~~~~~~~~~

Types
.....

Value types are representable as a set of enumerations.
Value types are representable as sets of enumerations:

.. code-block:: pseudo
type num_type = I32 | I64 | F32 | F64
type vec_type = V128
type heap_type = Def(idx : nat) | Func | Extern | Bot
type heap_type =
Any | Eq | I31 | Struct | Array | None |
Func | Nofunc | Extern | Noextern | Bot |
Def(def : def_type)
type ref_type = Ref(heap : heap_type, null : bool)
type val_type = num_type | vec_type | ref_type | Bot
Expand All @@ -43,38 +46,54 @@ Value types are representable as a set of enumerations.
func is_ref(t : val_type) : bool =
return not (is_num t || is_vec t) || t = Bot
Equivalence and subtyping checks can be defined on these types.
Similarly, :ref:`defined types <syntax-deftype>` :code:`def_type` can be represented:

.. code-block:: pseudo
func eq_def(n1, n2) =
... // check that both type definitions are equivalent (TODO)
func matches_null(null1 : bool, null2 : bool) : bool =
return null1 = null2 || null2
func matches_heap(t1 : heap_type, t2 : heap_type) : bool =
switch (t1, t2)
case (Def(n1), Def(n2))
return eq_def(n1, n2)
case (Def(_), Func)
return true
case (Bot, _)
return true
case (_, _)
return t1 = t2
func matches_ref(t1 : ref_type, t2 : ref_type) : bool =
return matches_heap(t1.heap, t2.heap) && matches_null(t1.null, t2.null)
func matches(t1 : val_type, t2 : val_type) : bool =
switch (t1, t2)
case (Ref(_), Ref(_))
return matches_ref(t1, t2)
case (Bot, _)
return true
case (_, _)
return t1 = t2
type packed_type = I8 | I16
type field_type = Field(val : val_type | packed_type, mut : bool)
type struct_type = Struct(fields : list(field_type))
type array_type = Array(fields : field_type)
type func_type = Func(params : list(val_type), results : list(val_type))
type comp_type = struct_type | array_type | func_type
type sub_type = Sub(super : list(def_type), body : comp_type, final : bool)
type rec_type = Rec(types : list(sub_type))
type def_type = Def(rec : rec_type, proj : int32)
func unpack_field(t : field_type) : val_type =
if (it = I8 || t = I16) return I32
return t
func expand_def(t : def_type) : comp_type =
return t.rec.types[t.proj].body
These representations assume that all types have been :ref:`closed <type-closed>` by :ref:`substituting <type-subst>` all :ref:`type indices <syntax-typeidx>` (in :ref:`concrete heap types <syntax-heaptype>` and in :ref:`sub types <syntax-subtype>`) with their respective :ref:`defined types <syntax-deftype>`.
This includes *recursive* references to enclosing :ref:`defined types <syntax-deftype>`,
such that type representations form graphs and may be *cyclic* for :ref:`recursive types <syntax-rectype>`.

We assume that all types have been *canonicalized*, such that equality on two type representations holds if and only if their :ref:`closures <type-closure>` are syntactically equivalent, making it a constant-time check.

.. note::
For the purpose of type canonicalization, recursive references from a :ref:`heap type <syntax-heaptype>` to an enclosing :ref:`recursive type <syntax-reftype>` (i.e., forward edges in the graph that form a cycle) need to be distinguished from references to previously defined types.
However, this distinction does not otherwise affect validation, so is ignored here.
In the graph representation, all recursive types are effectively infinitely :ref:`unrolled <aux-unroll-rectype>`.

We further assume that :ref:`validation <valid-valtype>` and :ref:`subtyping <match-valtype>` checks are defined on value types, as well as a few auxiliary functions on composite types:

.. code-block:: pseudo
func validate_val_type(t : val_type)
func validate_ref_type(t : ref_type)
func matches_val(t1 : val_type, t2 : val_type) : bool
func matches_ref(t1 : val_type, t2 : val_type) : bool
func is_func(t : comp_type) : bool
func is_struct(t : comp_type) : bool
func is_array(t : comp_type) : bool
Context
.......
Expand All @@ -84,6 +103,8 @@ For the purpose of presenting the algorithm, it is maintained in a set of global

.. code-block:: pseudo
var return_type : list(val_type)
var types : array(def_type)
var locals : array(val_type)
var locals_init : array(bool)
var globals : array(global_type)
Expand Down Expand Up @@ -142,7 +163,7 @@ However, these variables are not manipulated directly by the main checking funct
func pop_val(expect : val_type) : val_type =
let actual = pop_val()
error_if(not matches(actual, expect))
error_if(not matches_val(actual, expect))
return actual
func pop_num() : num_type | Bot =
Expand Down Expand Up @@ -249,7 +270,6 @@ it is an invariant of the validation algorithm that there always is at least one
However, a polymorphic stack cannot underflow, but instead generates :code:`Bot` types as needed.



.. index:: opcode

Validation of Opcode Sequences
Expand Down Expand Up @@ -354,15 +374,51 @@ Other instructions are checked in a similar manner.
push_vals(label_types(ctrls[n]))
push_val(Ref(rt.heap, false))
case (call_ref)
let rt = pop_ref()
if (rt.heap =/= Bot)
error_if(not is_def(rt.heap))
let ft = funcs[rt.heap.idx]
pop_vals(ft.params)
push_vals(ft.results)
case (br_on_cast n rt1 rt2)
validate_ref_type(rt1)
validate_ref_type(rt2)
pop_val(rt1)
push_val(rt2)
pop_vals(label_types(ctrls[n]))
push_vals(label_types(ctrls[n]))
pop_val(rt2)
push_val(diff_ref_type(rt2, rt1))
case (return)
pop_vals(return_types)
unreachable()
case (call_ref x)
let t = expand_def(types[x])
error_if(not is_func(t))
pop_vals(t.params)
pop_val(Ref(Def(types[x])))
push_vals(t.results)
case (return_call_ref x)
let t = expand_def(types[x])
error_if(not is_func(t))
pop_vals(t.params)
pop_val(Ref(Def(types[x])))
error_if(t.results.len() =/= return_types.len())
push_vals(t.results)
pop_vals(return_types)
unreachable()
case (struct.new x)
let t = expand_def(types[x])
error_if(not is_struct(t))
for (ti in reverse(t.fields))
pop_val(unpack_field(ti))
push_val(Ref(Def(types[x])))
case (struct.set x n)
let t = expand_def(types[x])
error_if(not is_struct(t) || n >= t.fields.len())
pop_val(Ref(Def(types[x])))
pop_val(unpack_field(st.fields[n]))
.. note::
It is an invariant under the current WebAssembly instruction set that an operand of :code:`Unknown` type is never duplicated on the stack.
It is an invariant under the current WebAssembly instruction set that an operand of :code:`Bot` type is never duplicated on the stack.
This would change if the language were extended with stack instructions like :code:`dup`.
Under such an extension, the above algorithm would need to be refined by replacing the :code:`Unknown` type with proper *type variables* to ensure that all uses are consistent.
Under such an extension, the above algorithm would need to be refined by replacing the :code:`Bot` type with proper *type variables* to ensure that all uses are consistent.
31 changes: 31 additions & 0 deletions document/core/appendix/changes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -165,6 +165,34 @@ Added more precise types for references [#proposal-typedref]_.
* Extended :ref:`table definitions <syntax-table>` with optional initializer expression


.. index:: reference, reference type, heap type, field type, storage type, structure type, array type, composite type, sub type, recursive type

Garbage Collection
~~~~~~~~~~~~~~~~~~

Added managed reference types [#proposal-gc]_.

* New forms of :ref:`heap types <syntax-heaptype>`: |ANY|, |EQT|, |I31|, |STRUCT|, |ARRAY|, |NONE|, |NOFUNC|, |NOEXTERN|

* New :ref:`reference type <syntax-reftype>` short-hands: |ANYREF|, |EQREF|, |I31REF|, |STRUCTREF|, |ARRAYREF|, |NULLREF|, |NULLFUNCREF|, |NULLEXTERNREF|

* New forms of type definitions: :ref:`structure <syntax-structtype>` and :ref:`array types <syntax-arraytype>`, :ref:`sub types <syntax-subtype>`, and :ref:`recursive types <syntax-rectype>`

* Enriched :ref:`subtyping <match>` based on explicitly declared :ref:`sub types <syntax-subtype>` and the new heap types

* New generic :ref:`reference instructions <syntax-instr-ref>`: |REFEQ|, |REFTEST|, |REFCAST|, |BRONCAST|, |BRONCASTFAIL|

* New :ref:`reference instructions <syntax-instr-ref>` for :ref:`unboxed scalars <syntax-i31>`: |REFI31|, :math:`\I31GET\K{\_}\sx`

* New :ref:`reference instructions <syntax-instr-ref>` for :ref:`structure types <syntax-structtype>`: |STRUCTNEW|, |STRUCTNEWDEFAULT|, :math:`\STRUCTGET\K{\_}\sx^?`, |STRUCTSET|

* New :ref:`reference instructions <syntax-instr-ref>` for :ref:`array types <syntax-structtype>`: |ARRAYNEW|, |ARRAYNEWDEFAULT|, |ARRAYNEWFIXED|, |ARRAYNEWDATA|, |ARRAYNEWELEM|, :math:`\ARRAYGET\K{\_}\sx^?`, |ARRAYSET|, |ARRAYLEN|, |ARRAYFILL|, |ARRAYCOPY|, |ARRAYINITDATA|, |ARRAYINITELEM|

* New :ref:`reference instructions <syntax-instr-ref>` for converting :ref:`host types <syntax-externtype>`: |EXTERNINTERNALIZE|, |EXTERNEXTERNALIZE|

* Extended set of :ref:`constant instructions <valid-const>` with |REFI31|, |STRUCTNEW|, |STRUCTNEWDEFAULT|, |ARRAYNEW|, |ARRAYNEWDEFAULT|, |ARRAYNEWFIXED|, |EXTERNINTERNALIZE|, |EXTERNEXTERNALIZE|, and |GLOBALGET| for any previously declared immutable :ref:`global <syntax-global>`


.. [#proposal-signext]
https://github.com/WebAssembly/spec/tree/main/proposals/sign-extension-ops/
Expand All @@ -185,3 +213,6 @@ Added more precise types for references [#proposal-typedref]_.
.. [#proposal-typedref]
https://github.com/WebAssembly/spec/tree/main/proposals/function-references/
.. [#proposal-gc]
https://github.com/WebAssembly/spec/tree/main/proposals/gc/
34 changes: 34 additions & 0 deletions document/core/appendix/custom.rst
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,8 @@ Id Subsection
0 :ref:`module name <binary-modulenamesec>`
1 :ref:`function names <binary-funcnamesec>`
2 :ref:`local names <binary-localnamesec>`
4 :ref:`type names <binary-typenamesec>`
10 :ref:`field names <binary-fieldnamesec>`
== ===========================================

Each subsection may occur at most once, and in order of increasing id.
Expand Down Expand Up @@ -143,3 +145,35 @@ It consists of an :ref:`indirect name map <binary-indirectnamemap>` assigning lo
\production{local name subsection} & \Blocalnamesubsec &::=&
\Bnamesubsection_2(\Bindirectnamemap) \\
\end{array}
.. index:: type, type index
.. _binary-typenamesec:

Type Names
..............

The *type name subsection* has the id 4.
It consists of a :ref:`name map <binary-namemap>` assigning type names to :ref:`type indices <syntax-typeidx>`.

.. math::
\begin{array}{llclll}
\production{type name subsection} & \Btypenamesubsec &::=&
\Bnamesubsection_1(\Bnamemap) \\
\end{array}
.. index:: type, field, type index, field index
.. _binary-fieldnamesec:

Field Names
...........

The *field name subsection* has the id 10.
It consists of an :ref:`indirect name map <binary-indirectnamemap>` assigning field names to :ref:`field indices <syntax-fieldidx>` grouped by :ref:`type indices <syntax-typeidx>`.

.. math::
\begin{array}{llclll}
\production{field name subsection} & \Bfieldnamesubsec &::=&
\Bnamesubsection_2(\Bindirectnamemap) \\
\end{array}
54 changes: 53 additions & 1 deletion document/core/appendix/embedding.rst
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,19 @@ Hence, these syntactic classes can also be interpreted as types.
For numeric parameters, notation like :math:`n:\u32` is used to specify a symbolic name in addition to the respective value range.


.. _embed-bool:

Booleans
~~~~~~~~

Interface operation that are predicates return Boolean values:

.. math::
\begin{array}{llll}
\production{Boolean} & \bool &::=& \FALSE ~|~ \TRUE \\
\end{array}
.. _embed-error:

Errors
Expand Down Expand Up @@ -618,7 +631,7 @@ Globals
.. index:: reference, reference type
.. _embed-ref:
.. _embed-ref-type:

References
~~~~~~~~~~
Expand All @@ -641,3 +654,42 @@ References
In future versions of WebAssembly,
not all references may carry precise type information at run time.
In such cases, this function may return a less precise supertype.


.. index:: value type, external type, subtyping
.. _embed-match-valtype:
.. _embed-match-externtype:

Matching
~~~~~~~~

:math:`\F{match\_valtype}(\valtype_1, \valtype_2) : \bool`
..........................................................

1. Pre-condition: the :ref:`value types <syntax-valtype>` :math:`\valtype_1` and :math:`\valtype_2` are :ref:`valid <valid-valtype>` under the empty :ref:`context <context>`.

2. If :math:`\valtype_1` :ref:`matches <match-valtype>` :math:`\valtype_2`, then return :math:`\TRUE`.

3. Else, return :math:`\FALSE`.

.. math::
\begin{array}{lclll}
\F{match\_reftype}(t_1, t_2) &=& \TRUE && (\iff \vdashvaltypematch t_1 \matchesvaltype t_2) \\
\F{match\_reftype}(t_1, t_2) &=& \FALSE && (\otherwise) \\
\end{array}
:math:`\F{match\_externtype}(\externtype_1, \externtype_2) : \bool`
...................................................................

1. Pre-condition: the :ref:`extern types <syntax-externtype>` :math:`\externtype_1` and :math:`\externtype_2` are :ref:`valid <valid-externtype>` under the empty :ref:`context <context>`.

2. If :math:`\externtype_1` :ref:`matches <match-externtype>` :math:`\externtype_2`, then return :math:`\TRUE`.

3. Else, return :math:`\FALSE`.

.. math::
\begin{array}{lclll}
\F{match\_externtype}(\X{et}_1, \X{et}_2) &=& \TRUE && (\iff \vdashexterntypematch \X{et}_1 \matchesexterntype \X{et}_2) \\
\F{match\_externtype}(\X{et}_1, \X{et}_2) &=& \FALSE && (\otherwise) \\
\end{array}
Loading

0 comments on commit a8ffd1f

Please sign in to comment.