Skip to content

Commit

Permalink
Restore compatibility with upstream disas tests (bytecodealliance#135)
Browse files Browse the repository at this point in the history
It recently became necessary to regenerate the expected output of most
(all?) of the files in tests/disas. Concretely, we were generating
slightly different CLIF than upstream even on modules not involving any
WasmFX instructions.

This PR rectifies this situation, making sure that we generate identical
CLIF again. This required two changes:

## Global CLIF variable for runtime limits pointer
Since bytecodealliance#99, we called the function
`wasmtime_cranelift::func_environ::FuncEnvironment::declare_vmruntime_limits_ptr`
unconditionally for every compiled function, meaning that every function
prelude now contained a definition of the corresponding global variable.

This PR changes this behavior: I restored the original logic for when to
call `declare_vmruntime_limits_ptr`. Instead of using the global
variable `vmruntime_limits` to access the `VMRuntimeLimits`, we now load
this pointer ourselves once per `resume` instruction. Since the load
instruction is marked as `readonly`, it should be coalesced into a
single load, even if multiple `resume`s appear in the same function.

I've benchmarked this, using a micro-benchmark with 2 resume
instructions in the same function, and as expected there is no
observable difference in performance.

## Order of libcall declarations
In `builtin.rs`, we put the declarations of our own libcalls into the
middle of the list of all libcalls in the `foreach_builtin_function`
macro. This shifted the indices of some existing libcalls, which caused
the CLIF to differ when using these. I've moved all the `tc_` libcalls
to the end of the list, thus leaving the indices of existing libcalls
unchanged.

As a result, I've replaced all files in tests/disas with the versions
from upstream
(bytecodealliance/wasmtime@afaf1c7, the
most recent upstream commit that we've merged)
  • Loading branch information
frank-emrich authored Mar 19, 2024
1 parent c113ff9 commit cff2e76
Show file tree
Hide file tree
Showing 183 changed files with 3,155 additions and 3,893 deletions.
23 changes: 9 additions & 14 deletions crates/cranelift/src/func_environ.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2869,20 +2869,15 @@ impl<'module_environment> cranelift_wasm::FuncEnvironment for FuncEnvironment<'m
) -> WasmResult<()> {
// If the `vmruntime_limits_ptr` variable will get used then we initialize
// it here.
if true || self.tunables.consume_fuel || self.tunables.epoch_interruption {
// TODO(frank-emrich) This is now done unconditionally because we
// need the `vmruntime_limits_ptr` variable when translating resume.
// This has no runtime overhead: We are adding a load to every
// function, but if it is not actually used, cranelift's DCE pass
// will remove it. However, it would be nicer to check if the
// function actually contains resume instructions, and only run
// `declare_vmruntime_limits_ptr` then.
//
// TODO(dhil): FIXME emission of the vmruntime_limits_ptr
// affects codegen of non-wasmfx programs, causing CLIF
// output expectation tests (disas) to diverge from
// upstream. We should come up with a design that let us
// emit this pointer only when necessary.
if self.tunables.consume_fuel || self.tunables.epoch_interruption {
// TODO(frank-emrich) Ideally, we would like to use this global
// variable in the translation of `resume` instructions. However, in
// order to decide whether to declare the variable or not we would
// have to know if the function body contains a `resume` instruction
// before actually translating it. If we instead called
// `declare_vmruntime_limits_ptr` unconditionally, we would change
// the CLIF output of functions using no wasmfx instructions, which
// is undesirable.
self.declare_vmruntime_limits_ptr(builder);
}
// Additionally we initialize `fuel_var` if it will get used.
Expand Down
35 changes: 26 additions & 9 deletions crates/cranelift/src/wasmfx/optimized.rs
Original file line number Diff line number Diff line change
Expand Up @@ -819,6 +819,23 @@ pub(crate) mod typed_continuation_helpers {
let chain = StackChain::from_continuation(builder, contobj, self.pointer_type);
self.store_stack_chain(env, builder, &chain)
}

pub fn load_vm_runtime_limits_ptr<'a>(
&self,
env: &mut crate::func_environ::FuncEnvironment<'a>,
builder: &mut FunctionBuilder,
) -> ir::Value {
let pointer_type = env.pointer_type();
let vmctx = env.vmctx(builder.func);
let base = builder.ins().global_value(pointer_type, vmctx);
let offset = i32::try_from(env.offsets.vmctx_runtime_limits()).unwrap();

// The *pointer* to the VMRuntimeLimits does not change within the
// same function, allowing us to set the `read_only` flag.
let flags = ir::MemFlags::trusted().with_readonly();

builder.ins().load(pointer_type, flags, base, offset)
}
}

impl StackChain {
Expand Down Expand Up @@ -996,12 +1013,11 @@ pub(crate) mod typed_continuation_helpers {
&self,
env: &mut crate::func_environ::FuncEnvironment<'a>,
builder: &mut FunctionBuilder,
vmruntime_limits: cranelift_frontend::Variable,
vmruntime_limits_ptr: ir::Value,
) {
use wasmtime_continuations::offsets as o;

let stack_limits_ptr = self.get_stack_limits_ptr(env, builder);
let vmruntime_limits_ptr = builder.use_var(vmruntime_limits);

let memflags = ir::MemFlags::trusted();

Expand Down Expand Up @@ -1038,12 +1054,11 @@ pub(crate) mod typed_continuation_helpers {
&self,
env: &mut crate::func_environ::FuncEnvironment<'a>,
builder: &mut FunctionBuilder,
vmruntime_limits: cranelift_frontend::Variable,
vmruntime_limits_ptr: ir::Value,
) {
use wasmtime_continuations::offsets as o;

let stack_limits_ptr = self.get_stack_limits_ptr(env, builder);
let vmruntime_limits_ptr = builder.use_var(vmruntime_limits);

let memflags = ir::MemFlags::trusted();
let pointer_size = self.pointer_type.bytes() as u8;
Expand Down Expand Up @@ -1339,7 +1354,7 @@ pub(crate) fn translate_resume<'a>(
// to resume objects other than `original_contobj`.
// We make the continuation object that was actually resumed available via
// `resumed_contobj`, so that subsequent blocks can refer to it.
let resume_result = {
let (resume_result, vm_runtime_limits_ptr) = {
builder.switch_to_block(resume_block);

// We mark `resume_contobj` as the currently running one
Expand Down Expand Up @@ -1405,14 +1420,16 @@ pub(crate) fn translate_resume<'a>(
emit_debug_println!(env, builder, "[resume] discriminant is {}", discriminant);
}

let vm_runtime_limits_ptr = vmctx.load_vm_runtime_limits_ptr(env, builder);

// Jump to the return block if the result is 0, otherwise jump to
// the suspend block.
builder
.ins()
.brif(result, suspend_block, &[], return_block, &[]);

// We do not seal this block, yet, because the effect forwarding block has a back edge to it
result
(result, vm_runtime_limits_ptr)
};

// Suspend block.
Expand All @@ -1423,11 +1440,11 @@ pub(crate) fn translate_resume<'a>(
// We store parts of the VMRuntimeLimits into the continuation that just suspended.
let suspended_chain =
tc::StackChain::from_continuation(builder, resume_contobj, env.pointer_type());
suspended_chain.load_limits_from_vmcontext(env, builder, env.vmruntime_limits_ptr);
suspended_chain.load_limits_from_vmcontext(env, builder, vm_runtime_limits_ptr);

// Afterwards (!), restore parts of the VMRuntimeLimits from the
// parent of the suspended continuation (which is now active).
parent_stack_chain.write_limits_to_vmcontext(env, builder, env.vmruntime_limits_ptr);
parent_stack_chain.write_limits_to_vmcontext(env, builder, vm_runtime_limits_ptr);

let discriminant = builder.ins().ireduce(I32, resume_result);
let discriminant_size_bytes =
Expand Down Expand Up @@ -1571,7 +1588,7 @@ pub(crate) fn translate_resume<'a>(

// Restore parts of the VMRuntimeLimits from the
// parent of the returned continuation (which is now active).
parent_stack_chain.write_limits_to_vmcontext(env, builder, env.vmruntime_limits_ptr);
parent_stack_chain.write_limits_to_vmcontext(env, builder, vm_runtime_limits_ptr);

let co = tc::ContinuationObject::new(resume_contobj, env.pointer_type());
co.set_state(builder, wasmtime_continuations::State::Returned);
Expand Down
85 changes: 43 additions & 42 deletions crates/environ/src/builtin.rs
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,49 @@ macro_rules! foreach_builtin_function {
/// Invoked when we reach a new epoch.
new_epoch(vmctx: vmctx) -> i64;

/// Invoked before malloc returns.
check_malloc(vmctx: vmctx, addr: i32, len: i32) -> i32;
/// Invoked before the free returns.
check_free(vmctx: vmctx, addr: i32) -> i32;
/// Invoked before a load is executed.
check_load(vmctx: vmctx, num_bytes: i32, addr: i32, offset: i32) -> i32;
/// Invoked before a store is executed.
check_store(vmctx: vmctx, num_bytes: i32, addr: i32, offset: i32) -> i32;
/// Invoked after malloc is called.
malloc_start(vmctx: vmctx);
/// Invoked after free is called.
free_start(vmctx: vmctx);
/// Invoked when wasm stack pointer is updated.
update_stack_pointer(vmctx: vmctx, value: i32);
/// Invoked before memory.grow is called.
update_mem_size(vmctx: vmctx, num_bytes: i32);

/// Returns an index to drop a `VMExternRef`.
#[cfg(feature = "gc")]
drop_externref(vmctx: vmctx, val: pointer);

/// Returns an index to do a GC and then insert a `VMExternRef` into the
/// `VMExternRefActivationsTable`.
#[cfg(feature = "gc")]
activations_table_insert_with_gc(vmctx: vmctx, val: reference);

/// Returns an index for Wasm's `global.get` instruction for `externref`s.
#[cfg(feature = "gc")]
externref_global_get(vmctx: vmctx, global: i32) -> reference;

/// Returns an index for Wasm's `global.get` instruction for `externref`s.
#[cfg(feature = "gc")]
externref_global_set(vmctx: vmctx, global: i32, val: reference);

/// Returns an index for Wasm's `table.grow` instruction for `externref`s.
#[cfg(feature = "gc")]
table_grow_externref(vmctx: vmctx, table: i32, delta: i32, init: reference) -> i32;

/// Returns an index for Wasm's `table.fill` instruction for `externref`s.
#[cfg(feature = "gc")]
table_fill_externref(vmctx: vmctx, table: i32, dst: i32, val: reference, len: i32);


/// Creates a new continuation from a funcref.
tc_cont_new(vmctx: vmctx, r: pointer, param_count: i32, result_count: i32) -> pointer;
/// Resumes a continuation. The result value is of type
Expand Down Expand Up @@ -109,48 +152,6 @@ macro_rules! foreach_builtin_function {
tc_print_int(vmctx: vmctx, arg : i64);
/// TODO
tc_print_pointer(vmctx: vmctx, arg : pointer);

/// Invoked before malloc returns.
check_malloc(vmctx: vmctx, addr: i32, len: i32) -> i32;
/// Invoked before the free returns.
check_free(vmctx: vmctx, addr: i32) -> i32;
/// Invoked before a load is executed.
check_load(vmctx: vmctx, num_bytes: i32, addr: i32, offset: i32) -> i32;
/// Invoked before a store is executed.
check_store(vmctx: vmctx, num_bytes: i32, addr: i32, offset: i32) -> i32;
/// Invoked after malloc is called.
malloc_start(vmctx: vmctx);
/// Invoked after free is called.
free_start(vmctx: vmctx);
/// Invoked when wasm stack pointer is updated.
update_stack_pointer(vmctx: vmctx, value: i32);
/// Invoked before memory.grow is called.
update_mem_size(vmctx: vmctx, num_bytes: i32);

/// Returns an index to drop a `VMExternRef`.
#[cfg(feature = "gc")]
drop_externref(vmctx: vmctx, val: pointer);

/// Returns an index to do a GC and then insert a `VMExternRef` into the
/// `VMExternRefActivationsTable`.
#[cfg(feature = "gc")]
activations_table_insert_with_gc(vmctx: vmctx, val: reference);

/// Returns an index for Wasm's `global.get` instruction for `externref`s.
#[cfg(feature = "gc")]
externref_global_get(vmctx: vmctx, global: i32) -> reference;

/// Returns an index for Wasm's `global.get` instruction for `externref`s.
#[cfg(feature = "gc")]
externref_global_set(vmctx: vmctx, global: i32, val: reference);

/// Returns an index for Wasm's `table.grow` instruction for `externref`s.
#[cfg(feature = "gc")]
table_grow_externref(vmctx: vmctx, table: i32, delta: i32, init: reference) -> i32;

/// Returns an index for Wasm's `table.fill` instruction for `externref`s.
#[cfg(feature = "gc")]
table_fill_externref(vmctx: vmctx, table: i32, dst: i32, val: reference, len: i32);
}
};
}
Expand Down
15 changes: 6 additions & 9 deletions tests/disas/arith.wat
Original file line number Diff line number Diff line change
Expand Up @@ -18,26 +18,23 @@
;; gv0 = vmctx
;; gv1 = load.i64 notrap aligned readonly gv0+8
;; gv2 = load.i64 notrap aligned gv1
;; gv3 = vmctx
;; sig0 = (i64 vmctx, i32 uext, i32 uext) -> i32 uext system_v
;; sig1 = (i64 vmctx, i32 uext) -> i32 uext system_v
;; stack_limit = gv2
;;
;; block0(v0: i64, v1: i64):
;; @001f v2 = iconst.i32 0
;; @001f v3 = global_value.i64 gv3
;; @001f v4 = load.i64 notrap aligned v3+8
;; @0021 v5 = iconst.i32 4
;; @0023 v6 = iconst.i32 4
;; @0025 v7 = isub v5, v6 ; v5 = 4, v6 = 4
;; @002a brif v7, block2, block4
;; @0021 v3 = iconst.i32 4
;; @0023 v4 = iconst.i32 4
;; @0025 v5 = isub v3, v4 ; v3 = 4, v4 = 4
;; @002a brif v5, block2, block4
;;
;; block2:
;; @002c trap unreachable
;;
;; block4:
;; @002e v8 = iconst.i32 6
;; @0032 v9 = imul v8, v7 ; v8 = 6
;; @002e v6 = iconst.i32 6
;; @0032 v7 = imul v6, v5 ; v6 = 6
;; @0034 jump block3
;;
;; block3:
Expand Down
22 changes: 10 additions & 12 deletions tests/disas/basic-wat-test.wat
Original file line number Diff line number Diff line change
Expand Up @@ -21,18 +21,16 @@
;; stack_limit = gv2
;;
;; block0(v0: i64, v1: i64, v2: i32, v3: i32):
;; @001e v5 = global_value.i64 gv3
;; @001e v6 = load.i64 notrap aligned v5+8
;; @0021 v7 = uextend.i64 v2
;; @0021 v8 = global_value.i64 gv4
;; @0021 v9 = iadd v8, v7
;; @0021 v10 = load.i32 little heap v9
;; @0026 v11 = uextend.i64 v3
;; @0026 v12 = global_value.i64 gv4
;; @0026 v13 = iadd v12, v11
;; @0026 v14 = load.i32 little heap v13
;; @0029 v15 = iadd v10, v14
;; @002a jump block1(v15)
;; @0021 v5 = uextend.i64 v2
;; @0021 v6 = global_value.i64 gv4
;; @0021 v7 = iadd v6, v5
;; @0021 v8 = load.i32 little heap v7
;; @0026 v9 = uextend.i64 v3
;; @0026 v10 = global_value.i64 gv4
;; @0026 v11 = iadd v10, v9
;; @0026 v12 = load.i32 little heap v11
;; @0029 v13 = iadd v8, v12
;; @002a jump block1(v13)
;;
;; block1(v4: i32):
;; @002a return v4
Expand Down
Loading

0 comments on commit cff2e76

Please sign in to comment.