Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] main from bytecodealliance:main #61

Open
wants to merge 95 commits into
base: main
Choose a base branch
from

Conversation

pull[bot]
Copy link

@pull pull bot commented Jan 16, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

jeffcharles and others added 6 commits January 16, 2025 15:19
* implement rmw sub for x64 in winch

* fmt

* fix tests after rebase

* turn match into if-let

* fix test
Previously a limit was added which prevented more than N growths over
time but this wasn't sufficient to prevent a test case from continuously
growing memory by just enough that it never went over N but the byte
sizes in question were big enough that the fuzz test case timed out on
OSS-Fuzz.

This commit changes the check to limit "bytes moved" instead of the
quantity of growths over time. This is a coarse approximation of what's
happening but should hopefully still allow interesting behavior while
additionally ensuring we don't spent the whole time moving around
gigabytes of data.
* pulley: Support smaller int-to-float conversions

This adds support for converting 8/16-bit integers to floats to Pulley.
This is not directly accessible from wasm instructions but is possible
through various optimizations to create. A new `*.clif` runtest was
added exercising many combinations of scalars-to-scalars for
int-to-floats of varying widths.

* Include new test
This commit fixes an issue in the Pulley backend for the
`return_call_indirect` implementation. This brings Pulley in line with
other backends to use a fixed register for the indirect call location
which is caller-save instead of possibly using a callee-save register.
If a callee-save-register is used then the location to be jumped to is
clobbered by register restores and won't have the correct location to
jump to.

This additionally required updating the Pulley ABI slightly. Previously
all caller-saved registers were considered argument registers meaning
that there weren't any actual available registers to use for the jump
destination. To handle this I've decreased the number of argument
registers by 1 so there's a single register used for the
return-call-indirect destination available which is also caller-saved.
* Winch: Add splat instructions to x64 using AVX2

* Change vpbroadcast size to lane size

* Create helper for vpshuf mask

* Change to single masm method for splatting

* Split out SplatLoadKind from SplatKind

* Add comment about heap allocation
@pull pull bot added the ⤵️ pull label Jan 16, 2025
alexcrichton and others added 23 commits January 16, 2025 20:11
The intention has always been to disable parallel compilation during
fuzzing and this was previously achieved with a single-thread pool for
Rayon. This still spawns a rayon thread though and can offload work to
it, so this instead explicitly uses `wasmtime::Config` to disable
parallel compilation. This should ensure that `rayon` doesn't get used
at runtime, as desired, and additionally avoids spawning threads or
offloading work onto a thread.
* pulley: Refactor register restores on tail-calls

Use helpers from `abi.rs` where possible to avoid duplicating logic.

* Shuffle where stack increments from tail args happens
* split IoView trait off of WasiView

* move wasi-http over to split views

* config and keyvalue: no changes to libraries, just tests where WasiView used

* wasmtime-cli: fixes for IoView/WasiView split

* move rest of wasi-io impl to be on IoImpl

* wasi-http: linker with wasi IoImpl

* wasi-nn tests: IoView impl
* avoid emitting 32->64 extend on rmw ops on x64

* review edit
* Add basic support for profiling Pulley

This commit adds basic support for profiling the Pulley interpreter.
This is partially achievable previously through the use of native
profilers, but the downside of that approach is that you can find hot
instructions but it's not clear in what context the hot instructions are
being executed nor what functions are hot. The goal of this profiler is
to show pulley bytecode and time spent in bytecode itself to better
understand the shape of code around a hot instruction to identify new
macro opcodes for example.

The general structure of this new profiler is:

* There is a compile-time feature for Pulley which is off-by-default
  where, when enabled, Pulley will record its current program counter
  into an `AtomicUsize` before each instruction.

* When the CLI has `--profile pulley` Wasmtime will spawn a sampling
  thread in the same process which will periodically read from this
  `AtomicUsize` to record where the program is currently executing.

* The Pulley profiler additionally records all bytecode through the use
  of the `ProfilingAgent` trait to ensure that the recording has access
  to all bytecode as well.

* Samples are taken throughout the process and emitted to a
  `pulley-$pid.data` file. This file is then interpreted and printed by
  an "example" program `profiler-html.rs` in the `pulley/examples`
  directory.

The end result is that hot functions of Pulley bytecode can be seen and
instructions are annotated with how frequently they were executed. This
enables finding hot loops and understanding more about the whole loop,
bytecodes that were selected, and such.

* Add missing source file

* Check the profile-pulley feature in CI

* Miscellaneous fixes for CI

* Fix type-checking of `become` on nightly Rust

* Fix more misc CI issues

* Fix dispatch in tail loop

* Update test expectations

* Review comments

* Fix a feature combo
* implement x64 xchg

* fix test post rebase
* Optimize MIRI execution time in CI

* Disable Cranelift optimizations and use single_pass register
  allocation by default.

* Ignore a number of tests that are compiling wasm which we generally
  don't want to do under MIRI.

* Fix CI build
We forgot to do this quite a long time ago but this was always the
intention. Turns out with Pulley now being online the fuzzer quickly
found a difference between Pulley and Cranelift where the native x64
instructions differ from the "deterministic" behavior that Pulley
implements. This difference is expected and allowed though, so don't
fuzz it.
* implement atomic rmw and, or, xor

implement rmw or

implement atomic rmw xor

fix operand sizes

implement and, or, xor

update rmw or tests

update rmw xor tests

fmt

use ad-hoc conversion for AtomicRmwSeqOp

fix test

fix rebae quirks

* cleanup tests
I noticed that CI is failing given that an audit and policy for
`wasmtime-math` is missing.

`wasmtime-math` was introduced in
https://github.com/bytecodealliance/wasmtime/pull/9808/files.

I followed a similar approach to what it's used for all the other
`wasmtime-*` crates.
* winch: Tighten the definition of `ExtendKind`

Some instructions, like the atomic ones, part of the threads proposal,
we want to ensure through the type system that only supported extend
kinds are passed, that way the emission layer can ignore the extend kind
validation at runtime.

This commit divides the existing `ExtendKind` enum into the signed and
unsigned variants, making the definition more amenable to atomic
operations.

* Implement conversions between specialized extend kinds

* Lint fixes
This commit fixes a fuzz-bug. See the test case for details.

Prior to this commit, imported global addresses were calculated with
register offset addressing, using the scratch register as the base. With
imported globals, the caller must load the address into an allocatable
register which implies that in presence of spills the scratch register
would get clobbered, affecting the previously loaded imported global
address.

This commit fixes the issue by returning an allocatable register, along
with the offset and global type, which is expected to be freed by the
caller after emitting the global load or store.
* wasmtime_test: Rename "Cranelift" strategy to "CraneliftNative"

* wasmtime_test: Add CraneliftPulley to default test strategies

* wasmtime_test: Use one specific compilation strategy with `only` specifier.

Tests in `tests/all/fuel.rs` and `tests/all/winch_engine_features.rs`
were using `#[wasmtime_test(strategies(not(Cranelift)))]` to gate their
Winch specific tests. Now that there is a third compilation strategy,
those tests were failing against Pulley.

I've replaced those with `#[wasmtime_test(strategies(only(Winch)))]`
to be more clear that they are targeted specifically at Winch.

* wasmtime_test: Fix flaky `wrap_and_typed_i31ref` test in tests/all/func.rs

The `static HITS` variable was sharing state between tests,
causing them to clobber each other when ran together.
This commit fixes a bug in the WASIp1-to-WASIp2 adapter during
`fd_prestat_dir_name` where an iterator variable was forgotten to be
incremented. That means that getting the path for anything other than
the first preopen didn't work correctly.

Closes #10058
* Fuzzing: Keep AVX flags enabled for Winch

* Check that SIMD is enabled

* Inspect flags in wast oracle

* Add log statement
* tighter extends

* Even thighter extends

* rename src/dst_size to from/to_size
* pulley: Implement full 128-bit multiplication

While Pulley has lowering rules for widening multiplication it didn't
have a rule for a full 128-bit multiplication which is possible to
generate through CLIF optimizations given wasm input. This commit adds
such a lowering to the Cranelift backend but doesn't add any new
instructions yet under the assumption this probably isn't perf-critical
at this time.

* Don't use a fallible `amode_add`
alexcrichton and others added 30 commits January 27, 2025 21:34
* Make it easier to reuse fuzzing configuration on the CLI

During fuzzing we emit a debug log of configured options to assist with
reproducing fuzz test cases outside of the fuzzing harness. Mapping
options the fuzzer is using though to CLI flags, for example, is a bit
of an art though and not obvious. Just today we've got a fuzz bug and I
couldn't figure out how to reproduce on the CLI and it turns out the
issue was that I was forgetting a flag that was being configured in
response to another flag. I got a bit fed up with constantly trying to
map one to the other, so I've decided to fix things.

This commit adds a `Display for CommonOptions` implementation to the
`wasmtime-cli-flags` crate. This is built on the same macro-construction
infrastructure of all our flags making it a relatively low one-time-cost
to implement this. Each option value now implements not only parsing but
printing as well.

Next the `wasmtime-fuzzing` crate was updated to create a
`CommonOptions` first which is then in turn used to create a
`wasmtime::Config`. This provides a layer to insert a log statement with
to emit all configuration options in a form that can be easily
copy/pasted to the CLI to reproduce.

Overall after doing this I was able to quickly reproduce the bug in
question (yay!). The CLI flag logging is pretty verbose right now since
the fuzzing infrastructure sets many settings redundantly to their
defaults, but reducing flags to a minimum is expected to be relatively
easy compared to otherwise trying to extract the options.

* Fix build and dependencies from wasmtime-fuzzing

* Always provide `wasmtime::MpkEnabled`

It's an otherwise very small `enum` which makes it easier to
conditionally compile `wasmtime-cli-flags`

* Frob some crate features some more

* Fix specification of `wasmtime_linkopt_force_jump_veneer` option
* refactor: unify how bits are accessed in `cranelift-entity`

While using `MachLabel`, a `cranelift-entity`-created type, I noticed
that there were three ways to access the contained bits: `.get()`,
`.as_u32()`, and `.as_bits()`. All performed essentially the same
function and it was unclear which to use.

This change removes `MachLabel::get()`, replacing it with `as_u32()`.
It also replaces all uses of `from_bits()` and `as_bits()` with
`from_u32()` and `as_u32()`. Why? I would have preferred the "bits"
naming since it seems more clear ("just unwrap this thing") and it could
avoid a large rename if the type were changed in the future, I realized
that there are vastly more uses of the "u32" naming that already
exist--it's just easier.

While this refactoring _should_ result in no functional change, you may
notice a couple of failing tests related to a pre-existing check on
`from_u32` that did not exist on `from_bits`. For some reason,
`from_u32` asserted that we would never pick `u32::MAX` for an entity
value; unfortunately, some parsing code, `decode_narrow_field`, does
just this. Why did we have such an assertion in the first place? Is it
still needed? Should `decode_narrow_field` do something else?

* Re-add `from_bits`, `as_bits` and uses

* doc: tweak doc comment
* Create wasmtime::Config from toml, add --config CLI flag

* Add cli tests
* Cut down on serde requirements in `wasmtime` crate

Remove `derive(Deserialize, Serialize)` for some configuration-based
enums to ideally give us more flexibility in the future as to what these
are exactly called. I'm not sure of any internal users of these except
for one use due to #9811 which can be switched to using an option-based
parser to ensure the same syntax is accepted via `--config` and on the
CLI with a flag.

* Remove unused imports

* Fix option parsing
* Improve support for completely unknown architectures

This commit is a step in the direction of trying to make Wasmtime more
portable by default. The goal here is to enable Wasmtime to compile for
architectures that it has no prior knowledge of. There's a few
miscellaneous locations through Wasmtime where we need
architecture-specific intrinsics and such but that's all in service of
Cranelift itself. Without Cranelift support none of them are necessary.

This commit plumbs a custom `#[cfg]` from Wasmtime's `build.rs` script
into the crate about whether there's a supported Cranelift backend. If
this isn't available some architecture-specific intrinsics are turned
off and not included. An example is that `vm::arch` entirely disappears
which is only in service of `UnwindHost`, which also disappears.
Furthermore the `helpers.c` file also entirely disappears as it's not
necessary on unknown architectures.

To help keep this working I've added CI to build Wasmtime for
`powerpc64le-unknown-linux-gnu`. Wasmtime currently has no support for
this architecture, although if it grows such support in the future
this'll need to be changed to some other unsupported architecture.

* Review feedback

* Fix powerpc build

* Refactor windows trap handling to look like Unix

Shuffle some files around to be more amenable to #[cfg]

* Move signal-handling tests to wasmtime crate

That way it's got easy access to the #[cfg]'s from the build script

* Disable signals support without a host compiler

Even if custom signals are enabled, don't compile it in.

prtest:full

* Fix windows unused imports

* Fix unused imports on Windows

* Remove untested stubs for arch intrinsics

These aren't needed any more to compile Pulley

* Defer tunables validation to loading modules

Instead of validating at `Engine` config time instead validate at
`Module` config time to enable cross-compilation.

* Skip `Tunables` auto-configuration if cross-compiling

This commit

* Tweak some Tunables based on Pulley

Ensures that specific `--target pulleyNN` works most of the time.

* Update host_segfault.rs to handle new 32-bit defaults

No signal handlers are used at all with Pulley so when the async stack
overflows there's no message printed any more.

* Disable Tunables::signals_based_traps on miri
* Cranelift/x64 backend: improve pointer width assert. (#10118)

This adds additional information as to what is wrong when this assert is encountered.

* Cranelift/docs: Change pointer width in ir docs to 64 bits (#10118)

Most people using this example will likely be on 64 bit systems, so it makes
sense to target a system with 64 bit addressses with the example.
* Start work towards always-warnings in `wasmtime`

This commit is an attempt to make progress after #10108. The goal of
this commit is to create a list of features to burn down over time and
to have `dead_code` and `unused_imports` lints on-by-default in
Wasmtime. The `feature = "default"` gate on this `allow` directive was
replaced with individual features. The hope is that as individual
features are removed from this list we can fix the resulting warnings
and the commit such a ratchet.

* Burn down some features from allowing dead code

Remove the features from the crate and test compiling with all other
features to ensure that no warnings are issued.

* Run full tests in CI

prtest:full

* Fix warning condition

* Fix #[cfg] to fix warnings on iOS

Switch `target_os = "macos"` to `target_vendor = "apple"` to align
requirements.
* Update cli book docs for toml option

* Make TOML keys use kebab-case in order to match cli names
In the spirit of covering more libcalls and exercising more wasm this
adds execution of table/memory intrinsics. This helped uncover some
stacked-borrows unsoundness in how our `table.copy` intrinsic is
implemented which was fixed with a new carefully typed method on
`PrimaryMap`.
#10128)

* add wasmtime-wasi-io and custom async executor to min-platform example

* make it possible to find example from wasmtime-wasi-io docs

prtest:full

* fix comment

* add wasm32-wasip2 target for min-platform ci

and enable signals based traps when running with wasi disabled,
because at the moment without signals based traps no native code can be
loaded so the embedding never actually executes wasm. this ensures the
heap size setting when not(feature = "wasi") is checked by execution

* fix cbindgen version in ci
* Enable warnings if `pooling-allocator` is disabled

Continuation of work in #10131

* Fix windows build
* v128.not

* v128.and

* v128.andnot

* v128.or

* rename test files

* v128.xor

* enable spec tests

* v128.bitselect

* v128.any_true

* v128.load*_lane

* v128.load*_lane

* cleanup duplicate methods

* move lane/load to wasm_store/load

* rename v128 functions

* ensure avx support

* fmt

* fix merge blips

* fix unsupported tests

* fix missing avx checks
* Support platforms without 64-bit atomics

This commit enables Wasmtime to build on platforms without 64-bit atomic
instructions, such as Rust's `riscv32imac-unknown-none-elf` target.
There are only two users of 64-bit atomics right now which are epochs
and allocation of `StoreId`. This commit adds `#[cfg]` to epoch-related
infrastructure in the runtime to turn that all of if 64-bit atomics
aren't available. The thinking is that epochs more-or-less don't work
without 64-bit atomics so it's easier to just remove them entirely.

Allocation of `StoreId` is trickier though because it's so core to
Wasmtime and it basically can't be removed. I've opted to change the
allocator to 32-bit indices instead of 64-bit indices. Note that
`StoreId` requires unique IDs to be allocated for safety which means
that while a 64-bit integer won't overflow for a few thousand years a
32-bit integer will overflow in a few hours from quickly creating
stores. The rough hope though is that embeddings on platforms like this
aren't churning through stores. Regardless if this condition is
triggered it'll result in a panic rather than unsoundness, so we
hopefully have at least that going for us.

Closes #8768

* Update component resources to not use `AtomicU64`

These aren't intended to be used in threaded contexts anyway and the use
of `AtomicXXX` is just to have interior mutability while still being
`Send` and `Sync`. Switch to using `AtomicU32` which is more portable.

* Use `RwLock<u64>` for `StoreId` instead.

* Fix compile

* Fix imports
* Add a style guideline for conditional compilation

I've been doing a fair amount of work recently that's touched on
`#[cfg]` in many ways throughout Wasmtime. I've personally got opinions
about how to structure everything as well, and these opinions are not
always obvious to others or discerned from just reading snippets. To
assist with this I figured it would be nice to have a style guideline
for Wasmtime's conditionally compiled code explaining at least at a high
level what's going on and some rough basic principles.

I've attempted to give this a stab and have added a page to the
contributing documentation about the style guidelines for conditional
compilation. I'm sure I've forgotten something here but my hope is that
we can evolve this over time.

* Fix typo

* Fix code examples
* Enable warnings if `async` is disabled

Continuation of work in #10131.

This required a number of organizational changes to help cut down on
`#[cfg]`, notably lots of async-related pieces from `store.rs` have
moved to `store/async_.rs` to avoid having lots of conditional imports.
Additionally this removes all of the `#[cfg]` annotations on those
methods already.

Additionally the signature of `AsyncCx::block_on` was updated to be a
bit more general to ideally remove the need for `Pin` but it ended up
not panning out quite just yet. In the future it should be possible to
remove the need for `Pin` at callsites though.

* Rebase conflicts
This commit marks another step toward finalizing AArch64 support in
Winch.

While enabling spec tests, I experienced some unexpected failures
related to Wasm loads/stores and traps. The observed
symptoms are as follows:

* Under normal conditions, Wasm loads/stores work as expected.
* In out-of-bounds scenarios, loads/stores result in a segmentation
  fault, whereas the expected behavior is to trigger an out-of-bounds trap.
* When out-of-bounds access can be determined statically, the program
  still results in a segmentation fault instead of the anticipated
  out-of-bounds trap.

Debugging revealed the following issues:

* The stack pointer was not correctly aligned to 16 bytes when entering
  signal handlers, which caused the segmentation fault.
* Wasm loads and stores were not flagged as untrusted, leading to
  segmentation faults even when the stack pointer was properly aligned.

This commit fixes the previous issues by:

* Correctly flagging wasm loads and stores as untrusted.
* Reworking the shadow stack pointer approach such that it allows
  aligning the stack pointer at arbitrary points in the program,
  particularly where signal handling might be needed. This rework
  involves changing some principles introduced in
  #5652; namely:
  changing the primary stack pointer register to be the shadow stack
  pointer. See the updates comments in the code for more details.

Note that this change doesn't enable spectests. To try this change, run:

  cargo run -- wast -Ccompiler=winch tests/spec_testsuite/address.wast
* Winch: Add SIMD comparison instructions for x64 with AVX

* Move passing test out of unsupported list
While it's predicted that `no_std` builds won't be using threading
that's not necessarily a given. Add a note to the documentation to call
out that `no_std` builds panic on contention with a request for filing
an issue if that's not suitable.
* Enable warnings if `gc` is disabled

Continuation of work in #10131. This additionally handles turning off
`gc-null` and `gc-drc` and the various combinations within.

* Fix some more warnings

* Fix a feature combo build
* [x] Test are run in CI.
* [x] Complete implementation of all Tier 1 proposals. (except
  `threads`, intentionally)
* [x] Expected that all maintainers will have to handle minor changes
  affecting Pulley.
* [x] Future major changes requiring an RFC required to consider Pulley.
* [x] [Continuous fuzzing][fuzz1], and tested [to get fuzzed][fuzz2]
  * [x] [Custom fuzz target][fuzz-target]
* [x] Implementation ready to have CVEs assigned if necessary.
* [x] Reasonable to expect most developers to assist in maintenance.
* [x] RFC required for future changes affecting APIs/implementation.
* [x] [Stage 4 WebAssembly proposal](https://github.com/WebAssembly/proposals)
* [x] No known bugs/open questions.
* [x] [Available in Wasmtime API][rust-api]
* [x] [Available in Wasmtime C API][c-api]

[fuzz1]: https://github.com/bytecodealliance/wasmtime/blob/e60b6e623bca1203cf4b9c844254c25ccebe9d6d/crates/fuzzing/src/generators/module.rs#L49
[fuzz2]: https://github.com/bytecodealliance/wasmtime/blob/e60b6e623bca1203cf4b9c844254c25ccebe9d6d/crates/fuzzing/src/oracles.rs#L1311
[fuzz-target]: https://github.com/bytecodealliance/wasmtime/blob/e60b6e623bca1203cf4b9c844254c25ccebe9d6d/crates/fuzzing/src/oracles/memory.rs#L20-L23
[rust-api]: https://docs.rs/wasmtime/latest/wasmtime/struct.MemoryTypeBuilder.html#method.memory64
[c-api]: https://docs.wasmtime.dev/c-api/memory_8h.html#a3db783fad4992c3f6985c92274b14ad7
* Update wave.rs

* Update store.rs

* Update table.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.