Incorrect execution of openjph compiled file with SIMD #4315
Also, works on wasmtime for ARM64.
I've done some work to narrow this down but unfortunately I'm not being super successful at the moment. My strategy for narrowing this down has been:
Next I have a script which runs the binary on aarch64 and x86_64 and does a binary search of sorts over the fuel limit. This pins down the precise fuel count where, with one more unit of fuel, the execution differs. My goal is to find the precise wasm instruction at which execution diverges. This found results fairly quickly and I started sprinkling "debug stores" within the

In doing this I've narrowed this down to function 40 in the input file above. Specifically these two instructions:
Unfortunately this

Other than that though I have now hit a wall and am unable to make progress debugging this. Function 40 is huge and it's tough to explore the disassembly. Using

I have confirmed, though, that 0.35.0 reproduces this issue so I don't believe this is a problem with recent developments like regalloc2, recent ISLE migrations, or the alias analysis pass added. Others who are more familiar with the backend may know more about how to debug this though. IIRC there's some optimizations around constants, constant pools, and trying to not always reify something into a register or something like that, and something may be going awry there.

... aaaand as I am writing this up I took another look at the definition of the

I added a move from the input to a temporary register and that didn't actually fix the original program but it did surprisingly allow it to make more progress. I think that means there's actually at least two bugs here! Doing my bisection again I think I found a much simpler reproduction, namely:

    (module
      (func (param v128 v128 i32) (result v128)
        local.get 0
        local.get 1
        local.get 2
        select
      ))

I don't think that our codegen for the
where the

I'm running out of steam for tonight so I think this is as far as I'll get today. I'm actually quite worried that fuzzing hasn't discovered anything in this area. These are somewhat trivial bugs which in theory should be found almost immediately by any halfway decent fuzzing.
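The fuel bisection described in this comment can be sketched roughly as follows. This is only an illustration of the search itself, not the actual script: `first_divergent_fuel` and the `diverges` closure are hypothetical names, and the closure stands in for "run the module on both targets with this fuel budget and compare the observable state". It assumes that once the two runs differ at some budget, they also differ at every larger budget.

```rust
/// Returns the smallest fuel budget in `lo..=hi` for which `diverges`
/// reports a difference, or `None` if the runs still agree at `hi`.
///
/// `diverges` is a hypothetical callback: in practice it would run the
/// same module on two targets with the given fuel limit and compare
/// results. It is assumed to be monotone (false ... false true ... true).
fn first_divergent_fuel<F>(mut lo: u64, mut hi: u64, mut diverges: F) -> Option<u64>
where
    F: FnMut(u64) -> bool,
{
    if !diverges(hi) {
        return None; // no divergence observed within the budget
    }
    // Invariant: `diverges(hi)` is true and the answer lies in `lo..=hi`.
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if diverges(mid) {
            hi = mid; // divergence already visible, look earlier
        } else {
            lo = mid + 1; // still identical, look later
        }
    }
    Some(lo)
}
```

The last budget at which both runs still agree is then one less than the returned value, which is the point where "debug stores" or similar probes are most useful.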
This commit fixes a mistake in the `Swizzle` opcode implementation in the x64 backend of Cranelift. Previously an input register was cast to a writable register and then modified, which I believe instructions are not supposed to do. This was discovered as part of my investigation into bytecodealliance#4315.
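To illustrate why modifying an input register is a problem, here is a small model in plain Rust. It is not Cranelift code. It assumes the x64 lowering prepares the swizzle mask with a saturating add of `0x70` before a `pshufb`, which is my understanding of the usual trick but is not stated in this thread; the function names are made up for the example.

```rust
/// Reference semantics of wasm `i8x16.swizzle`: lane `i` of the result is
/// `src[mask[i]]`, or 0 when `mask[i]` is 16 or larger.
fn swizzle(src: [u8; 16], mask: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        let idx = mask[i] as usize;
        if idx < 16 {
            out[i] = src[idx];
        }
    }
    out
}

/// x86 `pshufb` semantics: a mask byte with its high bit set yields 0,
/// otherwise the low four bits select a source lane.
fn pshufb(src: [u8; 16], mask: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        if mask[i] & 0x80 == 0 {
            out[i] = src[(mask[i] & 0x0f) as usize];
        }
    }
    out
}

/// Model of the problematic shape: the mask "register" is adjusted in
/// place, so the caller's value is clobbered (assumed lowering).
fn lower_swizzle_in_place(src: [u8; 16], mask: &mut [u8; 16]) -> [u8; 16] {
    for m in mask.iter_mut() {
        *m = m.saturating_add(0x70); // indices >= 16 end up with the high bit set
    }
    pshufb(src, *mask)
}

/// Model of the fixed shape: the mask is copied to a temporary first,
/// so the input is left untouched.
fn lower_swizzle_with_temp(src: [u8; 16], mask: &[u8; 16]) -> [u8; 16] {
    let mut tmp = *mask;
    for m in tmp.iter_mut() {
        *m = m.saturating_add(0x70);
    }
    pshufb(src, tmp)
}
```

Both variants compute the right swizzle result, which is why a bug of this shape only shows up when the mask value is used again later: after the in-place version the caller's mask no longer holds its original bytes.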
I realize now I forgot to say this earlier but thanks @yurydelendik for taking the time to file this! We'll be making a 0.38.1 release with these fixes soon, along with a few other fixes for Cranelift (unrelated to this). We also plan on filing a CVE about these miscompiles because, while they don't affect hosts themselves, they could affect in-wasm execution, which is often just as important in some environments.
* x64: Fix codegen for the `i8x16.swizzle` instruction (#4318)

  This commit fixes a mistake in the `Swizzle` opcode implementation in the x64 backend of Cranelift. Previously an input register was cast to a writable register and then modified, which I believe instructions are not supposed to do. This was discovered as part of my investigation into #4315.

* x64: Fix codegen for the `select` instruction with v128 (#4317)

  This commit fixes a bug in the previous codegen for the `select` instruction when the operands of the `select` were of the `v128` type. Previously the `XmmCmove` instruction only stored an `OperandSize` of 32 or 64 for a 32- or 64-bit move, but this was also used for these 128-bit types, which meant that the wrong move instruction was generated for them. The fix applied here is to store the whole `Type` being moved so the 128-bit variant can be selected as well.
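The effect of picking a 32- or 64-bit move for a 128-bit value can be modelled in a few lines. This is a sketch, not the Cranelift implementation: `xmm_move` and `select_via_cmove` are hypothetical helpers, and the assumption that the "else" operand is placed in the destination first and the other operand is conditionally moved over it is mine. What it relies on from x86 is that a register-to-register scalar move such as `movss` or `movsd` replaces only the low 32 or 64 bits of the destination.

```rust
/// Register-to-register XMM move that replaces only the low `bits` bits
/// of `dst` with those of `src` (the full register when `bits` is 128).
fn xmm_move(dst: u128, src: u128, bits: u32) -> u128 {
    if bits >= 128 {
        src
    } else {
        let mask = (1u128 << bits) - 1;
        (dst & !mask) | (src & mask)
    }
}

/// Hypothetical model of a conditional move implementing
/// `select(a, b, cond)`: start with `b`, then move `a` over it when
/// `cond` is non-zero, using a move that is `move_bits` wide.
fn select_via_cmove(a: u128, b: u128, cond: i32, move_bits: u32) -> u128 {
    let mut dst = b;
    if cond != 0 {
        dst = xmm_move(dst, a, move_bits);
    }
    dst
}
```

With `move_bits` set to 128 the model returns `a` or `b` as `select` requires. With 64, a non-zero condition yields the low half of `a` glued to the high half of `b`, which is the kind of result a too-narrow move would produce for the `v128` reproduction above.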
I believe everything is now merged so I'm going to close this.
While working on the SIMD tests at bytecodealliance/sightglass#189, I found that this file is not executed properly. The app informs:
Though expected something like:
It appears to run correctly on Node and SM, so I assume it is a correctness issue. Attaching the compiled benchmark (minus bench_start/_end):
test-case.zip
P.S. on Intel CPU x64