Implement Vpopcnt for x64 #2887

jlb6740 · 2021-05-10T18:11:27Z

No description provided.

cranelift/codegen/src/isa/x64/lower.rs

abrown · 2021-05-13T20:10:31Z

cranelift/codegen/meta/src/shared/instructions.rs

+        )
+        .operands_in(vec![x])
+        .operands_out(vec![a]),
+    );


Why don't we just use popcnt? If popcnt and vpopcnt have the same semantics, can't we just use popcnt with a vector type?

It's been a while since I made that decision, but it does indeed look like I've created identical conversions. Perhaps I was duplicating an earlier effort where two instruction conversions were needed because of the input args? Not sure. In any case, good question; I'll attempt to combine the two and hopefully it will cause no trouble. Thanks!

Ahh, now I see why we had to do that. It has something to do with needing pop1_with_bitcast in the code_translator. Ultimately, the verifier rejected the vector type sharing the popcnt definition at compile time, even though the definitions in instructions.rs are identical. Perhaps you or @cfallin know more, but I now remember trying this and deliberately implementing a separate translation/lowering because of an issue along these lines:

`Caused by:
0: WebAssembly failed to compile
1: Compilation error: function u0:0(i64 vmctx, i64, i8x16) -> i8x16 wasmtime_system_v {
gv0 = vmctx
gv1 = load.i64 notrap aligned readonly gv0
gv2 = load.i64 notrap aligned gv1
stack_limit = gv2

block0(v0: i64, v1: i64, v2: i8x16):
@004f v4 = popcnt v2
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; error: inst0 (v4 = popcnt.i8x16 v2): has an invalid controlling type i8x16
@0051 jump block1(v4)

block1(v3: i8x16):
@0051 return v3
}

; 1 verifier error detected (see above). Compilation aborted.
', /home/jlbirch/wasmtime_jlb6740/target/debug/build/wasmtime-cli-89f47d7a9050963a/out/wast_testsuite_tests.rs:3645:18

note: run with RUST_BACKTRACE=1 environment variable to display a backtrace`

That's a good find; I think the issue is that the types allowed for popcnt are too restrictive. popcnt uses the iB type, which only includes scalar integers, but I think we should make it use the Int type, which allows both scalar and vector integers.
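To make the diagnosis above concrete, here is a minimal sketch of how a type variable that admits only scalar integers rejects `i8x16`, while one that also admits vectors accepts it. Everything in it (`Ty`, `TypeVar`, `check_controlling_type`) is a simplified, hypothetical model written for this illustration; it is not the actual Cranelift meta-language or verifier code.

```rust
/// Simplified stand-in for a value type (NOT the real Cranelift `ir::Type`).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Ty {
    /// Scalar integer with the given bit width, e.g. `Int(32)` for i32.
    Int(u32),
    /// Integer vector with a lane width and lane count, e.g. i8x16.
    IntVec { lane_bits: u32, lanes: u32 },
}

/// Simplified stand-in for a type variable and the set of types it allows.
#[derive(Clone, Copy)]
struct TypeVar {
    name: &'static str,
    allow_vectors: bool,
}

/// Models a scalar-only integer type variable, like `iB` in the discussion.
const I_B: TypeVar = TypeVar { name: "iB", allow_vectors: false };
/// Models a scalar-or-vector integer type variable, like `Int`.
const INT: TypeVar = TypeVar { name: "Int", allow_vectors: true };

impl TypeVar {
    /// Returns true if `ty` belongs to this type variable's allowed set.
    fn accepts(&self, ty: Ty) -> bool {
        match ty {
            Ty::Int(_) => true,
            Ty::IntVec { .. } => self.allow_vectors,
        }
    }
}

/// Hypothetical verifier check: the controlling type must be in the set.
fn check_controlling_type(tv: TypeVar, ty: Ty) -> Result<(), String> {
    if tv.accepts(ty) {
        Ok(())
    } else {
        Err(format!("invalid controlling type {:?} for {}", ty, tv.name))
    }
}
```

In this model, `check_controlling_type(I_B, Ty::IntVec { lane_bits: 8, lanes: 16 })` returns an error, mirroring the verifier failure quoted earlier, while the same call with `INT` succeeds.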

OK, yes, that makes sense. Do we want to just merge this here and refactor that in a separate patch, or do it here? My only concern is that changing a patch that currently works (or should work) may be asking for trouble: it could create some unintended consequence if the restriction to iB was there for some non-obvious reason. In any case, I suddenly have the failures below to check on, so I'll fix those first. I will likely continue to be very slow to respond to updates this coming week, unfortunately, but will get back to normal soon.

I think it's probably better to avoid adding vpopcnt in the first place rather than cleaning up in a subsequent PR -- hopefully the changes are minor on the lowering side (basically it should be just a conditional in the Opcode::Popcnt case, on the type, like the arithmetic operators have). Thanks!
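As a rough illustration of the "one opcode, branch on the type" shape suggested above, the sketch below handles a scalar and a vector operand inside a single `popcnt` function. The `Value` enum and the function are hypothetical and exist only for this example; the real lowering emits machine instructions rather than computing values.

```rust
/// Hypothetical operand representation used only for this illustration.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Value {
    /// A 64-bit scalar integer.
    I64(u64),
    /// A vector of sixteen 8-bit lanes (i8x16).
    I8x16([u8; 16]),
}

/// Single entry point for population count; the operand type selects the path.
fn popcnt(v: Value) -> Value {
    match v {
        // Scalar path: count the set bits of the whole integer.
        Value::I64(x) => Value::I64(u64::from(x.count_ones())),
        // Vector path: count the set bits of each lane independently.
        Value::I8x16(lanes) => Value::I8x16(lanes.map(|b| b.count_ones() as u8)),
    }
}
```

The point of the design is that callers see one operation with one set of semantics, and only the implementation detail differs by type, which is why a separate vpopcnt opcode is unnecessary.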

abrown · 2021-05-21T17:13:51Z

cranelift/codegen/src/isa/x64/lower.rs

-                _ => unreachable!(),
-            };
+            let ty_tmp = ty.unwrap();
+            if !ty_tmp.is_vector() {


Suggested change

-            if !ty_tmp.is_vector() {
+            let ty = ty.unwrap();
+            if !ty.is_vector() {

abrown · 2021-05-21T17:17:36Z

cranelift/codegen/src/isa/x64/lower.rs

-            };
+            let ty_tmp = ty.unwrap();
+            if !ty_tmp.is_vector() {
+                let (ext_spec, ty) = match ctx.input_ty(insn, 0) {


@bnjbvr, what does ty mean here? (Edit: what I mean is that ty is used differently here than elsewhere in this file; I'm guessing that the input type may not be the controlling type, so we need to figure it out and zero-extend it in some cases. Perhaps this could be clarified by renaming ty to input_ty and adding a comment about why all of this is necessary?)
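If the guess above is right, the reason for zero-extending is that a narrow value held in a wider register may have unrelated bits above its width, and counting those bits gives a wrong answer. The sketch below shows the effect with plain integers; `popcnt_narrow` is a hypothetical helper for this illustration and is not taken from lower.rs.

```rust
/// Population count of the low `bits` bits of a 32-bit register image.
/// Masking plays the role of zero-extension: bits above the value's
/// width are cleared before counting.
fn popcnt_narrow(reg: u32, bits: u32) -> u32 {
    assert!(bits >= 1 && bits <= 32);
    let mask = if bits == 32 { u32::MAX } else { (1u32 << bits) - 1 };
    (reg & mask).count_ones()
}
```

For example, with a register image of `0xFFFF_FF01` that holds the 8-bit value `0x01`, `popcnt_narrow(0xFFFF_FF01, 8)` yields 1, whereas counting the whole register with `count_ones()` yields 25.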

abrown

Talked to @jlb6740 about this (the scalar version is unchanged despite the crazy diff); we should merge this with minor tweaks to ty and a rebase to get CI to pass. It could be that in a future PR we could simplify the scalar code a bit but that shouldn't affect this PR.

github-actions bot added the cranelift, cranelift:area:aarch64, cranelift:area:x64, cranelift:meta, and cranelift:wasm labels May 10, 2021

bjorn3 reviewed May 10, 2021

cranelift/codegen/src/isa/x64/lower.rs

jlb6740 force-pushed the x64_implement_packed_popcnt branch 2 times, most recently from d088ff5 to 32d129f, May 13, 2021 20:02

abrown reviewed May 13, 2021

jlb6740 force-pushed the x64_implement_packed_popcnt branch 2 times, most recently from e06d98d to 78cfe71, May 21, 2021 04:17

abrown reviewed May 21, 2021

abrown approved these changes May 21, 2021

jlb6740 force-pushed the x64_implement_packed_popcnt branch from 78cfe71 to 1156fe9, May 21, 2021 17:25

Vpopcnt for x64

058f58a

jlb6740 force-pushed the x64_implement_packed_popcnt branch from 1156fe9 to 058f58a, May 21, 2021 21:53

jlb6740 merged commit 9a5c960 into bytecodealliance:main May 22, 2021

cfallin mentioned this pull request Dec 17, 2021

Cranelift: Miscompilation of popcnt for i8/i16 #3615

Closed
