any() on boolean vectors on 32-bit ARM likely broken #12
Comments
The implementation seems to assume that it's OK to transmute a 128-bit vector into a pair of 64-bit vectors (the 128-bit registers are aliased with two 64-bit registers). This is not what clang's |
To use an aliased half-register,
AFAICT, fixing this needs a compiler RFC to extend SIMD shuffles so that the parameter and return value lane counts don't have to be the same. I'm thinking adding |
@hsivonen I suspect you don't need an RFC for that. Instead, you can probably just submit a PR. The raw shuffle intrinsics are unstable today and probably will be for the foreseeable future. (So long as the spectre of integer generics looms, I suspect that will be true.) I think the quickest way to stabilization is to provide a layer above the shuffle in |
OK. I'll try to go with the direct rustc PR route. |
I was wrong. rustc already supports N-to-M shuffles. The number in the shuffle name is the output lane count and does not limit the number of input lanes. |
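To make the lane-count point concrete, here is a scalar model of an N-to-M shuffle in plain Rust. It is only a sketch: `shuffle_model`, `low_half`, and `high_half` are hypothetical helpers, not the rustc intrinsics, and the index convention (indices `0..N` select from the first input, `N..2N` from the second) is assumed to follow LLVM's `shufflevector`.

```rust
/// Scalar model of a shuffle with N input lanes and M output lanes.
/// Hypothetical helper for illustration only; not the rustc intrinsic.
/// Panics if an index is >= 2 * N.
fn shuffle_model<const N: usize, const M: usize>(
    a: [u8; N],
    b: [u8; N],
    idx: [usize; M],
) -> [u8; M] {
    let mut out = [0u8; M];
    for (o, &i) in out.iter_mut().zip(idx.iter()) {
        // Indices 0..N pick from `a`; indices N..2N pick from `b`.
        *o = if i < N { a[i] } else { b[i - N] };
    }
    out
}

/// Low half of a 16-lane vector, produced as an 8-lane vector.
fn low_half(v: [u8; 16]) -> [u8; 8] {
    shuffle_model(v, v, [0, 1, 2, 3, 4, 5, 6, 7])
}

/// High half of a 16-lane vector, produced as an 8-lane vector.
fn high_half(v: [u8; 16]) -> [u8; 8] {
    shuffle_model(v, v, [8, 9, 10, 11, 12, 13, 14, 15])
}
```

The output length `M` is fixed by the index array alone, which mirrors the observation that the number in the shuffle name describes only the output lane count.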
Use shuffles instead of transmutes for accessing aliased half-registers. Implement the `Simd` trait for more types in order to satisfy the trait bounds of the shuffle intrinsic declarations. Bitcast to `u32x2` before extracting data to an ALU register. In the `all()` case, compare with `0xFFFFFFFF` instead of zero. Closes #12.
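The reduction described in that commit message can be modelled in scalar Rust. This is a sketch under stated assumptions, not the crate's NEON code: boolean lanes are assumed to be `0x00` or `0xFF`, the names `Bool8x16`, `halves`, `as_u32x2`, `any`, and `all` are hypothetical, and the lane-wise OR/AND used to combine the two halves is chosen for simplicity (the real implementation may reduce differently).

```rust
/// Scalar model of a 16-lane boolean vector: each lane is 0x00 or 0xFF.
/// Hypothetical type for illustration only.
type Bool8x16 = [u8; 16];

/// Split into two 8-lane halves by copying lanes, which models the
/// shuffle-based access rather than a transmute.
fn halves(v: Bool8x16) -> ([u8; 8], [u8; 8]) {
    let mut lo = [0u8; 8];
    let mut hi = [0u8; 8];
    lo.copy_from_slice(&v[..8]);
    hi.copy_from_slice(&v[8..]);
    (lo, hi)
}

/// Reinterpret 8 bytes as two u32 lanes (models the bitcast to `u32x2`).
fn as_u32x2(v: [u8; 8]) -> [u32; 2] {
    [
        u32::from_ne_bytes([v[0], v[1], v[2], v[3]]),
        u32::from_ne_bytes([v[4], v[5], v[6], v[7]]),
    ]
}

/// True if at least one lane is set.
fn any(v: Bool8x16) -> bool {
    let (lo, hi) = halves(v);
    let mut m = [0u8; 8];
    for i in 0..8 {
        m[i] = lo[i] | hi[i]; // assumption: lane-wise OR of the halves
    }
    let w = as_u32x2(m);
    (w[0] | w[1]) != 0
}

/// True only if every lane is set.
fn all(v: Bool8x16) -> bool {
    let (lo, hi) = halves(v);
    let mut m = [0u8; 8];
    for i in 0..8 {
        m[i] = lo[i] & hi[i]; // assumption: lane-wise AND of the halves
    }
    let w = as_u32x2(m);
    // All bits must be set, so compare with 0xFFFFFFFF rather than zero.
    (w[0] & w[1]) == 0xFFFF_FFFF
}
```

Comparing against zero in `all()` would only detect whether any lane is set, which is why the model compares against `0xFFFFFFFF` there.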
Steps to reproduce
git clone https://github.com/hsivonen/encoding_rs
cd encoding_rs
git checkout 3049251cd80bb8eebc7d8c96057480d4e84fffef
RUSTFLAGS=' -C target-feature=+neon' cargo test --features simd-accel
Expected results
Expected tests to pass, since `encoding_rs` contains no 32-bit ARM-specific code and the same code that only uses `simd`-crate facilities and cross-architecture LLVM shuffles works on AArch64.
Actual results
Various tests fail. Since it's unlikely that LLVM is broken and unlikely that the rustc-to-LLVM part is broken just for 32-bit ARM, I suspect that the implementation for `any()` on boolean vectors is broken.