Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vector miscompile from AArch64 backend #125990

Closed
regehr opened this issue Feb 6, 2025 · 1 comment
Closed

vector miscompile from AArch64 backend #125990

regehr opened this issue Feb 6, 2025 · 1 comment
Labels

Comments

@regehr
Copy link
Contributor

regehr commented Feb 6, 2025

please bear with me here, this is the best I could get out of llvm-reduce:

define i16 @f(<8 x i16> %0) {
  %2 = shufflevector <8 x i16> %0, <8 x i16> zeroinitializer, <8 x i32> <i32 4, i32 5, i32 6, i32 7, i32 9, i32 7, i32 5, i32 3>
  %3 = zext <8 x i16> %2 to <8 x i64>
  %new0 = add <8 x i64> %3, %3
  %last = trunc <8 x i64> %new0 to <8 x i16>
  %4 = extractelement <8 x i16> %last, i32 0
  ret i16 %4
}

Alive thinks (and I agree) that when this is called like this:

call i16 @f(<8 x i16> <i16 0, i16 3, i16 3, i16 3, i16 1, i16 3, i16 3, i16 3>)

the result should be 2.

but when I compile to AArch64 using current llc I get:

_f:
	movi.2d	v1, #0x00ffff0000ffff
	uzp1.4s	v0, v0, v0
	and.16b	v0, v0, v1
	add.4s	v0, v0, v0
	xtn.4h	v0, v0
	umov.h	w0, v0[0]
	ret

and this -- for that same input -- computes 0

cc @Hatsunespica @nunoplopes

@llvmbot
Copy link
Member

llvmbot commented Feb 6, 2025

@llvm/issue-subscribers-backend-aarch64

Author: John Regehr (regehr)

please bear with me here, this is the best I could get out of llvm-reduce: ```llvm define i16 @f(<8 x i16> %0) { %2 = shufflevector <8 x i16> %0, <8 x i16> zeroinitializer, <8 x i32> <i32 4, i32 5, i32 6, i32 7, i32 9, i32 7, i32 5, i32 3> %3 = zext <8 x i16> %2 to <8 x i64> %new0 = add <8 x i64> %3, %3 %last = trunc <8 x i64> %new0 to <8 x i16> %4 = extractelement <8 x i16> %last, i32 0 ret i16 %4 } ``` Alive thinks (and I agree) that when this is called like this: ``` call i16 @f(<8 x i16> <i16 0, i16 3, i16 3, i16 3, i16 1, i16 3, i16 3, i16 3>) ``` the result should be 2.

but when I compile to AArch64 using current llc I get:

_f:
	movi.2d	v1, #<!-- -->0x00ffff0000ffff
	uzp1.4s	v0, v0, v0
	and.16b	v0, v0, v1
	add.4s	v0, v0, v0
	xtn.4h	v0, v0
	umov.h	w0, v0[0]
	ret

and this -- for that same input -- computes 0

cc @hatsunespica @nunoplopes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants