Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vector miscompile from AArch64 backend #125989

Closed
regehr opened this issue Feb 6, 2025 · 4 comments · Fixed by #126054
Closed

vector miscompile from AArch64 backend #125989

regehr opened this issue Feb 6, 2025 · 4 comments · Fixed by #126054

Comments

@regehr
Copy link
Contributor

regehr commented Feb 6, 2025

please bear with me here, this is the best I could get out of llvm-reduce:

define i16 @f(<8 x i16> %0) {
  %2 = shufflevector <8 x i16> %0, <8 x i16> zeroinitializer, <8 x i32> <i32 4, i32 5, i32 6, i32 7, i32 9, i32 7, i32 5, i32 3>
  %3 = zext <8 x i16> %2 to <8 x i64>
  %new0 = add <8 x i64> %3, %3
  %last = trunc <8 x i64> %new0 to <8 x i16>
  %4 = extractelement <8 x i16> %last, i32 0
  ret i16 %4
}

Alive thinks (and I agree) that when this is called like this:

call i16 @f(<8 x i16> <i16 0, i16 3, i16 3, i16 3, i16 1, i16 3, i16 3, i16 3>)

the result should be 2.

but when I compile to AArch64 using current llc I get:

_f:
	movi.2d	v1, #0x00ffff0000ffff
	uzp1.4s	v0, v0, v0
	and.16b	v0, v0, v1
	add.4s	v0, v0, v0
	xtn.4h	v0, v0
	umov.h	w0, v0[0]
	ret

and this -- for that same input -- computes 0

cc @Hatsunespica @nunoplopes

@llvmbot
Copy link
Member

llvmbot commented Feb 6, 2025

@llvm/issue-subscribers-backend-aarch64

Author: John Regehr (regehr)

please bear with me here, this is the best I could get out of llvm-reduce: ```llvm define i16 @f(<8 x i16> %0) { %2 = shufflevector <8 x i16> %0, <8 x i16> zeroinitializer, <8 x i32> <i32 4, i32 5, i32 6, i32 7, i32 9, i32 7, i32 5, i32 3> %3 = zext <8 x i16> %2 to <8 x i64> %new0 = add <8 x i64> %3, %3 %last = trunc <8 x i64> %new0 to <8 x i16> %4 = extractelement <8 x i16> %last, i32 0 ret i16 %4 } ``` Alive thinks (and I agree) that when this is called like this: ``` call i16 @f(<8 x i16> <i16 0, i16 3, i16 3, i16 3, i16 1, i16 3, i16 3, i16 3>) ``` the result should be 2.

but when I compile to AArch64 using current llc I get:

_f:
	movi.2d	v1, #<!-- -->0x00ffff0000ffff
	uzp1.4s	v0, v0, v0
	and.16b	v0, v0, v1
	add.4s	v0, v0, v0
	xtn.4h	v0, v0
	umov.h	w0, v0[0]
	ret

and this -- for that same input -- computes 0

cc @hatsunespica @nunoplopes

@regehr regehr added the NEON ARM NEON label Feb 6, 2025
@dtcxzyw dtcxzyw removed the new issue label Feb 6, 2025
@EugeneZelenko EugeneZelenko marked this as a duplicate of #125990 Feb 6, 2025
@davemgreen
Copy link
Collaborator

Thanks for the report. I think performZExtDeinterleaveShuffleCombine might be missing an extra check for undef elements.

davemgreen added a commit to davemgreen/llvm-project that referenced this issue Feb 6, 2025
Given a zext from an extract vector, with a shuffle mask like <4, 0, 0, 4> we
would previously recognize the top half as a deinterleave. In order to convert
into a uzp we should have been checking that the bottom half is also undef.

Fixes llvm#125989
davemgreen added a commit to davemgreen/llvm-project that referenced this issue Feb 6, 2025
Given a zext from an extract vector, with a shuffle mask like <4, 0, 0, 4> we
would previously recognize the top half as a deinterleave. In order to convert
into a uzp we should have been checking that the bottom half is also undef.

Fixes llvm#125989
@davemgreen davemgreen added this to the LLVM 20.X Release milestone Feb 7, 2025
@github-project-automation github-project-automation bot moved this to Needs Triage in LLVM Release Status Feb 7, 2025
@github-project-automation github-project-automation bot moved this from Needs Triage to Done in LLVM Release Status Feb 7, 2025
@github-project-automation github-project-automation bot moved this from Needs Triage to Done in LLVM Release Status Feb 7, 2025
@davemgreen
Copy link
Collaborator

/cherry-pick 2c43479

@llvmbot
Copy link
Member

llvmbot commented Feb 7, 2025

/pull-request #126263

swift-ci pushed a commit to swiftlang/llvm-project that referenced this issue Feb 10, 2025
…ine (llvm#126054)

Given a zext from an extract vector, with a shuffle mask like <4, 0, 0, 4> we
would previously recognize the top half as a deinterleave. In order to convert
into a uzp we should have been checking that the bottom half is also poison.

Fixes llvm#125989

(cherry picked from commit 2c43479)
Icohedron pushed a commit to Icohedron/llvm-project that referenced this issue Feb 11, 2025
…ine (llvm#126054)

Given a zext from an extract vector, with a shuffle mask like <4, 0, 0, 4> we
would previously recognize the top half as a deinterleave. In order to convert
into a uzp we should have been checking that the bottom half is also poison.

Fixes llvm#125989
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging a pull request may close this issue.

5 participants