AArch64: LLVM auto-vectorization of memcpy causes alignment faults with 8-byte aligned addresses #22491
Comments
I fixed this in my own project by overriding memcpy and importing it early in boot.s. Both files work, but I still need to track performance metrics (compiler vs memcpy_overwrite vs memcpy_overwrite_neon).
memcpy_overwrite.s
memcpy_overwrite_neon.s
|
If you'd like, you could test/use the memcpy implementation in #18912. I would think that LLVM shouldn't mess alignment in the implementation up as it does check pointer alignments. If you do try it and encounter any issues let me know in that PR. |
@dweiller I think the next step here is to fix the issues @jacobly0 pointed out in review comments on that PR. Then we could ask @Hotschmoe to confirm the issue is resolved. |
@Hotschmoe Are you able to confirm if this issue has been fixed with a compiler version |
It still fails using zig-windows-x86_64-0.14.0-dev.3008+7cef585f5, though I may be testing this incorrectly. I do not consider myself well versed in this kind of low-level development.
|
Hmm, okay - I looked at the assembly for target aarch64-linux and it didn't look like there should be unaligned |
You can try to experiment with I don't have time to open the Arm64 manual right now to verify, but since you're apparently doing freestanding development, it's entirely thinkable that the processor starts out with strict alignment checks enabled, while the compiler generally defaults to assuming non-strict alignment for code generation purposes. In such cases, it's on the user to enable the |
Okay, it looks like it was my fault. I was "Enabling alignment checking in SCTLR_EL1", but it turns out that other mobile ARMv8 operating systems, such as Android, do not enable alignment checking. With that CPU feature disabled, I have no issues and no alignment faults in my original project. |
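For anyone hitting the same thing in freestanding code: on AArch64, alignment checking for EL1 is controlled by the A bit (bit 1) of SCTLR_EL1. Below is a minimal C sketch of clearing that bit. The helper names and the RUNNING_AT_EL1 guard macro are made up for illustration and are not taken from the original project; the register access itself only works when the code runs at EL1 or higher.

```c
#include <stdint.h>

/* SCTLR_EL1.A is bit 1: when set, unaligned data accesses at EL1/EL0
 * raise an alignment fault. */
#define SCTLR_EL1_A (UINT64_C(1) << 1)

/* Pure helper (hypothetical name): returns the given SCTLR_EL1 value
 * with alignment checking turned off, leaving all other bits untouched. */
static uint64_t sctlr_without_alignment_check(uint64_t sctlr)
{
    return sctlr & ~SCTLR_EL1_A;
}

/* RUNNING_AT_EL1 is an assumed, user-defined macro; define it only in a
 * build that really executes at EL1, otherwise the mrs/msr will trap. */
#if defined(__aarch64__) && defined(RUNNING_AT_EL1)
static void disable_alignment_check(void)
{
    uint64_t sctlr;
    __asm__ volatile("mrs %0, sctlr_el1" : "=r"(sctlr));
    sctlr = sctlr_without_alignment_check(sctlr);
    /* isb so the new setting takes effect before later instructions */
    __asm__ volatile("msr sctlr_el1, %0\n\tisb" : : "r"(sctlr) : "memory");
}
#endif
```

Keeping the bit manipulation in a separate pure function makes it easy to check on a host machine, while the privileged register write stays behind the guard.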
Zig Version
0.13.0
Steps to Reproduce and Observed Behavior
The Bug
While the Zig compiler_rt implementation in compiler_rt/memcpy.zig is a simple byte-by-byte copy, LLVM's auto-vectorization transforms this into SIMD instructions (ldp/stp with Q registers) for copies larger than 32 bytes (0x20). These instructions require 16-byte alignment, but no alignment verification is performed before the transformation.
Technical Details
ldp q0, q1, [x9, #-16]
which requires 16-byte alignment (attempting to load 32 bytes using two 128-bit SIMD registers)
Impact
Reproduction
https://github.com/Hotschmoe/zigv13_memcpy_failure
The included code demonstrates the issue by:
Running the Test
(in repo you can run
zig build run
to see the fault)
Additional Notes
See the repo https://github.com/Hotschmoe/zigv13_memcpy_failure for additional notes about the failure in my own project. Feel free to reach out for access to the files if needed.
Environment
Expected Behavior
The implementation should either:
Actual Behavior
LLVM's auto-vectorization transforms the byte-by-byte implementation into SIMD instructions without alignment checks, causing alignment faults when buffers aren't 16-byte aligned.
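To illustrate the kind of alignment check the report says is missing, here is a small sketch in C (chosen purely for illustration; it is not the compiler_rt code, and the function names are hypothetical). It computes the widest power-of-two access width, up to 16 bytes, that both pointers are aligned to, so a copy routine could fall back to narrower accesses on a system where strict alignment checking is enabled.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero when p is a multiple of align.
 * align must be a power of two. */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Largest power-of-two width (16, 8, 4, 2 or 1 bytes) that BOTH
 * pointers are aligned to. A copy loop could use this to decide
 * whether a 16-byte (Q-register sized) path is safe. */
static size_t widest_common_alignment(const void *dst, const void *src)
{
    /* OR-ing the addresses keeps any low bit set in either pointer. */
    uintptr_t bits = (uintptr_t)dst | (uintptr_t)src;
    size_t width = 16;

    while (width > 1 && (bits & (width - 1)) != 0)
        width /= 2;
    return width;
}
```

For the case in the issue title, an address such as 0x1008 is 8-byte aligned but not 16-byte aligned, so widest_common_alignment reports 8 and a 16-byte path would be skipped.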