aarch64-unknown-none-softfloat: ABI unsoundness when enabling "neon" feature #134375
Comments
I also skimmed the git history for the https://github.com/rust-lang/rust/blob/master/src/doc/rustc/src/platform-support.md but found nothing. I think it could make sense to demote the target if it's unmaintained. @RalfJung should I mention this question in the next compiler meeting? Another question: do you think the thanks EDIT: I feel this comment kind of answers my second question: IIUC the issue is only when the |
<#131058> was about various issues with the neon target feature; except for this one, they should all be resolved.
|
Cc @JamieCunliffe (currently, arm64 in the kernel uses |
Should we be using |
If you never need Probably, someone will have to add a |
@Darksonn @ojeda does Rust4Linux need to locally enable Right now the only way I can see to fix this issue is to just mark the |
I don't think the existing C users will be rewritten anytime soon, but who knows what out-of-tree may be doing. Let's see what Jamie says, and I have pinged the arm64 maintainers (Cc @ctmarinas). Alice et al. (Cc @maurer @samitolvanen) can confirm for Android. Also Cc @asahilina for Asahi and @Fabo in case they need it. If "being careful" (differently-compiled TUs) would still be doable like in C, then I guess it should be fine. |
I only use floats in const context (to transmute f32 to u32) and don't enable neon or anything like that. I didn't even realize Rust had an aarch64 softfloat target, I ended up implementing my own softfloat in pure integer Rust instead where I needed float math.
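As a rough illustration of the const-context use described above, a float can be turned into its bit pattern entirely at compile time. The names `ONE_BITS` and `one_bits` are made up for this sketch and are not taken from any real driver code.

```rust
/// Bit pattern of 1.0f32, evaluated by the compiler in a const context.
/// `ONE_BITS` is a hypothetical name used only for this sketch.
pub const ONE_BITS: u32 = unsafe { core::mem::transmute::<f32, u32>(1.0f32) };

/// Returns the precomputed constant; the function body itself only
/// handles an integer.
pub fn one_bits() -> u32 {
    ONE_BITS
}
```

Because the conversion happens during constant evaluation, code like this does not depend on which float ABI the target uses.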
|
Neither Android nor Panthor currently needs floats in kernel Rust code. |
#135160 proposes to make |
It is inevitable that some (perhaps third party / out of tree) Rust code will need to use NEON within the Linux kernel, in other kernels, and in similar situations. It is just a matter of time. In the same way that people are replacing C code with Rust code, people are also trying to replace out-of-line assembly with smaller bits of inline assembly and/or with auto-vectorized code, using patterns like this:
For people writing code like the above, we'd need to change our So, I don't think that disabling |
Did you intend the double negation? This sentence says you do think that it is a long-term solution... To be clear, I don't think this is a long-term solution. But it also makes no sense to pretend to support something which our primary backend does not support. Someone has to put in the work on the LLVM side before we can provide this feature (using the FPU on an otherwise softfloat aarch64 target) to our users. |
Obviously not. I edited it.
We agree on that. |
The example is, in part:
Is there an example that doesn't use floating point types within a If not, it would be better to "just" add a check (maybe even just a lint) that rejects any use of floating point types within |
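An integer-only function of the kind asked about here might look like the sketch below. It is not the example from the thread: the function name `sum_squares` is hypothetical, and the `cfg_attr` guard is only there so the snippet also builds on non-aarch64 hosts. Whether a given softfloat target accepts `enable = "neon"` at all is exactly what this issue is debating.

```rust
/// Sums the squares of `xs` using wrapping integer arithmetic.
///
/// On aarch64 the function is compiled with the "neon" feature enabled,
/// so the backend is allowed to auto-vectorize the loop. No floating
/// point types appear in the signature or the body.
///
/// # Safety
/// On aarch64 the caller must ensure the CPU actually supports NEON.
#[cfg_attr(target_arch = "aarch64", target_feature(enable = "neon"))]
pub unsafe fn sum_squares(xs: &[u32]) -> u32 {
    let mut acc = 0u32;
    for &x in xs {
        acc = acc.wrapping_add(x.wrapping_mul(x));
    }
    acc
}
```

A caller would write `unsafe { sum_squares(&data) }` after establishing that NEON is available.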
I am not confident enough in our ability to judge when LLVM will insert function calls whose ABI might be affected, to be comfortable with that approach. |
I think we could improve our comfort there. It seems like the issue with intrinsics is that either LLVM is making assumptions about intrinsics that aren't valid, and/or Rust is using intrinsics in a way that isn't compatible with how they are designed in LLVM. It seems like rustc is already able to handle cross-ABI function calls--either it has machinery that lets an enable-NEON function call a non-enable-NEON function, or it has machinery to reject those calls--but that machinery isn't being invoked for calls to LLVM intrinsics. Or maybe the machinery is invoked when the call to the intrinsic is written in actual Rust code but not when the call is generated by rustc? If that is the case, then that seems like something to be improved, and naively it seems like it would be practical to make that improvement. |
In particular, in the actual (C) runtime library, there is object code for each called function. That means there is a function prototype for that function within LLVM/libc that describes its interface. But that declaration apparently doesn't have an adequate description of the interface, and/or internally LLVM doesn't have adequate checking of the ABI compatibility when inserting the calls. Also, there are very few (non-inlined) functions in the runtime library (object code) that take SIMD and/or floating point types as arguments and/or return them. And very few places that insert calls to those few functions. So I do think it is quite a tractable problem to solve even through brute force. |
This seems to be exactly what clang does for C code when using
It accepts |
The issue is that LLVM does not have a concept of different ABIs for aarch64. It does some ad hoc fallback when certain target features are missing but there is nothing systematic. This is in contrast to arm-32 and riscv where LLVM has explicit ABI knobs that let us request softfloat abi, and that work reliably no matter the target features. x86 at least has a target feature that lets us force the soft-float ABI no matter what else is available; that is far from great but it's something. Aarch64 has nothing. This cannot be fixed without some LLVM work.
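The difference between an explicit ABI knob and an ABI inferred from target features can be pictured with a toy model. This is not LLVM code; every type and function name below is hypothetical and exists only to illustrate the distinction.

```rust
/// Which float calling convention was requested (hypothetical model).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FloatAbi {
    Soft,
    Hard,
}

/// Register class used to pass a float argument (hypothetical model).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RegClass {
    Integer,
    FloatSimd,
}

/// Explicit knob: the requested ABI decides, and the available target
/// features are deliberately ignored.
pub fn float_arg_class_explicit(abi: FloatAbi, _has_fp_regs: bool) -> RegClass {
    match abi {
        FloatAbi::Soft => RegClass::Integer,
        FloatAbi::Hard => RegClass::FloatSimd,
    }
}

/// Inferred: the presence of FP registers alone decides, so enabling a
/// feature silently changes the calling convention.
pub fn float_arg_class_inferred(has_fp_regs: bool) -> RegClass {
    if has_fp_regs {
        RegClass::FloatSimd
    } else {
        RegClass::Integer
    }
}
```

In the first model, turning on FP registers leaves a softfloat call unchanged; in the second, the same change moves float arguments to a different register class, which is the kind of mismatch described in this issue.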
Aside from this being fragile re: incomplete knowledge about the ways in which LLVM likes to screw us over, Rust also has historically rejected the idea of making floats or float-ops only available in a target-dependent way. |
But why should LLVM ever add support for it? Basically, we're asking LLVM to add support for calling these intrinsic-backing functions for floating point operations that nobody actually wants to use, so basically add a bunch of dead code to LLVM, that will be used primarily by the Rust toolchain test suite and nothing else. That doesn't make sense. In other words, I don't think there's anything wrong with LLVM's approach, so I don't see why they would change it. Is there actually any useful code that uses floating point in an aarch64-unknown-none target, at all? Is there any existing floating point code in the Linux kernel at all that would hint at the need for floating point support in Rust for the Linux kernel or any OS kernel? AFAICT, no. Conversely, there's tons of NEON code in the Linux kernel and other ARM/ARM64 kernels, currently written in out-of-line assembly, unfortunately.
Sure, but that's a rustc problem, not an LLVM problem. And even if LLVM fixed it, it wouldn't be fixed for non-LLVM backends. Instead of changing every backend, I think it makes more sense to change the interface between rustc and the backend so that rustc can divert its float support to a library for these targets. Even if this hurts the optimization of the floating point operations when the diversion is done, it will be OK, because ~nobody is actually going to be using it. |
To clarify, there is a lot of explicit NEON-using code written in C in the Linux kernel, as well. |
If LLVM doesn't intend to support aarch64-softfloat, we should remove the target entirely and instead add aarch64-nofloat -- but that will require an RFC that explains how we handle targets without float support. In the end what happened here is that the aarch64-softfloat target was moved to tier 2 prematurely, without sufficient consideration for what we can actually support soundly given the current state of our backend, and now we have to clean up that mess somehow. Such an accident should not replace the proper process we usually use for changing Rust. That said, I think support on the LLVM side is not very hard and does not require a lot of code. All it takes is for the logic that determines which registers to use for a function call to first check |
FWIW, I started looking into this and doing the obvious thing mostly works, but I ran into issues with legalization of f128 libcalls. The problem is basically that for AArch64 f128 is a legal type, but i128 is not, so libcall legalization would have to directly go from a pair of i64 to f128 via stack load+store I think. Dealing with that kind of issue will be very annoying :( |
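To make the shape of the problem concrete: under a softfloat convention a 128-bit float has to travel as two 64-bit integers. The sketch below models only the bit-level split and rejoin, using `u128` as a stand-in for the raw bits of an f128 value. It does not model LLVM's legalization, the function names are hypothetical, and putting the low half first is an assumption of this sketch.

```rust
/// Splits a 128-bit pattern into (low, high) 64-bit halves.
/// The low-half-first ordering is an assumption of this sketch.
pub fn split_u128(bits: u128) -> (u64, u64) {
    (bits as u64, (bits >> 64) as u64)
}

/// Rebuilds the 128-bit pattern from its (low, high) halves.
pub fn join_u128(lo: u64, hi: u64) -> u128 {
    (lo as u128) | ((hi as u128) << 64)
}
```

In a language with a native 128-bit integer this is trivial; the comment above describes the harder situation where the backend has no legal 128-bit integer type to route the value through.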
What does it currently do when you try to pass an f128 with |
The |
So even |
What makes a type legal - is it ABI specification, or the ability to be passed as a single register? |
Enabling the "neon" target feature on the aarch64-unknown-none-softfloat target is taken by LLVM as a sign that we want to use the hardfloat ABI. That is unfortunate as it makes it UB to link such code against code built for aarch64-unknown-none-softfloat without the "neon" target feature. (Note that it's not "neon" which is problematic but "fp-armv8"; however, the two are tied together by rustc.)
For Rust-generated functions we work around this by forcing our own ABI, passing floats either indirectly or via integer registers (#133102). However, this does not help for LLVM-generated calls for builtins/intrinsics, as shown in this example by @beetrees.
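The idea of passing floats via integer registers can be mimicked by hand. The sketch below is only an analogy for what the compiler does internally for Rust-generated functions, not how rustc implements it; `add_as_bits` and `add` are hypothetical names.

```rust
/// Softfloat-style boundary: both arguments and the result cross the
/// call as raw 32-bit integers rather than as float values.
pub fn add_as_bits(a_bits: u32, b_bits: u32) -> u32 {
    let sum = f32::from_bits(a_bits) + f32::from_bits(b_bits);
    sum.to_bits()
}

/// Caller-side wrapper that converts to and from the integer encoding.
pub fn add(a: f32, b: f32) -> f32 {
    f32::from_bits(add_as_bits(a.to_bits(), b.to_bits()))
}
```

A hand-written boundary like this covers only calls that appear in the source. Calls that the backend inserts on its own, such as libcalls for builtins, never pass through it, which is why the workaround does not help there.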
We don't have target maintainers listed for this target, so maybe that means we can just demote it to tier 3? (See #113739)
LLVM issue: llvm/llvm-project#110632. So far LLVM maintainers seem to not agree that there is a problem here.
@Amanieu unfortunately your proposal for making floats work on that target doesn't quite suffice. :/ We do need some help from LLVM.
@nikic do you have any good ideas for what we could do here? It seems like rejecting enabling "neon" on aarch64-unknown-none-softfloat is the only sound option we have right now, but I worry that may make the Rust-for-Linux folks (among others) unhappy.