Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aarch64-unknown-none-softfloat: ABI unsoundness when enabling "neon" feature #134375

Open
RalfJung opened this issue Dec 16, 2024 · 28 comments · May be fixed by #135160
Open

aarch64-unknown-none-softfloat: ABI unsoundness when enabling "neon" feature #134375

RalfJung opened this issue Dec 16, 2024 · 28 comments · May be fixed by #135160
Labels
A-ABI Area: Concerning the application binary interface (ABI) A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-target-feature Area: Enabling/disabling target features like AVX, Neon, etc. I-prioritize Issue: Indicates that prioritization has been requested for this issue. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-AArch64 Armv8-A or later processors in AArch64 mode T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@RalfJung
Copy link
Member

RalfJung commented Dec 16, 2024

Enabling the "neon" target feature on the aarch64-unknown-none-softfloat target is taken by LLVM as a sign that we want to use the hardfloat ABI. That is unfortunate as it makes it UB to link such code against code built for aarch64-unknown-none-softfloat without the "neon" target feature. (Note that it's not "neon" which is problematic but "fp-armv8"; however, the two are tied together by rustc.)

For Rust-generated functions we work around this by forcing our own ABI, passing floats either indirectly or via integer registers (#133102). However, this does not help for LLVM-generated calls for builtins/intrinsics, as shown in this example by @beetrees.

We don't have target maintainers listed for this target, so maybe that means we can just demote it to tier 3? (See #113739)

LLVM issue: llvm/llvm-project#110632. So far LLVM maintainers seem to not agree that there is a problem here.

@Amanieu unfortunately your proposal for making floats work on that target doesn't quite suffice. :/ We do need some help from LLVM.
@nikic do you have any good ideas for what we could do here? It seems like rejecting enabling "neon" on aarch64-unknown-none-softfloat is the only sound option we have right now, but I worry that may make the Rust-for-Linux folks (among others) unhappy.

@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Dec 16, 2024
@RalfJung RalfJung added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-AArch64 Armv8-A or later processors in AArch64 mode labels Dec 16, 2024
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Dec 16, 2024
@jieyouxu jieyouxu added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-ABI Area: Concerning the application binary interface (ABI) A-target-feature Area: Enabling/disabling target features like AVX, Neon, etc. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Dec 16, 2024
@apiraino
Copy link
Contributor

apiraino commented Dec 16, 2024

We don't have target maintainers listed for this target, so maybe that means we can just demote it to tier 3?

I also skimmed the git history for the https://github.com/rust-lang/rust/blob/master/src/doc/rustc/src/platform-support.md but found nothing. I think it could make sense to demote the target if it's unmaintained. @RalfJung should I mention this question in the next compiler meeting?

Another question: do you think the neon target feature could create undefined behaviours also in other targets? (I don't know how they work)

thanks

EDIT: I feel this comment kind of answers my second question: IIUC the issue is only when the neon target feature is used on compile target with no FPU (softfloats), which by looking at this list it's basically just the one mentioned here.

@RalfJung
Copy link
Member Author

RalfJung commented Dec 16, 2024 via email

@RalfJung
Copy link
Member Author

With #134794 we could fairly easily reject enabling neon specifically on this target. That would fix the ABI issues.

I just wonder, is anyone relying on enabling neon for aarch64-unknown-none-softfloat? Not quite sure whom to ping... @Darksonn for R4L, who else might be using this target?

@ojeda
Copy link
Contributor

ojeda commented Dec 27, 2024

Cc @JamieCunliffe (currently, arm64 in the kernel uses --target=aarch64-unknown-none -Ctarget-feature="-neon").

@Darksonn
Copy link
Contributor

Should we be using --target=aarch64-unknown-none-softfloat instead?

@RalfJung
Copy link
Member Author

If you never need neon, that would be slightly better... but there's still no sound way to enable neon (or fp-armv8) for some of the code. The limiting factor here is LLVM (llvm/llvm-project#110632). The C ecosystem assumes that there is a hard-error when floats are used somewhere that neon is not available, so there's no proper support for a softloat ABI. However Rust assumes float types always exist and there's some reasonable ABI for them.

Probably, someone will have to add a soft-float target feature to LLVM that forces the use of integer registers for passing floats even when fp-armv8 is available.

@RalfJung
Copy link
Member Author

@Darksonn @ojeda does Rust4Linux need to locally enable neon for some parts of the build? Or is that all going to stay on the C side for the foreseeable future?

Right now the only way I can see to fix this issue is to just mark the neon target feature as incompatible with aarch64-unknown-none-softfloat. Some work on the LLVM side is needed to fix that in a way that is compatible with Rust (where we want to always provide the float types and just use softfloats when needed). That said, the C approach of just using different compiler flags for different parts of the code and "being careful" when linking (making sure there are no float types crossing that ABI boundary) would also work in Rust.

@ojeda
Copy link
Contributor

ojeda commented Dec 31, 2024

I don't think the existing C users will be rewritten anytime soon, but who knows what out-of-tree may be doing. Let's see what Jamie says, and I have pinged the arm64 maintainers (Cc @ctmarinas). Alice et al. (Cc @maurer @samitolvanen) can confirm for Android. Also Cc @asahilina for Asahi and @Fabo in case they need it.

If "being careful" (differently-compiled TUs) would still be doable like in C, then I guess it should be fine.

@asahilina
Copy link

asahilina commented Dec 31, 2024 via email

@Darksonn
Copy link
Contributor

Darksonn commented Jan 6, 2025

Android or Panthor does not currently need floats in kernel Rust code.

@RalfJung
Copy link
Member Author

RalfJung commented Jan 6, 2025

#135160 proposes to make #[target_feature(enable = "neon")] a hard error on aarch64-unknown-none-softfloat; please chime in if that would be a problem for anyone.

@briansmith
Copy link
Contributor

briansmith commented Jan 8, 2025

I don't think the existing C users will be rewritten anytime soon, but who knows what out-of-tree may be doing.

It is inevitable that some (perhaps third party / out of tree) Rust code will need to use NEON within the Linux kernel, in other kernels, and in similar situations. It is just a matter of time.

In the same way that people are replacing C code with Rust code, people are also trying to replace out-of-line assembly with smaller bits of inline assembly and/or with auto-vectorized code, using patterns like this:

fn foo(safe_to_use_vectorized: bool) {
    // TODO(MSRV): "neon" isn't supported by Rust for target_arch = "arm" yet.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64", target_arch = "aarch64"))]
    if safe_to_use_vectorized {
        #[cfg_attr(any(target_arch = "aarch64"), target_feature(enable="neon"))]
        #[cfg_attr(any(target_arch = "x86", target_arch = "x86_64"), target_feature(enable="sse2"))]
         unsafe fn foo_vectorized() { foo_inner() } // auto-vectorized foo_inner    
     
        return unsafe { foo_vectorized() };
    }
    
    let _ = safe_to_use_vectorized;
    foo_inner()
}

#[inline(always)] // So it can be auto-vectorized within `foo_vectorized()`
fn foo_inner() {
    // auto-vectorizable code goes here.
}

For people writing code like the above, we'd need to change our target_arch = "aarch64" conditions to something so that it can build for aarch64-unknown-none-softfloat. I.e. What is the cfg() expression for 'an aarch64 target that supports target_feature(enable="neon")'?

So, I don't think that disabling target_feature(enable="neon") for these targets [EDIT: typo] can be a long-terms solution to the ABI issues.

@RalfJung
Copy link
Member Author

RalfJung commented Jan 8, 2025

What is the cfg() expression for 'an aarch64 target that supports target_feature(enable="neon")'?

all(target_arch="aarch64", not(target_abi="softfloat"))

So, I don't think that disabling target_feature(enable="neon") for these targets isn't a long-terms solution to the ABI issues.

Did you intend the double negation? This sentence says you do think that it is a long-term solution...

To be clear, I don't think this is a long-term solution. But it also makes no sense to pretend to support something which our primary backend does not support. Someone has to put in the work on the LLVM side before we can provide this feature (using the FPU on an otherwise softfloat aarch64 target) to our users.

@briansmith
Copy link
Contributor

Did you intend the double negation? This sentence says you do think that it is a long-term solution...

Obviously not. I edited it.

To be clear, I don't think this is a long-term solution. But it also makes no sense to pretend to support something

We agree on that.

@briansmith
Copy link
Contributor

briansmith commented Jan 8, 2025

However, this does not help for LLVM-generated calls for builtins/intrinsics, as shown in this example by @beetrees.

The example is, in part:

#[inline(never)]
#[target_feature(enable = "neon")]
pub unsafe fn yes_neon(x: u128) -> f64 {
    x as f64
}

Is there an example that doesn't use floating point types within a #[target_feature(enable = "neon")] function?

If not, it would be better to "just" add a check (maybe even just a lint) that rejects any use of floating point types within #[target_feature(enable = "neon")] functions, when the target doesn't enable NEON by default, instead of rejecting #[target_feature(enable = "neon")] completely. This would then allow everything to work, with probably no disruption.

@RalfJung
Copy link
Member Author

RalfJung commented Jan 8, 2025

I am not confident enough in our ability to judge when LLVM will insert function calls whose ABI might be affected, to be comfortable with that approach.

@briansmith
Copy link
Contributor

briansmith commented Jan 8, 2025

I am not confident enough in our ability to judge when LLVM will insert function calls whose ABI might be affected, to be comfortable with that approach.

I think we could improve our comfort there.

It seems like the issue with intrinsics is that either LLVM is making assumptions about intrinsics that aren't valid, and/or Rust is using intrinsics in a way that isn't compatible with how they are designed in LLVM.

It seems like rustc is already able to handle cross-ABI function calls--either it has machinery for an enable-NEON function can call a non-enable-NEON function, or it has machinery to reject those calls--but that machinery isn't being invoked for calls to LLVM intrinsics. Or maybe the machinery is being invoked if the call to the intrinsics is done in actual Rust code but not if the call to the intrinsic is generated by rustc? If that is the case, then that seems like something to be improved, and naively it seems like it would be practical to make that improvement.

@briansmith
Copy link
Contributor

In particular, in the actual (C) runtime library, there is object code for each called function. That means there is a function prototype for that function within LLVM/libc that describes its interface. But that declaration apparently doesn't have an adequate description of the interface, and/or internally LLVM doesn't have adequate checking of the ABI compatibility when inserting the calls.

Also, there are very few (non-inlined) functions in the runtime library (object code) that take SIMD and/or floating point types as arguments and/or return them. And very few places that insert calls to those few functions. So I do think it is quite a tractable problem to solve even through brute force.

@briansmith
Copy link
Contributor

briansmith commented Jan 8, 2025

If not, it would be better to "just" add a check (maybe even just a lint) that rejects any use of floating point types within #[target_feature(enable = "neon")] functions, when the target doesn't enable NEON by default, instead of rejecting #[target_feature(enable = "neon")] completely. This would then allow everything to work, with probably no disruption.

This seems to be exactly what clang does for C code when using -march=armv8+nofp and/or -march=armv8+nofp+nosimd ; see Godbolt.

__attribute__((noinline)) 
__attribute__((target("neon")))
__uint128_t neon_no_float(__uint128_t x) {
    return x;
}

__attribute__((noinline)) 
float no_neon_float(__uint128_t x) {
    return (float)x;
}

__attribute__((noinline)) 
__attribute__((target("neon")))
float neon_float(__uint128_t x) {
    return (float)x;
}

It accepts neon_no_float even though it enables NEON, because it doesn't use floating point types. It rejects no_neon_float even though it doesn't use NEON, because it uses float. It rejects neon_float because it uses float.

@RalfJung
Copy link
Member Author

RalfJung commented Jan 8, 2025

The issue is that LLVM does not have a concept of different ABIs for aarch64. It does some ad hoc fallback when certain target features are missing but there is nothing systematic.

This is in contrast to arm-32 and riscv where LLVM has explicit ABI knobs that let us request softfloat abi, and that work reliably no matter the target features. x86 at least has a target feature that lets us force the soft-float ABI no matter what else is available; that is far from great but it's something. Aarch64 has nothing.

This cannot be fixed without some LLVM work.

If not, it would be better to "just" add a check (maybe even just a lint) that rejects any use of floating point types within #[target_feature(enable = "neon")] functions, when the target doesn't enable NEON by default, instead of rejecting #[target_feature(enable = "neon")] completely. This would then allow everything to work, with probably no disruption.

Aside from this being fragile re: incomplete knowledge about the ways in which LLVM likes to screw us over, Rust also has historically rejected the idea of making floats or float-ops only available in a target-dependent way.

@briansmith
Copy link
Contributor

Aarch64 has nothing. This cannot be fixed without some LLVM work.

But why should LLVM ever add support for it? Basically, we're asking LLVM to add support for calling these intrinsic-backing functions for floating point operations that nobody actually wants to use, so basically add a bunch of dead code to LLVM, that will used primarily by the Rust toolchain test suite and nothing else. That doesn't make sense. In other words, I don't think there's anything wrong with LLVM's approach, so I don't see why they would change it.

Is there actually any useful code that uses floating point in an aarch64-unknown-none target, at all? Is there any existing floating point code in the Linux kernel, at all that would hint at the need for floating point support in Rust for the Linux kernel or any OS kernel? AFAICT, no.

Conversely, there's tons of NEON code in the Linux kernel and other ARM/ARM64 kernels, currently written in out-of-line assembly, unfortunately.

Rust also has historically rejected the idea of making floats or float-ops only available in a target-dependent way.

Sure, but that's a rustc problem, not an LLVM problem. And even if LLVM fixed it, it wouldn't be fixed for non-LLVM backends. Instead of changing every backend, I think it makes more sense to change the interface between rustc and the backend so that rustc can divert its float support to a library for these targets. Even if this hurts the optimization of the floating point operations when the diversion is done, it will be OK, because ~nobody is actually going to be using it.

@briansmith
Copy link
Contributor

Conversely, there's tons of NEON code in the Linux kernel and other ARM/ARM64 kernels, currently written in out-of-line assembly, unfortunately.

To clarify, there is a lot of explicit NEON-using code written in C in the Linux kernel, as well.

@RalfJung
Copy link
Member Author

RalfJung commented Jan 9, 2025

If LLVM doesn't intend to support aarch64-softfloat, we should remove the target entirely and instead add aarch64-nofloat -- but that will require an RFC that explains how we handle targets without float support. In the end what happened here is that the aarch64-softfloat target was moved to tier 2 prematurely, without sufficient consideration for what we can actually support soundly given the current state of our backend, and now we have to clean up that mess somehow. Such an accident should not replace the proper process we usually use for changing Rust.

That said, I think support on the LLVM side is not very hard and does not require a lot of code. All it takes is for the logic that determines which registers to use for a function call to first check FloatABIType before checking target features, and to force the use of integer registers if FloatABIType is set to Soft. The ARM-32 backend already does that. It should be a fairly simple change, for someone sufficiently knowledgeable in the LLVM codebase. There have been recent discussions on Zulip where @nikic indicated it may be a good idea to adjust all architecture backends to work that way, so we have some support for this plan on the LLVM side as well.

@nikic
Copy link
Contributor

nikic commented Jan 9, 2025

FWIW, I started looking into this and doing the obvious thing mostly works, but I ran into issues with legalization of f128 libcalls. The problem is basically that for AArch64 f128 is a legal type, but i128 is not, so libcall legalization would have to directly go from a pair of i64 to f128 via stack load+store I think. Dealing with that kind of issue will be very annoying :(

@RalfJung
Copy link
Member Author

RalfJung commented Jan 9, 2025

What does it currently do when you try to pass an f128 with -neon,-fp-armv8?

@nikic
Copy link
Contributor

nikic commented Jan 9, 2025

The -fp-armv8 case is easy because f128 is not a legal type, so everything can stay in integers.

@RalfJung
Copy link
Member Author

RalfJung commented Jan 9, 2025

So even fadd or fptoui just does not work on f128 with -fp-armv8? It doesn't call some softfloat routines?

@tgross35
Copy link
Contributor

tgross35 commented Jan 9, 2025

for AArch64 f128 is a legal type, but i128 is not

What makes a type legal - is it ABI specification, or the ability to be passed as a single register?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ABI Area: Concerning the application binary interface (ABI) A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-target-feature Area: Enabling/disabling target features like AVX, Neon, etc. I-prioritize Issue: Indicates that prioritization has been requested for this issue. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-AArch64 Armv8-A or later processors in AArch64 mode T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants