Fix: Windows does not define __ARM_NEON #363
However, Windows requires NEON to work, so assume it is there if built with MSVC.
include/cpu_features_macros.h
@@ -235,7 +235,7 @@
 #endif  // defined(CPU_FEATURES_ARCH_X86)

 #if defined(CPU_FEATURES_ARCH_ANY_ARM)
-#if defined(__ARM_NEON)
+#if defined(__ARM_NEON) || defined(_MSC_VER)
A few notes about Arm versions and Windows support:
Armv8-A and higher
NEON/AdvSIMD is MANDATORY for Armv8-A and higher:
FP/SIMD must be implemented on all Armv8.0 implementations, but implementations targeting specialized markets may support the following combinations:
- No NEON or floating-point.
- Full floating-point and SIMD support with exception trapping.
- Full floating-point and SIMD support without exception trapping.
Armv7
According to ARM Architecture Reference Manual ARMv7-A and ARMv7-R A1.4.1 Instruction set architecture extensions:
Advanced SIMDv1:
- It is an OPTIONAL extension to the ARMv7-A and ARMv7-R profiles
Hence, an Armv7 CPU may not support NEON instructions.
The likely target platform for 32-bit Windows on Arm is UWP (Universal Windows Platform). However, Microsoft announced that Arm32 UWP is deprecated and will be removed:
Windows devices running on an Arm processor (for example, Snapdragon processors from Qualcomm) will no longer support AArch32 (Arm32). This change impacts Universal Windows Platform apps that presently target AArch32 (Arm32). Support for 32-bit Arm versions of applications will be removed in a future release of Windows 11. System binaries for ARM32 support (present in the sysarm32 folder) will also be removed. After this change, for the small number of applications affected, app features might be different and you might notice a difference in performance. Therefore, we recommend updating your targeted platforms to AArch64 (Arm64), which is supported on all Windows on Arm devices, as soon as possible in order to ensure your customers can continue to enjoy the best possible experience. Follow the guidance on this page to update your applications to AArch64 (Arm64).
ref:
https://www.microsoft.com/en-us/windows/windows-11-specifications?r=1#table3
https://learn.microsoft.com/en-us/windows/arm/arm32-to-arm64
Therefore, it is unlikely that an AArch32-state application will be run on Windows on Armv8+.
Also, please note that we do not support Windows on 32-bit ARM, only ARM64.
So, I think it is safe to add this patch.
@gchatelet, @Mizux, any objections?
Hello! Thanks for the positive review. It is a good point that NEON is technically not required on armv7, from an architecture perspective.
From my research, though, Windows on ARM does seem to require NEON to be present on armv7 despite the architecture's requirements.
- The documentation at https://learn.microsoft.com/en-us/cpp/build/overview-of-arm-abi-conventions?view=msvc-170 is a bit vague, but seems to indicate a requirement
- More directly, Raymond Chen's article at https://devblogs.microsoft.com/oldnewthing/20210531-00/?p=105265 strongly indicates that NEON is always required with Windows for ARM32.
- For ARM64, the wording at https://learn.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=msvc-170 also strongly indicates that NEON must be present.
Just my two cents on why I was comfortable with this code change. I welcome other thoughts!
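For anyone who wants to double-check the "NEON is always present" assumption on an actual device, here is a minimal sketch of a run-time query. It is not part of this patch. It assumes the Windows SDK in use defines `PF_ARM_NEON_INSTRUCTIONS_AVAILABLE` for `IsProcessorFeaturePresent`; the helper name `demo_runtime_neon_status` is made up for illustration.

```c
#if defined(_WIN32)
#include <windows.h>
#endif

/* Hypothetical helper (not a cpu_features API).
 * Returns 1 if Windows reports NEON, 0 if it reports no NEON,
 * and -1 if the question cannot be asked on this platform or SDK. */
static int demo_runtime_neon_status(void) {
#if defined(_WIN32) && defined(PF_ARM_NEON_INSTRUCTIONS_AVAILABLE)
  /* Assumption: the SDK headers expose this feature constant. */
  return IsProcessorFeaturePresent(PF_ARM_NEON_INSTRUCTIONS_AVAILABLE) ? 1 : 0;
#else
  return -1;
#endif
}
```

On a Windows on ARM machine this would be expected to return 1 if the research above holds; on other platforms it returns -1 and says nothing either way.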
LGTM for the rationale of the fix.
Can we rewrite it like the following? I think it would be clearer.
// Note: MSVC targeting ARM does not define `__ARM_NEON` but Windows on ARM requires it.
// In that case we force NEON detection.
#if defined(__ARM_NEON) || (defined(CPU_FEATURES_COMPILER_MSC) && defined(CPU_FEATURES_ARCH_ANY_ARM))
#define CPU_FEATURES_COMPILED_ANY_ARM_NEON 1
#else
#define CPU_FEATURES_COMPILED_ANY_ARM_NEON 0
#endif
I don't see any problem with this change; I changed it
@JVital2013 do you mind
@gchatelet I fixed the formatting, sorry about that!
This is a fix for #363. The preprocessor check must be done in two steps.
On MSVC for ARM/ARM64, __ARM_NEON is not defined. However, Windows requires NEON on ARM, so it should be safe to assume that if CPU_FEATURES_ARCH_ANY_ARM is defined and we're in MSVC, NEON is also available.
There may be a better way to do this, but this patch has helped get my project running at full speed on Windows ARM64 machines, and I wanted to share it. Thanks for this great library!
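The logic described above can be tried outside the library with a small standalone sketch. The `DEMO_*` macros and `demo_compiled_with_neon` are hypothetical stand-ins, not the library's own definitions; the sketch assumes the usual predefined macros (`_MSC_VER` for MSVC, `_M_ARM`/`_M_ARM64` for MSVC ARM targets, `__arm__`/`__aarch64__` for GCC and Clang).

```c
/* Hypothetical stand-ins for CPU_FEATURES_COMPILER_MSC and
 * CPU_FEATURES_ARCH_ANY_ARM, defined here so the sketch is standalone. */
#if defined(_MSC_VER)
#define DEMO_COMPILER_MSC 1
#endif

#if defined(__arm__) || defined(_M_ARM) || defined(__aarch64__) || \
    defined(_M_ARM64)
#define DEMO_ARCH_ANY_ARM 1
#endif

/* MSVC targeting ARM does not define __ARM_NEON, so treat
 * "MSVC + any ARM target" as NEON being available at compile time. */
#if defined(__ARM_NEON) || \
    (defined(DEMO_COMPILER_MSC) && defined(DEMO_ARCH_ANY_ARM))
#define DEMO_COMPILED_ANY_ARM_NEON 1
#else
#define DEMO_COMPILED_ANY_ARM_NEON 0
#endif

/* Exposes the compile-time decision as a value that can be printed or tested. */
static int demo_compiled_with_neon(void) { return DEMO_COMPILED_ANY_ARM_NEON; }
```

Requiring the ARM check alongside the MSVC check keeps an x86 MSVC build from being reported as having NEON, which a bare `defined(_MSC_VER)` test would only avoid when nested inside an ARM-only block.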