BUG: Test failures with 1.26.0rc1 in conda-forge #24660
Comments
Full error log linux-aarch
Full error log pypy on windows
|
Which openblas version are you using? The cholesky error is most likely due to blas. The only changes in |
I put the details in the issue text (under
Meson on all platforms.
Yeah, 0.3.24 came out not too long ago, I can retest with 0.3.23. |
I think the previous 1.26.0b1 may have been using lapack-lite for numpy.linalg by mistake instead of OpenBLAS |
Can confirm that aarch passes with openblas 0.3.23. @martin-frbg, this looks like a regression in 0.3.24. Do you want me to open an issue for OpenBLAS or perhaps you have a possible culprit in mind already based on the error in the cholesky test? |
Which flavor of aarch64 is this ? The only very recent change that made it into 0.3.24 was the ROTG rewrite to use the new "safe" scaling algorithm from the Reference BLAS, most everything else was specific to NeoverseV1. |
Since this is running in emulation through QEMU, I'm guessing it's the "most vanilla possible"? Numpy has some CPU feature detection on aarch as well, but doesn't find any CPU features (printed before the test suite run).
|
That would be the ARMV8 target, no changes there and nothing on the obvious Cholesky codepath (potrf &friends) |
I wonder what the difference is between this aarch64 test run and the one in the github action on the numpy/numpy repo |
What pypy version? |
According to the build log:
Test features "NEON" : Unsupported due to Implied feature "NEON_FP16" is not supported
Test features "NEON_FP16" : Unsupported due to Implied feature "NEON_VFPV4" is not supported
Test features "NEON_VFPV4" : Unsupported due to Compiler fails against the test code of "NEON_VFPV4"
Test features "ASIMD" : Unsupported due to Implied feature "NEON" is not supported
Test features "ASIMDHP" : Unsupported due to Implied feature "NEON" is not supported
Test features "ASIMDFHM" : Unsupported due to Implied feature "NEON" is not supported
Configuring npy_cpu_dispatch_config.h using configuration
Message:
CPU Optimization Options
baseline:
Requested : min
Enabled :
dispatch:
Requested : max -xop -fma4
Enabled :
The compile-time test for NEON_VFPV4 located at https://github.com/numpy/numpy/blob/main/numpy/distutils/checks/cpu_neon_vfpv4.c seems to have failed. Is there a way we can access the meson build log to pinpoint the problem? Perhaps by using something like |
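For readers following the log above, the cascade of "Unsupported due to Implied feature" messages can be modeled with a small sketch. The implication edges below are read off the log lines only; they are an assumption for illustration, not NumPy's actual feature table, and `is_supported` is a hypothetical helper rather than anything in NumPy or Meson.

```python
# Implication edges as they appear in the log above (an assumption for
# illustration; NOT NumPy's real feature definitions).
IMPLIES = {
    "NEON": ["NEON_FP16"],
    "NEON_FP16": ["NEON_VFPV4"],
    "NEON_VFPV4": [],
    "ASIMD": ["NEON"],
    "ASIMDHP": ["NEON"],
    "ASIMDFHM": ["NEON"],
}


def is_supported(feature, compiles, implies=IMPLIES, _seen=frozenset()):
    """Return True if `feature` and every feature it implies pass their compile test.

    `compiles` maps a feature name to the result of its compile-time test;
    features missing from the mapping are assumed to compile.
    """
    if feature in _seen:  # guard against mutually implied features
        return True
    seen = _seen | {feature}
    for dep in implies.get(feature, ()):
        if not is_supported(dep, compiles, implies, seen):
            return False
    return compiles.get(feature, True)


# Only the NEON_VFPV4 compile test is reported as failing in the log.
compile_results = {"NEON_VFPV4": False}
unsupported = [f for f in IMPLIES if not is_supported(f, compile_results)]
print(unsupported)
```

Under these assumptions a single failing compile test for NEON_VFPV4 is enough to mark all six listed features as unsupported, which matches the empty "Enabled" lists in the log.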
Also in the OP already: pypy 7.3.12 (for python 3.9, as numpy doesn't support 3.8 anymore, and conda-forge doesn't build pypy for 3.10 yet) |
I think these are always going to fail because we're cross-compiling on aarch. It sounds conceptually similar to #24414, where we have to set an explicit value for cross-compilation to pass for osx-arm. |
The hash in the OP didn't agree with the latest pypy release, that is why I asked. |
Not sure what you mean. It's both the latest available pypy version (7.3.12) and the latest available build for windows in conda-forge ( |
These are only "compile-time tests" and are 100% compatible with cross-compiling. Each feature is typically tested against the compiler and assembler before being enabled, to ensure the build doesn't break. Would you please provide the meson build log, which is located at |
The build log is pretty long (~11k lines 😱), so it didn't fit into a comment; I uploaded it as a gist. |
I think we should separate the two issues: the aarch64 failures are not connected to the PyPy ones. |
@charris as just discussed, the fix for this should go into 1.26.0 - PR coming. |
Thank you @h-vetinari for the build log. The compile-time tests of the CPU features fail because they are run with the build compiler rather than the host compiler; see:
Command line: `$BUILD_PREFIX/bin/x86_64-conda-linux-gnu-cc $SRC_DIR/numpy/distutils/checks/cpu_neon_vfpv4.c -o /tmp/tmps4qcrzxi/output.exe -D_FILE_OFFSET_BITS=64` -> 1
stderr:
$SRC_DIR/numpy/distutils/checks/cpu_neon_vfpv4.c:4:10: fatal error: arm_neon.h: No such file or directory
4 | #include <arm_neon.h>
| ^~~~~~~~~~~~
compilation terminated.
It should be fixed by the following patch: |
Thanks @seiko2plus, that's a simple enough fix:) I'll incorporate it in |
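As an illustration of the build-versus-host compiler mix-up diagnosed above (this is not the patch itself), a quick sanity check is to compare the architecture prefix of the compiler binary with the CPU family being targeted. The helpers `triplet_arch` and `targets_host` are hypothetical names, and the sketch assumes the compiler follows the `<arch>-<vendor>-<os>-cc` naming seen in the log.

```python
import os


def triplet_arch(compiler_path):
    """Return the architecture prefix of a triplet-named compiler binary."""
    name = os.path.basename(compiler_path)
    return name.split("-", 1)[0]


def targets_host(compiler_path, host_arch):
    """True if the compiler's triplet prefix matches the host architecture."""
    return triplet_arch(compiler_path) == host_arch


# Compiler paths as they appear in the meson log excerpts in this thread.
build_cc = "$BUILD_PREFIX/bin/x86_64-conda-linux-gnu-cc"
host_cc = "$BUILD_PREFIX/bin/aarch64-conda-linux-gnu-cc"

print(targets_host(build_cc, "aarch64"))
print(targets_host(host_cc, "aarch64"))
```

An x86_64 compiler has no `arm_neon.h`, so a NEON feature test compiled with it fails regardless of what the aarch64 toolchain supports.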
This resulted in two test failures showing significant accuracy issues for the 1.26.0rc1 with the Linux aarch64 builds in conda-forge. Closes numpygh-24660
gh-24698 should close the important part of this issue, which is the Linux aarch64 accuracy issues. The PyPy on Windows + memory allocator / Cython / f2py are all the same issue it looks like:
I'm not sure why |
Thanks a lot @seiko2plus! Is this going to be upstreamed into meson? Because if I mark the openblas 0.3.24 builds unbroken again on aarch, this would also rebreak the CI on the scipy-feedstock (which, being built by meson, is probably affected by the same issue). In other words, we'd at least need to patch this into the meson-feedstock, but I'd prefer to do just a backport of an upstream PR, rather than original patches. |
Agreed, issues with pypy on windows would not have kept us from releasing (on the conda-forge side at least) |
This bug is in code that is not present in Meson and not relevant for SciPy because it doesn't use NEON SIMD. |
Interesting... That means we're still lacking an understanding of what's causing scipy/scipy#19210, and I cannot mark openblas 0.3.24 unbroken again. In any case, nothing that should be blocking 1.26 |
The issues seem to stem from compiler optimization being disabled by setting the optimization level to 0. Upon inspecting the provided meson log, I noticed that -O0 is specified after both -O3 and -O2, which is unusual. Check the following command line:
Command line: `$BUILD_PREFIX/bin/aarch64-conda-linux-gnu-cc -L$PREFIX/lib $SRC_DIR/builddir/meson-private/tmp3f10l7c1/testfile.c -o $SRC_DIR/builddir/meson-private/tmp3f10l7c1/output.exe -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt \
-O3 \ # optimization level 3
-pipe -isystem $PREFIX/include -fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/numpy-1.26.0rc1 -fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix -DNDEBUG -D_FORTIFY_SOURCE=2 \
-O2 \ # optimization level 2 overrides the O3
-isystem $PREFIX/include -D_FILE_OFFSET_BITS=64 \
-O0 \ # And here is the issue? The compiler optimization is turned off?
-std=c99 \
-Wl,-O2 \ # passing O2 to the linker? Avoid it.
-Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--allow-shlib-undefined -Wl,-rpath,$PREFIX/lib -Wl,-rpath-link,$PREFIX/lib` -> 1 |
It seems that the Meson build somehow reverses the order of the CFLAGS, which may explain why we don't observe this issue in the distutils build. |
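For context on why flag order matters here: GCC documents that when several `-O` options are given, the last one is the one that takes effect. A minimal sketch of that rule follows; `effective_opt_level` is a hypothetical helper, and the command is a shortened version of the one quoted above, not the full logged line.

```python
import shlex
from typing import Optional


def effective_opt_level(command: str) -> Optional[str]:
    """Return the last compiler -O flag on the command line, or None.

    Linker pass-through flags such as -Wl,-O2 start with "-Wl," and are
    deliberately not counted as compiler optimization levels.
    """
    level = None
    for arg in shlex.split(command):
        if arg.startswith("-O"):
            level = arg
    return level


# Shortened form of the logged command line (an abbreviation for illustration).
cmd = (
    "aarch64-conda-linux-gnu-cc testfile.c -o output.exe "
    "-O3 -pipe -O2 -D_FILE_OFFSET_BITS=64 -O0 -std=c99 -Wl,-O2"
)
print(effective_opt_level(cmd))
```

For this command the sketch reports `-O0`, which is why the trailing `-O0` stands out in the log. Note that it only applies to the command line it is given, here a compiler feature check, and says nothing about how the actual source files are compiled.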
Conda manages to make things a little messy with a ton of build flags, but I think we can unravel this. We have the following components of the puzzle for a C target:
It's unclear to me where To get the compile and link args for a target, you can use
Comparing that to a non-SIMD compiler feature check and a SIMD feature check from
Those do both contain |
@seiko2plus are those actual problems for feature detection? It's working fine in all CI and locally, with |
Thanks for clarifying. I was under the impression -O0 was applied to all sources, as I couldn't access the ninja debug log via CI.
The presence of -O0 isn't a problem for feature detection. In fact, it's beneficial to include |
Great, thanks for confirming. Then I think we're all good here. |
Describe the issue:
While our CI for 1.26.0b1 was all green, for rc1 we have two issues in conda-forge/numpy-feedstock#297:
linux-aarch64 has two test failures that don't just seem like minor tolerance violations:
and pypy on windows, which seems to have some issues with cython resp. f2py:
Reproduce the code example:
Error message:
Runtime information:
Test env aarch:
Test env pypy on windows
Context for the issue:
No response