Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arm64: fix compilation failed when -mcpu is set by the toolchain #21117

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

clementperon
Copy link
Contributor

@clementperon clementperon commented Jun 20, 2024

Description

When building with a cross-toolchain some CXX_FLAGS could be set to a specific CPU.

This leads to a CPU not compatible with some specific functions compiles with ARCH v8.2+bf16

Error is :
cc1: error: switch '-mcpu=XXXX' conflicts with '-march=armv8.2-a+i8mm' switch

Motivation and Context

As these functions are dispatched at runtime and as there is no -mno-cpu, force the -mcpu to be valid by forcing it to a valid combination of -march and -mcpu

@snnn
Copy link
Member

snnn commented Jun 20, 2024

@clementperon
Copy link
Contributor Author

@snnn according to your article we should avoid -march and only use -mcpu.

What would you say about only keeping -mcpu in this patch ? This would also fix my issue

@lida2003
Copy link

lida2003 commented Jan 7, 2025

Hi, @clementperon, I have met this #23267 / v1.19.2 for Jetpack 5.1.4 L4T 35.6

Is this patch ok to build without "BFLOAT16" issue?

EDIT: I modified cmake/onnxruntime_mlas.cmake according to the patch, still got BFLOAT16 error.

-- Looking for reallocarray
-- Looking for reallocarray - found
-- Performing Test HAS_ARM64_BFLOAT16
-- Performing Test HAS_ARM64_BFLOAT16 - Failed
CMake Error at CMakeLists.txt:677 (message):
  The compiler doesn't support BFLOAT16!!!

EDIT2: Do I need extra flags in build command?

./build.sh --config Release --update --build --parallel --build_wheel \
--use_tensorrt --cuda_home /usr/local/cuda --cudnn_home /usr/lib/aarch64-linux-gnu \
--tensorrt_home /usr/lib/aarch64-linux-gnu

EDIT3: difference between v1.19.2 and my onnx: 21117_for_v1.19.2.patch

@snnn
Copy link
Member

snnn commented Jan 7, 2025

@lida2003 , would you please consider upgrading your jetpack to 6.0(which has Ubuntu 22.04 and GCC 11)?

@lida2003
Copy link

lida2003 commented Jan 8, 2025

@snnn We are focusing ROS/ubuntu20.04, Jetpack 5.1.4 is the up to date release for this. JP6 doesn't support this. And I think there are quite a lot of users are ON JP5 now.

EDIT: #21383 - "JetPack 6.0 is not an option for Xavier or Nano users!"

@snnn
Copy link
Member

snnn commented Jan 8, 2025

Even ubuntu20.04 itself will reach EOL in a few months. Sorry we do not have bandwidth to maintain a build system for ubuntu20.04 and GCC 9. You may fork this repo and patch it in your way.

@snnn snnn requested a review from liqunfu January 8, 2025 00:20
@snnn
Copy link
Member

snnn commented Jan 8, 2025

@liqunfu , I am not familiar with mlas code. If you would like to take the changes, please help review the PR. Otherwise I will close it later.

@snnn snnn removed request for chenfucn and yufenglee January 8, 2025 00:22
@lida2003
Copy link

lida2003 commented Jan 8, 2025

You may fork this repo and patch it in your way.

@snnn That's OK.

Currently the treads end up here, but the patch is not working. Is there any flags or other links/help available for me?

EDIT: what's the root cause for this? in #21099 , it seems "Some ARCH64 machine doesn't have support for dotprod or bf16". Is this mean that it's gcc issue on the current ubuntu 20.04 gcc 9.4.0?

Which version has fixed this, any link? Or is there an alternative way to get around gcc 9.4.0 and get onnx build?

$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

EDIT2: In #22837 #19760, still some gcc version & bf16 issue. If onnx is tested bf16 with gcc 11,12 and 13, is it possible to remove bf16 when gcc version is less than 11 ?

@snnn
Copy link
Member

snnn commented Jan 8, 2025

@snnn
Copy link
Member

snnn commented Jan 8, 2025

I believe it was added in #14538

@clementperon
Copy link
Contributor Author

clementperon commented Jan 19, 2025

@snnn @lida2003

There is two issues here:

  • first the compiler that should be able to compile this new CPU/ARCH support
  • second the compiler/toolchain that already set some CPU specifications and create an invalid configuration (what I tried to fix here)

We could either:

  • compile these functions and detect at runtime that there will be not used
    • This will imply to drop the default C_FLAGS/CXX_FLAGS set by the toolchain
  • detect that the toolchain is not compatible and drop these functions to be compiled.

If we agree and what we want to do I could propose another patch.

@snnn
Copy link
Member

snnn commented Jan 21, 2025

compile these functions and detect at runtime that there will be not used

It is what we are doing.

@snnn
Copy link
Member

snnn commented Jan 21, 2025

Sorry I am not familiar with this part of code. We need someone from the @onnxruntime-mlas team to answer this question and review the code.

@snnn
Copy link
Member

snnn commented Jan 21, 2025

@liqunfu , would you be able to help?

@clementperon
Copy link
Contributor Author

@liqunfu , would you be able to help?

I will try to propose a new patch that will drop any -mcpu that is set by the toolchain

But maybe the cleanest way is to detect these functions as "not compilable" with the preferred toolchain flags and drop them.

Usually if you set a -mcpu in your toolchain you don't need a portable binary with runtime dispatch

@clementperon clementperon force-pushed the build_arm64 branch 3 times, most recently from c326def to a324834 Compare January 26, 2025 13:08
@clementperon
Copy link
Contributor Author

clementperon commented Jan 26, 2025

@snn @liqunfu

A patch to drop the -mcpu flags if it has been set by the toolchain. I will test it soon.

But IMO we should just check if the toolchain is able to compile with this -mcpu flags or not.
If it doesn't just drop these files.

If the user specificy the toolchain to have -mcpu=XXXX it means he want his library to be built for this mcpu and not for another one with runtime dispatch.

@clementperon clementperon changed the title arm64: force -mcpu to be valid arm64: fix compilation failed when -mcpu is set by the toolchain Jan 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants