Support for Renoir Zen 2 CPUs #36826
Comments
This patch seems to have been included in rc2, which was the version I tried... |
No it's not. Only the bug fix part of it was. |
I see, so support will only come in 1.6 then? |
It should be in 1.6. |
I tried a nightly and got this:
|
AFAICT our detection code is identical to LLVM's. Previously you saw a "better" result because the detection code misidentified it as zen1, bypassing LLVM's. If you can find out your CPUID, it'll be pretty easy to add it to the list of znver2 ones. Also, for the OpenBLAS issue, you should report to https://github.com/xianyi/OpenBLAS instead. AFAICT we are using a version with AMD CPU detection identical to the latest master there. |
CPUID:
Full info uploaded to cpu-world: http://www.cpu-world.com/cgi-bin/CPUID.pl?CPUID=71632 |
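For readers following along, here is a minimal sketch of how a raw CPUID signature (the EAX value of CPUID leaf 1) breaks down into the family and model numbers that the detection code matches on. The function name `decode_signature` is hypothetical, and the example value `0x00860F01` is an assumption about what a Renoir part reports; it is not taken from the upload linked above.

```python
def decode_signature(eax):
    """Split a CPUID leaf 1 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model supplies the high nibble for base families 6 and 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


# Assumed Renoir signature, for illustration only.
family, model, stepping = decode_signature(0x00860F01)
print(f"family={family} model={model:#x} stepping={stepping}")
```

The family and model values are what matter for choosing between znver1 and znver2; the stepping does not enter into it.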
Regarding OpenBLAS, that might not necessarily be their fault. On Discourse, the OP installed the package from AUR and the architecture got detected properly:
EDIT: never mind that, see OpenMathLib/OpenBLAS#2738 |
* Missing feature from Apple A13
* Enable Cortex-A78 and Cortex-X1 on LLVM 11 llvm/llvm-project@954db63 https://reviews.llvm.org/D83206
* More relaxed Zen detection: treat all family 23 as Zen* and treat all model >= 0x30 as Zen2. GCC uses a similar fallback structure albeit based on feature. This should still generate **correct** code since that is always controlled by available features. It should be as good a scheduling model estimate as anything else.
Fix #36826
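The relaxed Zen rule described in the commit message above can be sketched roughly as follows. This is only an illustration of the stated rule (family 23 is Zen*, model >= 0x30 is Zen2), not the actual detection code; the function name `classify_amd_family_23` and the returned strings are assumptions made for the example.

```python
def classify_amd_family_23(family, model):
    """Apply the relaxed fallback: any family 23 part is some Zen,
    and models at or above 0x30 are treated as Zen2."""
    if family != 23:
        return None  # not covered by this rule
    return "znver2" if model >= 0x30 else "znver1"


# A family 23 part with model 0x60 falls through to Zen2
# without needing an exact entry in a model list.
print(classify_amd_family_23(23, 0x60))
```

As the commit message notes, this choice only affects the scheduling model estimate; which instructions get emitted is still governed by the detected features.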
Thanks! Do you think the fix could be backported to 1.5? |
#36831 is for master only so it won't be backported unless is. And you aren't missing out on much on 1.5. It's identified as Zen 1, which affects the scheduling model a little bit, but none of the feature detection is affected. |
* Missing feature from Apple A13
* Enable Cortex-A78 and Cortex-X1 on LLVM 11 llvm/llvm-project@954db63 https://reviews.llvm.org/D83206
* More relaxed Zen detection: treat all family 23 as Zen* and treat all model >= 0x30 as Zen2. GCC uses a similar fallback structure albeit based on feature. This should still generate **correct** code since that is always controlled by available features. It should be as good a scheduling model estimate as anything else.
Fix #36826 (cherry picked from commit cd3fb4d)
I have a Renoir processor that's correctly detected as
|
Yes, that's an OpenBLAS issue. Ref OpenMathLib/OpenBLAS#2738. We need to carry a patch and/or bump the OpenBLAS version. Also, OpenBLAS is in general way too conservative on the dispatch. It appears to never dispatch based on features and only looks for an exact uarch match without a general fallback... |
In the meantime you can use an environment variable as explained on Discourse: |
Can confirm, I'm seeing (roughly) expected speeds now:
|
On a laptop with a Ryzen 7 4700U CPU running Windows 10, this is what is being reported (also reported on Discourse under WSL2):
As far as I can tell, LLVM 9 should support znver2, and OpenBLAS should as well.