🚀 The feature, motivation and pitch
Why does CPU inference require the AVX-512 instruction set? This restricts support to fairly recent CPUs (Intel since 2016, AMD since 2022). In particular, the now most affordable first-generation AMD Epyc server CPUs (Zen 1-3 architectures) only support AVX2, yet these older Epyc processors are nicely cheap and still offer 128 PCIe lanes for networking.
So it would be nice to extend CPU support to AVX2, the previous generation of the instruction set. Is the implementation difficult? llama.cpp supports AVX2, so perhaps the code could be adapted from there.
Alternatives
No response
Additional context
No response