
ggml: aarch64: implement mmla kernels for q8_0_q8_0, q4_0_q8_0 and q4_1_q8_1 quantized gemm #4966

Merged (5 commits) on Feb 11, 2024

Commit d8f132d: llama.cpp: add MATMUL_INT8 capability to system_info
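For context, the mmla kernels named in the PR title build on the AArch64 i8mm extension's SMMLA instruction, which multiplies a 2x8 int8 tile by an 8x2 int8 tile and accumulates a 2x2 int32 result in one instruction. The sketch below is illustrative only, not the PR's actual kernel code; it shows the corresponding ACLE intrinsic `vmmlaq_s32` on hand-picked inputs, assuming a CPU with i8mm and a build flag such as `-march=armv8.2-a+i8mm`.

```c
/*
 * Minimal sketch (not the PR's kernel): one 2x2 int32 tile computed
 * from two int8 tiles via the i8mm SMMLA intrinsic vmmlaq_s32.
 * Requires an AArch64 CPU with i8mm; build with -march=armv8.2-a+i8mm.
 */
#include <arm_neon.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* Two rows of A (2x8) and two columns of B (8x2, stored row-wise). */
    int8_t a_rows[16] = {1,1,1,1,1,1,1,1,  2,2,2,2,2,2,2,2};
    int8_t b_cols[16] = {1,1,1,1,1,1,1,1,  3,3,3,3,3,3,3,3};

    int8x16_t a   = vld1q_s8(a_rows);
    int8x16_t b   = vld1q_s8(b_cols);
    int32x4_t acc = vdupq_n_s32(0);

    /* acc += [ a0.b0  a0.b1 ]
     *        [ a1.b0  a1.b1 ]  (each entry an 8-wide int8 dot product) */
    acc = vmmlaq_s32(acc, a, b);

    int32_t c[4];
    vst1q_s32(c, acc);
    printf("%d %d %d %d\n", c[0], c[1], c[2], c[3]); /* expect: 8 24 16 48 */
    return 0;
}
```

Per the commit title, a llama.cpp binary built with this support should report a `MATMUL_INT8 = 1` entry in the string returned by `llama_print_system_info()`, alongside the existing flags such as `NEON` and `ARM_FMA`.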
