Cuda mmq deduplicate 4 #248

Nexesenex · 2024-07-20T10:46:00Z

No description provided.

* ggml : fix iq4_nl dot product with odd number of blocks
* ggml : fix odd blocks for ARM_NEON (ggerganov#8556)
* ggml : fix iq4_nl dot product with odd number of blocks
* ggml : fix q4_1
* ggml : fix q5_0
* ggml : fix q5_1
* ggml : fix iq4_nl metal ggml-ci
* ggml : fix q4_0
* ggml : fix q8_0 ggml-ci
* ggml : remove special Q4_0 code for first 2 blocks
* ggml : fix sumf redefinition

Co-authored-by: slaren <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>
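For readers unfamiliar with the "odd number of blocks" class of bug, here is a minimal sketch of the general pattern, assuming a kernel that consumes two blocks per loop iteration. It is not the ggml code: `kBlock`, `dot_blocks`, and the plain `float` layout are hypothetical stand-ins for the quantized block types named in the commit messages. The point is only that a two-at-a-time main loop needs a tail step so the last block is not skipped when the block count is odd.

```cpp
#include <cstddef>

// Hypothetical block size, chosen only for illustration.
constexpr std::size_t kBlock = 4;

// Hypothetical dot product over `nb` blocks of kBlock floats each.
float dot_blocks(const float * x, const float * y, std::size_t nb) {
    float sumf = 0.0f;
    std::size_t ib = 0;

    // Main loop: handle two blocks per iteration.
    for (; ib + 1 < nb; ib += 2) {
        for (std::size_t j = 0; j < 2 * kBlock; ++j) {
            sumf += x[ib * kBlock + j] * y[ib * kBlock + j];
        }
    }

    // Tail: handle the one leftover block when nb is odd.
    for (; ib < nb; ++ib) {
        for (std::size_t j = 0; j < kBlock; ++j) {
            sumf += x[ib * kBlock + j] * y[ib * kBlock + j];
        }
    }
    return sumf;
}
```

Without the tail loop, an input with three blocks would silently contribute only the first two to the result.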

mofosyne and others added 8 commits July 19, 2024 17:51

convert-*.py: add general.name kv override (ggerganov#8571)

3d0e436

fix: typo of chatglm4 chat tmpl (ggerganov#8586)

f299aa9

Signed-off-by: thxCode <[email protected]>

ggml : add friendlier error message to fopen errors (ggerganov#8575)

b57eb9c

* Add additional error information when model files fail to load.
* Adding additional error information to most instances of fopen.
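As a rough illustration of what a friendlier `fopen` error can look like, the sketch below reports both the path and the OS-level reason taken from `errno`. The helper name `open_with_reason` and the message wording are assumptions made for this example, not the code from ggerganov#8575.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper (not from the PR): opens a file and, on failure,
// fills `err` with the path and the reason reported through errno.
std::FILE * open_with_reason(const char * path, const char * mode, std::string & err) {
    errno = 0;
    std::FILE * f = std::fopen(path, mode);
    if (f == nullptr) {
        err = std::string("failed to open ") + path + ": " + std::strerror(errno);
    }
    return f;
}
```

A caller can then print `err` instead of a bare "failed to open file", which makes a missing file easier to tell apart from a permissions problem.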

readme : fix server badge

be0cfb4

llama : bump max layers from 256 to 512 (ggerganov#8530)

d197545

* llama : bump max layers from 256 to 512
* llama : replace asserts with exceptions
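The difference between an assert and an exception for a limit like this can be sketched as follows. The names `kMaxLayers` and `check_layer_count` are hypothetical, and the exact comparison and message are assumptions; only the value 512 comes from the commit title. An exception lets the caller report a clean load failure instead of aborting the process.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Illustrative constant mirroring the new limit named in the commit title.
constexpr std::size_t kMaxLayers = 512;

// Hypothetical check (not the llama.cpp code): throws instead of asserting,
// so the caller can catch the error and fail the model load gracefully.
void check_layer_count(std::size_t n_layer) {
    if (n_layer > kMaxLayers) {
        throw std::runtime_error(
            "too many layers: " + std::to_string(n_layer) +
            " > " + std::to_string(kMaxLayers));
    }
}
```

Unlike `assert`, this check stays active in release builds where `NDEBUG` is defined.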

convert-*.py: remove add_name from ChatGLMModel class (ggerganov#8590)

57b1d4f

CUDA: MMQ code deduplication + iquant support

f0f71a5

Nexesenex merged commit 26a91aa into Nexesenex:lcpp_pr_mmq_dedup Jul 20, 2024
8 of 11 checks passed

github-actions bot added Nvidia GPU testing python ggml labels Jul 20, 2024
