k-quants : fix quantization ranges #3646

ggerganov · 2023-10-16T19:16:57Z

I don't think this bug has caused any issues so far, since we always quantize in chunks with n == k:

https://github.com/ggerganov/llama.cpp/blob/317dc4bcc22c7cc79193727aea2366528d081035/ggml.c#L20725-L20756
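To illustrate why a range bug can stay hidden when n == k, here is a minimal sketch. It does not reproduce the actual diff, which is not shown in this conversation. It assumes a hypothetical loop that walks `n` elements in rows of `k` elements and compares a wrong upper bound (blocks per row) with the intended one (`n`). The names `QK`, `rows_visited`, `rows_buggy` and `rows_fixed` are made up for the example.

```c
#define QK 256  /* hypothetical number of elements per quantization block */

/* Count the iterations of a chunked loop with upper bound `limit`.
 * Each iteration stands in for quantizing one row of k elements. */
static int rows_visited(int limit, int k) {
    int rows = 0;
    for (int j = 0; j < limit; j += k) {
        rows++;
    }
    return rows;
}

/* Assumed buggy bound: the number of blocks in one row (k / QK),
 * which ignores the total element count n. */
static int rows_buggy(int n, int k) {
    (void)n;
    return rows_visited(k / QK, k);
}

/* Intended bound: the total number of elements n. */
static int rows_fixed(int n, int k) {
    return rows_visited(n, k);
}
```

With n == k both variants process exactly one row, so the outputs match. With n == 2*k the assumed buggy bound would still stop after the first row, while the intended bound covers both.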

* 'master' of github.com:ggerganov/llama.cpp:
  fix embeddings when using CUDA (ggml-org#3657)
  llama : avoid fprintf in favor of LLAMA_LOG (ggml-org#3538)
  readme : update hot-topics & models, detail windows release in usage (ggml-org#3615)
  CLBlast: Fix temporary buffer size for f16 conversion (wsize)
  train-text-from-scratch : fix assert failure in ggml-alloc (ggml-org#3618)
  editorconfig : remove trailing spaces
  server : documentation of JSON return value of /completion endpoint (ggml-org#3632)
  save-load-state : fix example + add ci test (ggml-org#3655)
  readme : add Aquila2 links (ggml-org#3610)
  tokenizer : special token handling (ggml-org#3538)
  k-quants : fix quantization ranges (ggml-org#3646)
  llava : fix tokenization to not add bos between image embeddings and user prompt (ggml-org#3645)
  MPT : support GQA for replit-code-v1.5 (ggml-org#3627)
  Honor -ngl option for Cuda offloading in llava (ggml-org#3621)

k-quants : fix quantization ranges

317dc4b

ggerganov merged commit 281ef73 into master Oct 17, 2023

ggerganov deleted the fix-k-quants branch October 17, 2023 06:19
