b3427 #249

Nexesenex · 2024-07-20T15:15:51Z

No description provided.

* ggml : fix iq4_nl dot product with odd number of blocks
* ggml : fix odd blocks for ARM_NEON (#8556)
* ggml : fix iq4_nl dot product with odd number of blocks
* ggml : fix q4_1
* ggml : fix q5_0
* ggml : fix q5_1
* ggml : fix iq4_nl metal ggml-ci
* ggml : fix q4_0
* ggml : fix q8_0 ggml-ci
* ggml : remove special Q4_0 code for first 2 blocks
* ggml : fix sumf redefinition

---------

Co-authored-by: slaren <[email protected]>

---------

Co-authored-by: Georgi Gerganov <[email protected]>
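The odd-block fixes listed above can be illustrated with a small sketch. It assumes, without confirmation from this page, that the affected kernels consume two blocks per loop iteration, so an odd block count needs a tail loop for the last block. The `(scale, quants)` block layout and the names `block_dot` and `blockwise_dot` are hypothetical and are not ggml's API.

```python
def block_dot(xb, yb):
    """Dot product of one pair of blocks, each given as (scale, [quants])."""
    dx, qx = xb
    dy, qy = yb
    return dx * dy * sum(a * b for a, b in zip(qx, qy))


def blockwise_dot(x_blocks, y_blocks):
    """Sum block dot products, two blocks per iteration plus a tail loop."""
    if len(x_blocks) != len(y_blocks):
        raise ValueError("block counts must match")
    nb = len(x_blocks)
    sumf = 0.0
    ib = 0
    # Main loop: handles blocks in pairs (stands in for an unrolled SIMD loop).
    while ib + 1 < nb:
        sumf += block_dot(x_blocks[ib], y_blocks[ib])
        sumf += block_dot(x_blocks[ib + 1], y_blocks[ib + 1])
        ib += 2
    # Tail loop: picks up the last block when nb is odd.
    while ib < nb:
        sumf += block_dot(x_blocks[ib], y_blocks[ib])
        ib += 1
    return sumf
```

Without the tail loop, a three-block input would silently drop the final block, which is the kind of error an "odd number of blocks" fix would address under this assumption.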

* gguf_dump.py: fix markdown kv array print
* Update gguf-py/scripts/gguf_dump.py

  Co-authored-by: compilade <[email protected]>
* gguf_dump.py: refactor kv array string handling
* gguf_dump.py: escape backticks inside of strings
* gguf_dump.py: inline code markdown escape handler added

  >>> escape_markdown_inline_code("hello world")
  '`hello world`'
  >>> escape_markdown_inline_code("hello ` world")
  '``hello ` world``'
* gguf_dump.py: handle edge case about backticks on start or end of a string

---------

Co-authored-by: compilade <[email protected]>
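The commit message names `escape_markdown_inline_code` and gives two expected outputs but not the implementation. The following is a minimal sketch of one way to get that behavior, following the CommonMark rule that a code span delimiter must be longer than any backtick run inside it. The body, and the space padding used for the start/end edge case, are assumptions and may differ from what `gguf_dump.py` actually does.

```python
import re


def escape_markdown_inline_code(value: str) -> str:
    """Wrap value in a Markdown inline code span that survives backticks."""
    # Longest run of consecutive backticks inside the value (0 if none).
    longest = max((len(run) for run in re.findall(r"`+", value)), default=0)
    # The delimiter must be one backtick longer than that run.
    fence = "`" * (longest + 1)
    # Assumed edge-case handling: pad with a space when the value starts or
    # ends with a backtick, so it does not merge with the delimiter.
    pad = " " if value.startswith("`") or value.endswith("`") else ""
    return f"{fence}{pad}{value}{pad}{fence}"
```

With this sketch, the two doctest cases from the commit message produce the outputs shown there.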

* fix continuing generating blank lines after getting EOT token or EOS token from LLM
* change variable name to is_done (variable name suggested by ggerganov)
* minor : fix trailing whitespace
* minor : add space

---------

Co-authored-by: Georgi Gerganov <[email protected]>
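The EOT/EOS fix above describes a generation loop that kept emitting output after the model had signalled the end. A hedged sketch of the corrected control flow follows. Only the flag name `is_done` comes from the commit message; the `generate` function, its parameters, and the integer token IDs are hypothetical.

```python
def generate(sample_next, end_token_ids, max_tokens):
    """Collect tokens until an end-of-generation token or the token limit."""
    output = []
    is_done = False
    while not is_done and len(output) < max_tokens:
        token = sample_next()
        if token in end_token_ids:
            # EOT/EOS seen: stop instead of continuing to emit output.
            is_done = True
        else:
            output.append(token)
    return output
```

For example, with a sampler that yields 5, 7, 2, 9 and an end-token set of `{2}`, the loop stops at the third token and does not sample further.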

* llama : Added support for Tekken pre-tokenizer (#8577)

  Removed unneeded `vocab.tokenizer_clean_spaces` assignment
* llama : fix order of pre-tokenizers
* Tekken pre-tokenizer no longer uses clean_up_tokenization_spaces
* Updated chkhsh for Tekken tokenizer

---------

Co-authored-by: Georgi Gerganov <[email protected]>

65a and others added 13 commits July 18, 2024 17:47

cmake : install all ggml public headers (#8480)

705b7ec

Co-authored-by: 65a <[email protected]>

CUDA: fix partial offloading for ne0 % 256 != 0 (#8572)

a15ef8f

convert-*.py: add general.name kv override (#8571)

3d0e436

fix: typo of chatglm4 chat tmpl (#8586)

f299aa9

Signed-off-by: thxCode <[email protected]>

ggml : add friendlier error message to fopen errors (#8575)

b57eb9c

* Add additional error information when model files fail to load.
* Adding additional error information to most instances of fopen.

readme : fix server badge

be0cfb4

llama : bump max layers from 256 to 512 (#8530)

d197545

* llama : bump max layers from 256 to 512
* llama : replace asserts with exceptions

convert-*.py: remove add_name from ChatGLMModel class (#8590)

57b1d4f

gguf : handle null name during init (#8587)

07283b1

Nexesenex merged commit cc37f62 into Nexesenex:spacestream Jul 20, 2024
32 of 40 checks passed

github-actions bot added testing examples python ggml labels Jul 20, 2024
