Releases: ngxson/llama.cpp

b4618

02 Feb 20:36
90f9b88
nit: more informative crash when grammar sampler fails (#11593)

b4617

02 Feb 19:16
864a0b6
CUDA: use mma PTX instructions for FlashAttention (#11583)

* CUDA: use mma PTX instructions for FlashAttention

* __shfl_sync workaround for movmatrix

* add __shfl_sync to HIP

Co-authored-by: Diego Devesa

b4616

02 Feb 15:48
84ec8a5
Name colors (#11573)

Named colors are more descriptive. Use #defines so we can use compile-time
concatenation.

Signed-off-by: Eric Curtin

b4615

02 Feb 09:59
bfcce4d
`tool-call`: support Command R7B (+ return tool_plan "thoughts" in AP…

b4614

02 Feb 09:45
6980448
Fix exotic ci env that lacks ostringstream::str (#11581)

b4613

02 Feb 08:45
ff22770
sampling : support for llguidance grammars (#10224)

* initial porting of previous LLG patch

* update for new APIs

* build: integrate llguidance as an external project

* use '%llguidance' as marker to enable llg lark syntax

* add some docs

* clarify docs

* code style fixes

* remove llguidance.h from .gitignore

* fix tests when llg is enabled

* pass vocab not model to llama_sampler_init_llg()

* copy test-grammar-integration.cpp to test-llguidance.cpp

* clang fmt

* fix ref-count bug

* build and run test

* gbnf -> lark syntax

* conditionally include llguidance test based on LLAMA_LLGUIDANCE flag

* rename llguidance test file to test-grammar-llguidance.cpp

* add gh action for llg test

* align tests with LLG grammar syntax and JSON Schema spec

* llama_tokenizer() in fact requires valid utf8

* update llg

* format file

* add $LLGUIDANCE_LOG_LEVEL support

* fix whitespace

* fix warning

* include <cmath> for INFINITY

* add final newline

* fail llama_sampler_init_llg() at runtime

* Link gbnf_to_lark.py script; fix links; refer to llg docs for lexemes

* simplify #includes

* improve doc string for LLAMA_LLGUIDANCE

* typo in merge

* bump llguidance to 0.6.12

b4611

01 Feb 18:51
53debe6
ci: use sccache on windows HIP jobs (#11553)

b4610

01 Feb 13:03
cfd74c8
`sync`: minja (https://github.com/google/minja/commit/418a2364b56dc9b…

b4609

01 Feb 11:12
ecef206
Implement s3:// protocol (#11511)

For those who want to pull from S3.

Signed-off-by: Eric Curtin

b4608

01 Feb 00:46
5bbc736
ci: simplify cmake build commands (#11548)