Releases: ngxson/llama.cpp

b2986

23 May 21:11
74f33ad
readme : remove trailing space (#7469)

b2879

14 May 16:11
4f02636
server: free sampling contexts on exit (#7264)

* server: free sampling contexts on exit

This cleans up last leak found by the address sanitizer.

* fix whitespace

* fix whitespace

b2821

08 May 20:25
c12452c
JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143)

b2809

08 May 09:53
acdce3c
compare-llama-bench.py: add missing basicConfig (#7138)

* compare-llama-bench.py: add missing basicConfig

* compare-llama-bench.py: Add line break between error message and print_help()

* Add regular print() markdown table
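
A minimal sketch of why a missing `basicConfig` call matters, using only Python's standard `logging` module. This is not the script's actual code: the logger name, the message, and the `emit` helper are hypothetical. The root logger defaults to the WARNING level, so INFO messages are dropped until a level and handler are configured.

```python
import io
import logging

# Hypothetical logger name, for illustration only.
logger = logging.getLogger("compare-llama-bench-demo")


def emit(configure: bool) -> str:
    """Return the text an INFO message produces, with or without basicConfig."""
    root = logging.getLogger()
    # Reset the root logger so each call starts from a clean state.
    for handler in list(root.handlers):
        root.removeHandler(handler)
    root.setLevel(logging.WARNING)

    stream = io.StringIO()
    if configure:
        # force=True replaces any existing root handlers (Python 3.8+).
        logging.basicConfig(
            level=logging.INFO,
            stream=stream,
            format="%(levelname)s: %(message)s",
            force=True,
        )
    logger.info("benchmark table ready")
    return stream.getvalue()
```

Calling `emit(False)` yields an empty string because the message is filtered out at the default level, while `emit(True)` captures the formatted INFO line.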

b2786

04 May 13:09
03fb8a0
If first token generated from the server is the stop word the server …

b2724

24 Apr 17:06
b4e4b8a
llama : add llama_get_pooling_type function (#6862)

* add llama_get_pooling_type function

* fix argument name, move with ctx funcs

b2710

22 Apr 02:27
5cf5e7d
`build`: generate hex dump of server assets during build (#6661)

* `build`: generate hex dumps of server assets on the fly

* build: workaround lack of -n on gnu xxd

* build: don't use xxd in cmake

* build: don't call xxd from build.zig

* build: more idiomatic hexing

* build: don't use xxd in Makefile (od hackery instead)

* build: avoid exceeding max cmd line limit in makefile hex dump

* build: hex dump assets at cmake build time (not config time)
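
The general idea of hex-dumping an asset into a C source file, similar in spirit to `xxd -i`, can be sketched as follows. This is an illustration under stated assumptions, not the build logic from this release: the function name `to_c_array`, the variable naming scheme, and the line width are all hypothetical.

```python
def to_c_array(name: str, data: bytes, per_line: int = 12) -> str:
    """Render bytes as a C array definition plus a length variable."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        # Each byte becomes a two-digit hex literal such as 0x48.
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        f"unsigned char {name}[] = {{\n{body}\n}};\n"
        f"unsigned int {name}_len = {len(data)};\n"
    )


if __name__ == "__main__":
    # Hypothetical asset contents; a real build would read the file from disk.
    print(to_c_array("index_html", b"<html></html>"))
```

Generating the array in a small script avoids depending on an external `xxd` binary, which is the kind of portability concern the commit list above points to.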

b2690

18 Apr 03:03
3b8f1ec
llamafile : tmp disable + build sgemm.o when needed (#6716)

* build : sgemm.o only when needed

ggml-ci

* llamafile : tmp disable due to MoE bug

ggml-ci

b2581

30 Mar 15:14
37e7854
ci: bench: fix Resource not accessible by integration on PR event (#6…

b2548

27 Mar 11:40
e82f9e2
[SYCL] Fix batched impl for NVidia GPU (#6164)

* Fix batched impl

* Maintain previous behaviour for igpu

* retrigger CI

---------

Co-authored-by: Abhilash Majumder