Releases · ngxson/llama.cpp
b2986
readme : remove trailing space (#7469)
b2879
server: free sampling contexts on exit (#7264)
* server: free sampling contexts on exit
  This cleans up the last leak found by the address sanitizer.
* fix whitespace
* fix whitespace
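A minimal sketch of the cleanup pattern, assuming the `common/sampling.h` API of this era (`llama_sampling_init`/`llama_sampling_free`); `server_slot` is an illustrative stand-in for the real server's slot type, not the actual code from the PR:

```cpp
#include <vector>
#include "sampling.h" // common/sampling.h in the llama.cpp tree

struct server_slot {
    llama_sampling_context * ctx_sampling = nullptr; // created via llama_sampling_init()
};

// called once on shutdown: without this, the address sanitizer flags every
// slot's sampling context as a leak when the process exits
static void free_all_sampling_contexts(std::vector<server_slot> & slots) {
    for (auto & slot : slots) {
        if (slot.ctx_sampling != nullptr) {
            llama_sampling_free(slot.ctx_sampling);
            slot.ctx_sampling = nullptr;
        }
    }
}
```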
b2821
JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143)
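For context, a minimal sketch of why this change matters, using nlohmann::json (which llama.cpp vendors): `operator[]` on a missing key silently yields null (and, on a non-const object, inserts it), while `.at()` throws `json::out_of_range` so malformed requests fail loudly. The field names here are illustrative:

```cpp
#include <cstdio>
#include <string>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

int main() {
    json body = {{"prompt", "hello"}};

    try {
        std::string prompt = body.at("prompt").get<std::string>(); // key exists: ok
        int n_predict = body.at("n_predict").get<int>();           // missing key: throws
        std::printf("prompt=%s n_predict=%d\n", prompt.c_str(), n_predict);
    } catch (const json::out_of_range & e) {
        // with operator[] this would instead have been a silent null
        std::printf("bad request: %s\n", e.what());
    }
    return 0;
}
```

The `assert() -> GGML_ASSERT` half of the change serves the same goal: `assert()` compiles away in release builds, while `GGML_ASSERT` keeps the check in production binaries.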
b2809
compare-llama-bench.py: add missing basicConfig (#7138)
* compare-llama-bench.py: add missing basicConfig
* compare-llama-bench.py: Add line break between error message and print_help()
* Add regular print() markdown table
b2786
If first token generated from the server is the stop word the server …
b2724
llama : add llama_get_pooling_type function (#6862)
* add llama_get_pooling_type function
* fix argument name, move with ctx funcs
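A hedged sketch of using the new accessor to check how a context pools token embeddings before reading them back. The function name follows the PR title (later llama.cpp versions expose this as `llama_pooling_type()`), and `"model.gguf"` is a placeholder path:

```cpp
#include <cstdio>
#include "llama.h"

int main() {
    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_load_model_from_file("model.gguf", mparams);
    if (model == nullptr) { return 1; }

    llama_context_params cparams = llama_context_default_params();
    cparams.embeddings = true; // we want embedding output
    llama_context * ctx = llama_new_context_with_model(model, cparams);

    // LLAMA_POOLING_TYPE_NONE means one vector per token;
    // MEAN/CLS mean one pooled vector per sequence
    enum llama_pooling_type pt = llama_get_pooling_type(ctx);
    std::printf("pooling type: %d\n", (int) pt);

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```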
b2710
`build`: generate hex dump of server assets during build (#6661)
* `build`: generate hex dumps of server assets on the fly
* build: workaround lack of -n on gnu xxd
* build: don't use xxd in cmake
* build: don't call xxd from build.zig
* build: more idiomatic hexing
* build: don't use xxd in Makefile (od hackery instead)
* build: avoid exceeding max cmd line limit in makefile hex dump
* build: hex dump assets at cmake build time (not config time)
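As a rough sketch of the technique (not the project's actual generator), here is a portable stand-in for `xxd -i`: it converts a binary asset into a C array that can be compiled straight into the server binary, with the input path and symbol name taken from the command line:

```cpp
#include <cstdio>
#include <fstream>
#include <iterator>
#include <vector>

int main(int argc, char ** argv) {
    if (argc < 3) {
        std::fprintf(stderr, "usage: %s <input file> <symbol name>\n", argv[0]);
        return 1;
    }
    std::ifstream in(argv[1], std::ios::binary);
    if (!in) {
        std::fprintf(stderr, "cannot open %s\n", argv[1]);
        return 1;
    }
    std::vector<unsigned char> data((std::istreambuf_iterator<char>(in)),
                                     std::istreambuf_iterator<char>());

    // emit a C array plus a length symbol, in the same shape `xxd -i` produces
    std::printf("unsigned char %s[] = {", argv[2]);
    for (size_t i = 0; i < data.size(); ++i) {
        if (i % 12 == 0) std::printf("\n  ");
        std::printf("0x%02x,", data[i]);
    }
    std::printf("\n};\nunsigned int %s_len = %zu;\n", argv[2], data.size());
    return 0;
}
```

Run at build time, e.g. `./hexdump index.html index_html > index_html.hpp` (names hypothetical); doing this at build time rather than configure time, as the PR's last commit notes, keeps the generated headers in sync with the assets.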
b2690
llamafile : tmp disable + build sgemm.o when needed (#6716)
* build : sgemm.o only when needed (ggml-ci)
* llamafile : tmp disable due to MoE bug (ggml-ci)
b2581
ci: bench: fix "Resource not accessible by integration" on PR event (#6…
b2548
[SYCL] Fix batched impl for NVidia GPU (#6164)
* Fix batched impl
* Maintain previous behaviour for igpu
* retrigger CI
---------
Co-authored-by: Abhilash Majumder <[email protected]>