Releases · ngxson/llama.cpp
b2986
readme : remove trailing space (#7469)
b2879
server: free sampling contexts on exit (#7264)
* server: free sampling contexts on exit
  This cleans up the last leak found by the address sanitizer.
* fix whitespace
* fix whitespace
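A minimal sketch of the cleanup pattern, assuming the `common/sampling.h` API of this era (`llama_sampling_init`/`llama_sampling_free`); `server_slot` is an illustrative stand-in for the real server's slot type, not the actual code from the PR:

```cpp
#include <vector>
#include "sampling.h" // common/sampling.h in the llama.cpp tree

struct server_slot {
    llama_sampling_context * ctx_sampling = nullptr; // created via llama_sampling_init()
};

// called once on shutdown: without this, the address sanitizer flags every
// slot's sampling context as a leak when the process exits
static void free_all_sampling_contexts(std::vector<server_slot> & slots) {
    for (auto & slot : slots) {
        if (slot.ctx_sampling != nullptr) {
            llama_sampling_free(slot.ctx_sampling);
            slot.ctx_sampling = nullptr;
        }
    }
}
```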
b2821
JSON: [key] -> .at(key), assert() -> GGML_ASSERT (#7143)
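For context, a minimal sketch of why this change matters, using nlohmann::json (which llama.cpp vendors): `operator[]` on a missing key silently yields null (and, on a non-const object, inserts it), while `.at()` throws `json::out_of_range` so malformed requests fail loudly. The field names here are illustrative:

```cpp
#include <cstdio>
#include <string>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

int main() {
    json body = {{"prompt", "hello"}};

    try {
        std::string prompt = body.at("prompt").get<std::string>(); // key exists: ok
        int n_predict = body.at("n_predict").get<int>();           // missing key: throws
        std::printf("prompt=%s n_predict=%d\n", prompt.c_str(), n_predict);
    } catch (const json::out_of_range & e) {
        // with operator[] this would instead have been a silent null
        std::printf("bad request: %s\n", e.what());
    }
    return 0;
}
```

The `assert() -> GGML_ASSERT` half of the change serves the same goal: `assert()` compiles away in release builds, while `GGML_ASSERT` keeps the check in production binaries.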
b2809
compare-llama-bench.py: add missing basicConfig (#7138)
* compare-llama-bench.py: add missing basicConfig
* compare-llama-bench.py: Add line break between error message and print_help()
* Add regular print() markdown table
b2786
If first token generated from the server is the stop word the server …
b2724
llama : add llama_get_pooling_type function (#6862)
* add llama_get_pooling_type function
* fix argument name, move with ctx funcs
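A hedged sketch of using the new accessor to check how a context pools token embeddings before reading them back. The function name follows the PR title (later llama.cpp versions expose this as `llama_pooling_type()`), and `"model.gguf"` is a placeholder path:

```cpp
#include <cstdio>
#include "llama.h"

int main() {
    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_load_model_from_file("model.gguf", mparams);
    if (model == nullptr) { return 1; }

    llama_context_params cparams = llama_context_default_params();
    cparams.embeddings = true; // we want embedding output
    llama_context * ctx = llama_new_context_with_model(model, cparams);

    // LLAMA_POOLING_TYPE_NONE means one vector per token;
    // MEAN/CLS mean one pooled vector per sequence
    enum llama_pooling_type pt = llama_get_pooling_type(ctx);
    std::printf("pooling type: %d\n", (int) pt);

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```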
b2710
`build`: generate hex dump of server assets during build (#6661)
* `build`: generate hex dumps of server assets on the fly
* build: workaround lack of -n on gnu xxd
* build: don't use xxd in cmake
* build: don't call xxd from build.zig
* build: more idiomatic hexing
* build: don't use xxd in Makefile (od hackery instead)
* build: avoid exceeding max cmd line limit in makefile hex dump
* build: hex dump assets at cmake build time (not config time)
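As a rough sketch of the technique (not the project's actual generator), here is a portable stand-in for `xxd -i`: it converts a binary asset into a C array that can be compiled straight into the server binary, with the input path and symbol name taken from the command line:

```cpp
#include <cstdio>
#include <fstream>
#include <iterator>
#include <vector>

int main(int argc, char ** argv) {
    if (argc < 3) {
        std::fprintf(stderr, "usage: %s <input file> <symbol name>\n", argv[0]);
        return 1;
    }
    std::ifstream in(argv[1], std::ios::binary);
    if (!in) {
        std::fprintf(stderr, "cannot open %s\n", argv[1]);
        return 1;
    }
    std::vector<unsigned char> data((std::istreambuf_iterator<char>(in)),
                                     std::istreambuf_iterator<char>());

    // emit a C array plus a length symbol, in the same shape `xxd -i` produces
    std::printf("unsigned char %s[] = {", argv[2]);
    for (size_t i = 0; i < data.size(); ++i) {
        if (i % 12 == 0) std::printf("\n  ");
        std::printf("0x%02x,", data[i]);
    }
    std::printf("\n};\nunsigned int %s_len = %zu;\n", argv[2], data.size());
    return 0;
}
```

Run at build time, e.g. `./hexdump index.html index_html > index_html.hpp` (names hypothetical); doing this at build time rather than configure time, as the PR's last commit notes, keeps the generated headers in sync with the assets.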
b2690
llamafile : tmp disable + build sgemm.o when needed (#6716)
* build : sgemm.o only when needed (ggml-ci)
* llamafile : tmp disable due to MoE bug (ggml-ci)
b2581
ci: bench: fix "Resource not accessible by integration" on PR event (#6…
b2548
[SYCL] Fix batched impl for NVidia GPU (#6164)
* Fix batched impl
* Maintain previous behaviour for igpu
* retrigger CI
---------
Co-authored-by: Abhilash Majumder <[email protected]>