b3230 #188

Nexesenex · 2024-06-25T21:20:44Z

I have read the contributing guidelines
Self-reported review complexity:
- Low
- Medium
- High

…rompt (#7950)

* SimpleChat: Allow for chat req bool options to be user controlled
* SimpleChat: Allow user to control cache_prompt flag in request
* SimpleChat: Add sample GUI images to readme file
  Show the chat screen and the settings screen
* SimpleChat:Readme: Add quickstart block, title to image, cleanup
* SimpleChat: RePosition contents of the Info and Settings UI
  Make it more logically structured and flow through.
* SimpleChat: Rename to apiRequestOptions from chatRequestOptions
  So that it is not wrongly assumed that these request options are used only for chat/completions endpoint. Rather these are used for both the end points, so rename to match semantic better.
* SimpleChat: Update image included with readme wrt settings ui
* SimpleChat:ReadMe: Switch to webp screen image to reduce size

* add chat template support for llama-cli
* add help message
* server: simplify format_chat
* more consistent naming
* improve
* add llama_chat_format_example
* fix server
* code style
* code style
* Update examples/main/main.cpp
  Co-authored-by: Georgi Gerganov <[email protected]>

---------

Co-authored-by: Georgi Gerganov <[email protected]>

…#8054)

* gguf-dump: add --data-offset
* gguf-dump: add tensor data offset table
* gguf-dump: refactor GGUFReader for clarity
* gguf-dump: add --data-alignment
* gguf-dump.py: Rename variables and adjust comments
  start_data_offset --> data_offset
  _build_tensors_info_fields --> _build_tensor_info

…Maximum (#7797)

* json: support minimum for positive integer values
* json: fix min 0
* json: min + max integer constraints
* json: handle negative min / max integer bounds
* json: fix missing paren min/max bug
* json: proper paren fix
* json: integration test for schemas
* json: fix bounds tests
* Update json-schema-to-grammar.cpp
* json: fix negative max
* json: fix negative min (w/ more than 1 digit)
* Update test-grammar-integration.cpp
* json: nit: move string rules together
* json: port min/max integer support to Python & JS
* nit: move + rename _build_min_max_int
* fix min in [1, 9]
* Update test-grammar-integration.cpp
* add C++11-compatible replacement for std::string_view
* add min/max constrained int field to pydantic json schema example
* fix merge
* json: add integration tests for min/max bounds
* reshuffle/merge min/max integ test cases
* nits / cleanups
* defensive code against string out of bounds (apparently different behaviour of libstdc++ vs. clang's libc++, can't read final NULL char w/ former)

* llama : return nullptr from llama_grammar_init

  This commit updates llama_grammar_init to return nullptr instead of throwing an exception.

  The motivation for this is that this function is declared inside an extern "C" block and is intended/may be used from C code which will not be able to handle exceptions thrown, and results in undefined behavior.

  On Windows and using MSVC the following warning is currently generated:

  ```console
  C:\llama.cpp\llama.cpp(13998,1): warning C4297: 'llama_grammar_init': function assumed not to throw an exception but does
  C:\llama.cpp\llama.cpp(13998,1): message : __declspec(nothrow), throw(), noexcept(true), or noexcept was specified on the function
  ```

  Signed-off-by: Daniel Bevenius <[email protected]>

* squash! llama : return nullptr from llama_grammar_init

  Add checks for nullptr when calling llama_grammar_init.

  Signed-off-by: Daniel Bevenius <[email protected]>

---------

Signed-off-by: Daniel Bevenius <[email protected]>
Co-authored-by: Clint Herron <[email protected]>
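For readers unfamiliar with the pattern this commit message describes, here is a minimal sketch of a C-callable function that reports failure as `nullptr` instead of letting a C++ exception escape. It is not the actual llama.cpp change: `demo_grammar`, `demo_grammar_build`, `demo_grammar_init` and `demo_grammar_free` are hypothetical names invented for illustration, and the real `llama_grammar_init` may avoid throwing in a different way.

```cpp
#include <cstdio>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a grammar object; NOT the real llama_grammar.
struct demo_grammar {
    std::string root_rule;
};

// Internal C++ helper (assumed for this sketch) that throws on invalid input.
static demo_grammar * demo_grammar_build(const char * root_rule) {
    if (root_rule == nullptr || root_rule[0] == '\0') {
        throw std::runtime_error("empty root rule");
    }
    return new demo_grammar{root_rule};
}

extern "C" {

// C-callable entry point: no exception may cross this boundary,
// so every failure is converted into a nullptr return value.
demo_grammar * demo_grammar_init(const char * root_rule) noexcept {
    try {
        return demo_grammar_build(root_rule);
    } catch (const std::exception & err) {
        std::fprintf(stderr, "demo_grammar_init: %s\n", err.what());
        return nullptr;
    } catch (...) {
        return nullptr;
    }
}

// Matching release function; deleting a nullptr is a no-op.
void demo_grammar_free(demo_grammar * grammar) noexcept {
    delete grammar;
}

} // extern "C"
```

Callers then follow the second half of the commit ("Add checks for nullptr"): test the returned pointer before using it, and treat `nullptr` as "grammar could not be created".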

…milies (#5763)

* llama : add T5 model architecture, tensors and model header parameters
* llama : add implementation of Unigram tokenizer with SentencePiece-like text normalization using precompiled charsmap

---------

Co-authored-by: Stanisław Szymczyk <[email protected]>

…llama.cpp` [needs testing] (#8060)

* fixes #7999
  The `build_command_r` forgot to add the control vector.
* Fixes qwen2 too
* Fixed all models' control vectors
* Removed double calls to `cb(cur, "l_out", il)`
* Moved control vector logic to llama_control_vector:apply_to()

hanishkvc and others added 10 commits June 25, 2024 21:27

cvector: better prompt handling, add "mean vector" method (#8069)

49c03c7

* remove completions file
* fix inverted vector
* add mean method
* code style
* remove inverted pca hotfix

Add healthchecks to llama-server containers (#8081)

925c309

* added healthcheck
* added healthcheck
* added healthcheck
* added healthcheck
* added healthcheck
* moved curl to base
* moved curl to base

disable docker CI on pull requests (#8110)

dd047b4

github-actions bot added testing examples python server devops labels Jun 25, 2024

Nexesenex merged commit 40d2c9d into Nexesenex:marstream Jun 25, 2024
30 of 35 checks passed
