llama : infill sampling handle very long tokens #9924

ggerganov · 2024-10-17T13:05:01Z

The infill sampler now correctly handles tokens with very long text (e.g. line indentation). The token-merging logic should also be clearer.

API changes

Remove the recently added llama_token_is_prefix. It was technically incorrect for very long tokens, and I don't want it to allocate a large stack buffer, since there is no upper bound on the token string length.
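To illustrate the concern, here is a minimal sketch of a prefix check that does not depend on a fixed-size stack buffer. It is not the code from this PR. `piece_into`, `token_text`, and `is_prefix` are hypothetical names, and the toy `vocab` vector stands in for a real tokenizer; the only assumption borrowed from the usual C-style API shape is that the lookup returns the negative of the required size when the buffer is too small.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a tokenizer lookup (not a llama.cpp function):
// copies the text of `token` into `buf` and returns its length, or the
// negative of the required size when `buf` is too small.
static int32_t piece_into(const std::vector<std::string> & vocab, int32_t token, char * buf, int32_t cap) {
    const std::string & s = vocab[token];
    if ((int32_t) s.size() > cap) {
        return -(int32_t) s.size();
    }
    std::memcpy(buf, s.data(), s.size());
    return (int32_t) s.size();
}

// Fetch a token's text into a heap-backed string that grows on demand,
// so arbitrarily long tokens never require a large stack allocation.
static std::string token_text(const std::vector<std::string> & vocab, int32_t token) {
    std::string out(16, '\0'); // small initial guess
    int32_t n = piece_into(vocab, token, &out[0], (int32_t) out.size());
    if (n < 0) {
        out.resize(-n);        // retry with the exact size reported
        n = piece_into(vocab, token, &out[0], (int32_t) out.size());
    }
    out.resize(n);
    return out;
}

// True when `a` is a prefix of `b`, for strings of any length.
static bool is_prefix(const std::string & a, const std::string & b) {
    return a.size() <= b.size() && b.compare(0, a.size(), a) == 0;
}
```

With this shape, a 64-space indentation token is compared against a 4-space one without truncation, at the cost of a heap allocation per lookup; a real sampler would likely reuse buffers across calls.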

ggml-ci

* llama : infill sampling handle very long tokens ggml-ci
* cont : better indices ggml-ci

ggerganov added 2 commits October 17, 2024 16:00

llama : infill sampling handle very long tokens

99c4a39

ggml-ci

cont : better indices

7899c67

ggml-ci

ggerganov mentioned this pull request Oct 17, 2024

server : add n_indent parameter for line indentation requirement #9929

Merged

ggerganov merged commit 99bd4ac into master Oct 17, 2024
60 checks passed

ggerganov deleted the gg/infill-4 branch October 17, 2024 19:32

drollings pushed a commit to drollings/llama.cpp that referenced this pull request Oct 18, 2024

llama : infill sampling handle very long tokens (ggml-org#9924)

b440d03

* llama : infill sampling handle very long tokens ggml-ci
* cont : better indices ggml-ci

dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024

llama : infill sampling handle very long tokens (ggml-org#9924)

d0c3418

* llama : infill sampling handle very long tokens ggml-ci
* cont : better indices ggml-ci

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024

llama : infill sampling handle very long tokens (ggml-org#9924)

d066fa7

* llama : infill sampling handle very long tokens ggml-ci
* cont : better indices ggml-ci

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024

llama : infill sampling handle very long tokens (ggml-org#9924)

be7c08c

* llama : infill sampling handle very long tokens ggml-ci
* cont : better indices ggml-ci
