"Deepseek2 does not support K-shift" Denial-of-Service vulnerability #10380
You can also disable K-shift by disabling context shifting via this argument:
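For example, with a recent llama-server build the flag is presumably `--no-context-shift` (an assumption; check `llama-server --help` on your build to confirm the exact name):

```sh
# Sketch, assuming the --no-context-shift flag and a placeholder model file:
# start llama-server with context shifting disabled so that generations which
# exceed the context stop instead of attempting an unsupported K-shift.
./llama-server -m model.gguf --no-context-shift
```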
@ggerganov Hi! I also ran into this problem in Ollama: when I send a long prompt to DeepSeek-V2, it hits the K-shift error. How can I set that parameter in Ollama? In any case, I think the model server should not crash.
I had this problem too. Have you solved it yet?
Long prompts/responses crash llama-server because "Deepseek2 does not support K-shift". For long prompts/responses, llama-server should return an error message or truncate the response, but instead GGML_ABORT is called, which crashes the server. I believe that this is a Denial-of-Service vulnerability. A client should never be able to trigger GGML_ABORT.
The relevant line in the code is here:
https://github.com/ggerganov/llama.cpp/blob/9b75f03cd2ec9cc482084049d87a0f08f9f01517/src/llama.cpp#L18032
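To make the suggestion concrete, here is a minimal, self-contained sketch of the proposed behavior. It is not the actual llama.cpp code, and every identifier in it is hypothetical; it only illustrates reporting the unsupported K-shift to the caller so the server can return an error or stop generation instead of aborting the process:

```cpp
// Hypothetical sketch: propagate an error instead of aborting the process.
// None of these identifiers are taken from llama.cpp; they only illustrate
// the "return an error, don't crash" pattern suggested in this issue.
#include <cstdio>

enum kshift_result {
    KSHIFT_OK = 0,
    KSHIFT_ERR_UNSUPPORTED = 1,
};

struct model_caps {
    bool supports_k_shift; // false for DeepSeek-V2-style models in this sketch
};

// Instead of calling GGML_ABORT when the model cannot shift its K cache,
// report the condition to the caller so the server can return an error
// (or stop generation) without taking the whole process down.
static kshift_result apply_k_shift(const model_caps & caps) {
    if (!caps.supports_k_shift) {
        std::fprintf(stderr, "error: this model does not support K-shift; "
                             "stopping generation instead of aborting\n");
        return KSHIFT_ERR_UNSUPPORTED;
    }
    // ... perform the actual K-cache shift here ...
    return KSHIFT_OK;
}

int main() {
    const model_caps deepseek_like = { /*supports_k_shift =*/ false };
    if (apply_k_shift(deepseek_like) != KSHIFT_OK) {
        // A server loop would translate this into an error response or a
        // truncated completion, rather than calling GGML_ABORT.
        return 0;
    }
    return 0;
}
```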
I reported this security vulnerability almost three months ago here (link only visible to maintainers), but have received no response, and it is public knowledge now anyway, so I also opened this issue to increase visibility.
Discussed in #9092
Originally posted by 99991 August 19, 2024
It is my understanding that llama.cpp shifts the key-value cache when generating more tokens than fit into the context window, which is not supported for DeepSeek Coder V2. To reproduce, start a server with this model
and then request a prompt completion:
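A minimal reproduction might look like the following sketch; the model file name, context size, port, and request body are assumptions, not the original commands:

```sh
# Start the server with a DeepSeek Coder V2 GGUF and a small context window
# (file name, context size, and port are placeholders).
./llama-server -m DeepSeek-Coder-V2-Lite-Instruct-Q4_K_M.gguf -c 512 --port 8080

# Request more tokens than fit into the context window, which forces a
# key-value cache shift that this architecture does not support.
curl http://localhost:8080/completion -d '{
  "prompt": "Write a very long story.",
  "n_predict": 4096
}'
```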
This should trigger the "Deepseek2 does not support K-shift" error with llama.cpp release b3600.
The corresponding code in llama.cpp is here:
https://github.com/ggerganov/llama.cpp/blob/cfac111e2b3953cdb6b0126e67a2487687646971/src/llama.cpp#L15643C31-L15648C1
I believe a saner approach would be to simply stop generating tokens instead of crashing the server. Is there an option that can be set to prevent clients from crashing the server?