Support per-request seed #1211
Comments
Yeah, this would be very helpful. I had a client who moved away from vLLM to TGI because of this. vLLM was giving 20% better throughput / requests per second, but there was a significant repetition problem with vLLM: if the user sent the same message again, the LLM would respond in exactly the same way. Here's a (made up) example of the sort of issue we'd see:

User: hello there

We found this was not affected at all by increasing the repetition penalty. I haven't re-tested this since there were improvements to the repetition penalty controls, so maybe it's a bit better now. But the fact that the seed is always the same for every request IMHO greatly increases the chance of this sort of repeated generation. If the seed can be randomised on every request, then combined with a modest repetition penalty this kind of verbatim repetition should be far less likely.

Thanks in advance for this enhancement!
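For illustration, here is a minimal, self-contained sketch of what per-request seeding could look like conceptually: each request carries its own `torch.Generator`, seeded from entropy when no seed is supplied (so identical prompts can diverge) and seeded deterministically when one is (so a request is reproducible). The `RequestState` class and `sample_next_token` helper are hypothetical names used only for this sketch; this is not vLLM's actual sampling code or API, just the behaviour being requested.

```python
from typing import Optional

import torch


class RequestState:
    """Hypothetical per-request state holding a dedicated RNG."""

    def __init__(self, seed: Optional[int] = None):
        self.generator = torch.Generator()
        if seed is None:
            # No seed given: use an entropy-based seed so repeated
            # identical prompts need not produce identical output.
            self.generator.seed()
        else:
            # Explicit seed: the request is fully reproducible.
            self.generator.manual_seed(seed)


def sample_next_token(logits: torch.Tensor,
                      state: RequestState,
                      temperature: float = 1.0) -> int:
    """Sample one token id from logits using the request's own generator."""
    probs = torch.softmax(logits / temperature, dim=-1)
    token = torch.multinomial(probs, num_samples=1,
                              generator=state.generator)
    return int(token.item())


if __name__ == "__main__":
    logits = torch.randn(32000)  # fake vocabulary logits
    # Two unseeded requests: likely different tokens for the same logits.
    print(sample_next_token(logits, RequestState()))
    print(sample_next_token(logits, RequestState()))
    # Two requests with the same explicit seed: identical tokens.
    print(sample_next_token(logits, RequestState(seed=42)))
    print(sample_next_token(logits, RequestState(seed=42)))
```

The point of the sketch is that the generator lives on the request rather than on the engine, so randomising or fixing the seed becomes a per-request choice exposed through the sampling parameters.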
Is there somewhere you could point me to in the code where this would need to be implemented?
Any update on this?
Although part of that problem is that there's no per-request seed, something we also really need.
Originally posted by @TheBloke in #866 (comment)