[Feature] support min_p sampling #1071

81549361 · 2024-08-13T09:44:52Z

Motivation

The min_p sampling parameter is becoming quite popular. It is conceptually simple and intuitive, and (at least anecdotally, according to many model fine-tuners and users in the LocalLlama community) it tends to perform better than the usual top_p + top_k approach. The READMEs of many new fine-tunes and merges on Hugging Face recommend using min_p instead of top_p and top_k.

Some of the code has been implemented in flashinfer.
flashinfer-ai/flashinfer#422

Related resources
vLLM: https://github.com/vllm-project/vllm/blob/8ea5e44a435e8731fd6f5ba4c329dd112752532a/vllm/sampling_params.py#L64C9-L66C57
min_p: Float that represents the minimum probability for a token to be considered, relative to the probability of the most likely token. Must be in [0, 1]. Set to 0 to disable this.

For example, a min_p of 0.07 means that a token whose probability is less than 7% of the highest-probability token's probability is disqualified. A min_p of 0.5 means that a token must have at least half the probability of the most likely token, or it is disqualified. Put another way, min_p sets a minimum fraction of the most likely token's probability, below which a token cannot be sampled.
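
As a rough illustration of the rule described above, here is a minimal pure-Python sketch. The function name `min_p_filter` is hypothetical and is not taken from vLLM, FlashInfer, or SGLang. It assumes `probs` is already a valid probability distribution (for example, the output of a softmax).

```python
def min_p_filter(probs, min_p):
    """Drop tokens below min_p * max(probs), then renormalize.

    Hypothetical helper for illustration only; assumes `probs` is a
    non-empty list of non-negative floats that sums to 1.
    """
    if not 0.0 <= min_p <= 1.0:
        raise ValueError("min_p must be in [0, 1]")
    # The cutoff scales with the probability of the most likely token.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    # The most likely token always survives, so `total` is positive.
    total = sum(kept)
    return [p / total for p in kept]


probs = [0.50, 0.30, 0.15, 0.04, 0.01]
# min_p = 0.07 gives a cutoff of 0.035, so only the last token is dropped.
loose = min_p_filter(probs, 0.07)
# min_p = 0.5 gives a cutoff of 0.25, so only the first two tokens remain.
strict = min_p_filter(probs, 0.5)
```

Because the cutoff is relative to the top token, the filter keeps more candidates when the distribution is flat and fewer when the model is confident.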

vllm-project/vllm#1642
oobabooga/text-generation-webui#4449
ggml-org/llama.cpp#3841
Please see the above links for more info.


zhyncs · 2024-08-13T09:48:32Z

Yes, the corresponding kernel has already been implemented in FlashInfer. I think it shouldn't be too difficult to integrate into SGLang. Are you interested in submitting a PR? We highly welcome contributions!

81549361 · 2024-08-13T12:46:30Z

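> Yes, the corresponding kernel has already been implemented in FlashInfer. I think it shouldn't be too difficult to integrate into SGLang. Are you interested in submitting a PR? We highly welcome contributions!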

I'd love to help, but I'm a newbie. I only know how to add min_p sampling on its own; I don't know how to use it together with top_k and top_p.
81549361@79e8e8d
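
On the question of combining the three filters, one possible approach is sketched below in pure Python. This is an assumption-laden illustration, not the method used in the linked commit, in #1167, or in FlashInfer. The function name `filter_top_k_top_p_min_p` is hypothetical, and all three cutoffs are evaluated against the original distribution. Real implementations may instead renormalize between steps or apply the filters in a different order, which can change the result.

```python
def filter_top_k_top_p_min_p(probs, top_k, top_p, min_p):
    """Keep tokens that pass top_k, top_p and min_p together, then renormalize.

    Hypothetical helper for illustration only; assumes `probs` is a
    non-empty list of non-negative floats that sums to 1.
    """
    # Token indices sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    threshold = min_p * probs[order[0]]
    keep = set()
    cumulative = 0.0
    for rank, i in enumerate(order):
        if rank >= top_k:           # top_k: at most k tokens
            break
        if cumulative >= top_p:     # top_p: nucleus is already covered
            break
        if probs[i] < threshold:    # min_p: too unlikely relative to the top token
            break
        keep.add(i)
        cumulative += probs[i]
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


probs = [0.50, 0.30, 0.15, 0.04, 0.01]
# With top_k and top_p effectively disabled, only min_p removes tokens.
filtered = filter_top_k_top_p_min_p(probs, top_k=5, top_p=1.0, min_p=0.07)
```

Since the tokens are visited in descending order of probability, each of the three conditions stays false once it becomes false, so stopping at the first failure is equivalent to intersecting the three masks.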

merrymercy · 2024-08-23T22:06:32Z

closed by #1167

intervitens mentioned this issue on Aug 20, 2024: Support min-p sampling #1167 (merged)

merrymercy closed this as completed on Aug 23, 2024
