
[Bug]: qwen cannot be quantized in vllm #10263

Closed · 1 task done
yananchen1989 opened this issue Nov 12, 2024 · 4 comments
Labels: bug (Something isn't working)

Comments

@yananchen1989 commented Nov 12, 2024

Your current environment

GPU: A10
vLLM version: 0.6.3.post1

Model Input Dumps

No response

🐛 Describe the bug

For the Qwen series, such as Qwen/Qwen2.5-7B-Instruct, it seems that vLLM cannot apply quantization, whether with bitsandbytes or AWQ. Even the Unsloth variant, unsloth/Qwen2.5-7B-Instruct-bnb-4bit, does not work.

Error message:
AttributeError: Model Qwen2ForCausalLM does not support BitsAndBytes quantization yet.
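
For context, a launch command along the following lines (an assumed reproduction; the report doesn't include the exact invocation) triggers the error above on 0.6.3.post1. Note that bitsandbytes loading also needs --load-format bitsandbytes, matching the command mgoin shows below:

vllm serve Qwen/Qwen2.5-7B-Instruct --quantization bitsandbytes --load-format bitsandbytes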

Before submitting a new issue...

  • Make sure you have already searched for relevant issues and asked the chatbot at the bottom right corner of the documentation page, which can answer many frequently asked questions.
yananchen1989 added the bug (Something isn't working) label Nov 12, 2024
@jeejeelee (Collaborator) commented Nov 12, 2024

The current release indeed doesn't support this. It should be supported in the upcoming release; see #8941. You can either build from the main branch or follow the latest-code installation doc: https://docs.vllm.ai/en/latest/getting_started/installation.html#install-the-latest-code
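
For reference, a source build from main looks roughly like this (a sketch of the standard editable install; the linked doc also covers pre-built nightly wheels, which avoid the compile step):

# clone vLLM and install the main branch from source (compiles CUDA kernels, can take a while)
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -e .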

@yananchen1989 (Author) commented

Thanks.

Could you also take a look at the Phi series, such as microsoft/Phi-3.5-mini-instruct? It hits the same issue as Qwen: a bnb quantization error.

@yananchen1989 (Author) commented

@jeejeelee

@mgoin (Member) commented Nov 12, 2024

Hi @yananchen1989, Phi is also supported with bitsandbytes on vLLM main, so please wait for the next release.

I have tested with:

vllm serve unsloth/Phi-3.5-mini-instruct-bnb-4bit --quantization bitsandbytes --load-format bitsandbytes
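
Once the server is up, it exposes an OpenAI-compatible API (on port 8000 by default), so a quick smoke test could look like the following sketch (prompt and max_tokens are arbitrary):

# send a chat completion request to the locally served bnb-quantized model
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "unsloth/Phi-3.5-mini-instruct-bnb-4bit",
        "messages": [{"role": "user", "content": "Say hello."}],
        "max_tokens": 32
      }'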
