Your current environment
installed vLLM + GPU env.
How would you like to use vLLM
I have done QLoRA training against an AWQ-quantized base model. Is it possible to use vLLM to load the AWQ base model and run inference with the LoRA weights directly, without merging them into the base model?
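For reference, a minimal sketch of what this would look like with vLLM's offline API: pass `quantization="awq"` and `enable_lora=True` when constructing the `LLM`, then attach the unmerged adapter per request via a `LoRARequest`. The model name and adapter path below are placeholders, and whether LoRA-on-quantized-base is supported may depend on the vLLM version installed.

```python
# Hypothetical identifiers -- substitute your own AWQ checkpoint and adapter.
BASE_MODEL = "TheBloke/Llama-2-7B-Chat-AWQ"   # an AWQ-quantized base model
ADAPTER_PATH = "/path/to/qlora-adapter"        # directory with adapter_config.json etc.


def main() -> None:
    # Imports are deferred so the sketch can be read without vLLM installed.
    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    # Load the AWQ base model with LoRA support enabled.
    llm = LLM(model=BASE_MODEL, quantization="awq", enable_lora=True)

    sampling = SamplingParams(temperature=0.7, max_tokens=64)

    # LoRARequest(name, unique_int_id, path) attaches the unmerged adapter
    # to this generation call only; the base weights stay quantized.
    outputs = llm.generate(
        "What is quantization?",
        sampling,
        lora_request=LoRARequest("qlora_adapter", 1, ADAPTER_PATH),
    )
    for out in outputs:
        print(out.outputs[0].text)


if __name__ == "__main__":
    main()
```

Serving-side, the equivalent would be starting the server with `--quantization awq --enable-lora --lora-modules name=/path/to/adapter` and selecting the adapter by name in the request.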