[LoRA] Adds support for bias in LoRA #5733
Conversation
Could we add an argument to the engine, enable_lora_bias, and avoid initializing the bias tensors if it's false? If the user knows none of their LoRAs will have bias, we can save memory.
@Yard1 Thanks for reviewing the PR. I have added the enable_lora_bias flag (default set to false), which prevents the allocation of LoRA bias tensors when false.
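For reference, a minimal sketch of how the flag might be passed at the engine level, assuming the keyword is named enable_lora_bias as discussed above (the model name is only a placeholder):

from vllm import EngineArgs

# Sketch only: enable_lora_bias is the argument discussed above; it is assumed
# to default to False so no bias tensors are allocated unless requested.
engine_args = EngineArgs(
    model="meta-llama/Llama-2-7b-hf",   # placeholder model
    enable_lora=True,
    enable_lora_bias=True,              # opt in to LoRA adapters that carry bias
)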
Related: #5930
Looks good, can we also add an e2e test?
@Yard1 Thanks for reviewing. I've added an e2e test for the lora_bias support.
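For context, an e2e-style test along these lines might look roughly like the sketch below; the model name and adapter path are placeholders, and the enable_lora_bias keyword is assumed from the discussion above, so this is not the PR's actual test code:

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

def test_lora_bias_generates():
    # Placeholder model and adapter path; the adapter is assumed to contain bias tensors.
    llm = LLM(model="meta-llama/Llama-2-7b-hf",
              enable_lora=True,
              enable_lora_bias=True)
    outputs = llm.generate(
        ["Hello, my name is"],
        SamplingParams(max_tokens=16),
        lora_request=LoRARequest("bias-adapter", 1, "/path/to/bias-adapter"))
    assert outputs and outputs[0].outputs[0].text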
@followumesh you need to run …
@njhill I have addressed your comments above. Can you please review this again? Thanks.
Thanks @followumesh and sorry for the delay.
There's one remaining small but important thing to fix (and tests are failing because of it).
vllm/lora/models.py (outdated)
if not self.lora_config.bias_enabled:
    module_lora.bias = None
    raise ValueError(
        f"Adapter bias cannot be used for {module_name}"
        " without --enable-lora-bias.")
This doesn't look right and is causing blanket lora failures. I think it should be:
Suggested change:

- if not self.lora_config.bias_enabled:
-     module_lora.bias = None
-     raise ValueError(
-         f"Adapter bias cannot be used for {module_name}"
-         " without --enable-lora-bias.")
+ if module_lora.bias is not None and not self.lora_config.bias_enabled:
+     raise ValueError(
+         f"Adapter bias cannot be used for {module_name}"
+         " without --enable-lora-bias.")
Incorporated the comment.
):
    self.reset_lora(index)

    if self.tp_size > 1:
        lora_a = self.slice_lora_a(lora_a)
        lora_b = self.slice_lora_b(lora_b)
        if bias is not None:
            bias = self.slice_bias(bias)
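As a rough illustration of what slicing the bias for tensor parallelism involves, here is a sketch under the assumption that the bias is split along the output dimension, mirroring how lora_b is sharded; the function signature is hypothetical, not the PR's exact code:

import torch

def slice_bias(bias: torch.Tensor, tp_rank: int, shard_size: int) -> torch.Tensor:
    # Keep only this tensor-parallel rank's shard of the LoRA bias.
    start = tp_rank * shard_size
    return bias[start:start + shard_size]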
Hmm OK fair enough, I guess the typing errors are preexisting.
Suggesting a small change here to cover the case where bias is a tensor rather than a list (from a typing point of view, the lora_module could be a LoRALayerWeights rather than a PackedLoRALayerWeights ... not sure whether that will ever be the case in practice, but no harm in having the check here cover it). Also suggesting a small change to the comment.
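To illustrate the typing point, a hedged sketch of a check that covers both shapes of the field (the helper name and signature are hypothetical, not the PR's code):

import torch
from typing import Optional, Sequence, Union

def adapter_has_bias(
        bias: Union[torch.Tensor, Sequence[Optional[torch.Tensor]], None]) -> bool:
    # A PackedLoRALayerWeights-style module carries a list of per-sublayer biases,
    # while a plain LoRALayerWeights may carry a single tensor (or None).
    if bias is None:
        return False
    if isinstance(bias, (list, tuple)):
        return any(b is not None for b in bias)
    return True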
Thanks @followumesh!
@followumesh there are a few failures in the existing LoRA tests which look related.
@njhill All LoRA tests are successful now.
Thanks for completing this feature. I have two questions about this feature: …
Motivation
PEFT (https://github.com/foundation-model-stack/fms-hf-tuning) includes support for tuning LoRA bias. This PR enables bias for LoRA, so adapters with bias will work with vLLM.
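As an illustration of how such adapters can come about (one possible setup, not necessarily what fms-hf-tuning does internally), PEFT's LoraConfig can be asked to train and save bias terms alongside the LoRA matrices:

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],   # example target modules
    bias="lora_only",                      # also train/save bias for the adapted modules
)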
Changes Included