
[LoRA] Adds support for bias in LoRA #5733

Merged: 48 commits into vllm-project:main on Nov 12, 2024

Conversation

followumesh (Contributor)

Motivation
PEFT, as used in https://github.com/foundation-model-stack/fms-hf-tuning, includes support for tuning a LoRA bias. This PR enables bias for LoRA, so adapters with bias will work with vLLM.

Changes Included

  • LoRA bias support for different types of modules.
  • LoRA bias support for fully sharded LoRA.
  • Test file test-lora-bias.py

Yard1 (Collaborator) left a comment

Could we add an engine argument, enable_lora_bias, and avoid initializing the bias tensors when it's false? If the user knows none of their LoRAs will have bias, we can save memory.

followumesh (Contributor, Author)

@Yard1 Thanks for reviewing the PR. I have added the enable_lora_bias flag (defaulting to false), which prevents the allocation of LoRA bias tensors when it is false.
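For illustration, a minimal sketch of how an adapter that carries a bias might be loaded once the flag is enabled. This is not code from the PR; the model name and adapter path are placeholders, and it assumes the engine flag is also accepted by the offline LLM constructor as enable_lora_bias (mirroring the --enable-lora-bias server flag mentioned later in the thread):

    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    # enable_lora_bias is the flag added in this PR; when it is off, no LoRA
    # bias tensors are allocated and bias-carrying adapters are rejected.
    llm = LLM(
        model="meta-llama/Llama-2-7b-hf",  # placeholder base model
        enable_lora=True,
        enable_lora_bias=True,
    )

    outputs = llm.generate(
        ["What does tensor parallelism do?"],
        SamplingParams(max_tokens=64),
        lora_request=LoRARequest("bias-adapter", 1, "/path/to/lora-with-bias"),  # placeholder path
    )
    print(outputs[0].outputs[0].text)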

njhill (Member) commented Jun 27, 2024

Related: #5930

Yard1 (Collaborator) left a comment

Looks good, can we also add an e2e test?

DarkLight1337 (Member)

To speed up the CI queue for #5905, I've cancelled the distributed tests for the latest CI run in this PR since they won't pass anyway until #5905 has been merged. Please merge main into your branch after that happens so that the CI can pass once again.

followumesh (Contributor, Author)

@Yard1 Thanks for reviewing. I've added an e2e test for the lora_bias support.
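The actual test file added in the PR is not reproduced in this thread. As a rough idea of the shape of such an end-to-end check, here is a hypothetical sketch, again assuming the offline LLM constructor accepts enable_lora_bias and using placeholder model and adapter paths:

    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    LORA_PATH = "/path/to/lora-with-bias"  # hypothetical adapter trained with a bias term


    def test_lora_bias_enabled():
        llm = LLM(model="meta-llama/Llama-2-7b-hf",  # placeholder base model
                  enable_lora=True,
                  enable_lora_bias=True)
        out = llm.generate(["Hello"],
                           SamplingParams(max_tokens=16),
                           lora_request=LoRARequest("bias-adapter", 1, LORA_PATH))
        # The bias-carrying adapter should load and produce some text.
        assert out[0].outputs[0].text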

njhill (Member) commented Jul 29, 2024

@followumesh you need to run ./format.sh to fix the linting errors

followumesh (Contributor, Author)

@njhill I have addressed your comments above. Can you please review this again? Thanks

njhill (Member) left a comment

Thanks @followumesh and sorry for the delay.

There's one remaining small but important thing to fix (and tests are failing because of it).

Comment on lines 396 to 400:

    if not self.lora_config.bias_enabled:
        module_lora.bias = None
        raise ValueError(
            f"Adapter bias cannot be used for {module_name}"
            " without --enable-lora-bias.")

This doesn't look right and is causing blanket lora failures. I think it should be:

Suggested change:

    if module_lora.bias is not None and not self.lora_config.bias_enabled:
        raise ValueError(
            f"Adapter bias cannot be used for {module_name}"
            " without --enable-lora-bias.")

followumesh (Contributor, Author)

Incorporated the comment.

    ):
        self.reset_lora(index)

        if self.tp_size > 1:
            lora_a = self.slice_lora_a(lora_a)
            lora_b = self.slice_lora_b(lora_b)
            if bias is not None:
                bias = self.slice_bias(bias)
Member
Hmm OK fair enough, I guess the typing errors are preexisting.
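For context on the hunk quoted above: slice_bias mirrors slice_lora_a and slice_lora_b, in that each tensor-parallel rank keeps only its shard of the adapter's bias. A small illustrative example of that idea for a column-parallel layer (hypothetical standalone function, not the PR's implementation):

    import torch

    def slice_bias(bias: torch.Tensor, tp_rank: int, tp_size: int) -> torch.Tensor:
        # The output dimension of a column-parallel layer is split across ranks,
        # so each rank keeps the matching contiguous slice of the bias vector.
        shard_size = bias.shape[-1] // tp_size
        start = tp_rank * shard_size
        return bias[..., start:start + shard_size]

    # With tp_size=2, rank 0 gets the first half and rank 1 the second half.
    full_bias = torch.arange(8, dtype=torch.float32)
    print(slice_bias(full_bias, tp_rank=1, tp_size=2))  # tensor([4., 5., 6., 7.])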

njhill (Member) left a comment

Suggesting a small change here to cover the case where bias is a tensor rather than a list (from a typing point of view the lora_module could be a LoRALayerWeights rather than a PackedLoRALayerWeights ... not sure whether that will ever be the case in practice, but there's no harm in having the check here cover it). A sketch of that idea follows.

Also suggesting a small change to the comment.
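A minimal sketch of the kind of check being discussed, covering both a PackedLoRALayerWeights-style list of per-sub-module biases and a plain tensor from a LoRALayerWeights. The standalone function and its name are illustrative only, not the PR's actual code:

    from typing import List, Optional, Union

    import torch

    def adapter_has_bias(
            bias: Optional[Union[torch.Tensor, List[Optional[torch.Tensor]]]]) -> bool:
        # Packed case: one optional bias per sub-module (e.g. q/k/v projections).
        if isinstance(bias, list):
            return any(b is not None for b in bias)
        # Plain case: a single tensor, or None when the adapter has no bias.
        return bias is not None

    assert adapter_has_bias(torch.zeros(4))
    assert adapter_has_bias([None, torch.zeros(4)])
    assert not adapter_has_bias([None, None])
    assert not adapter_has_bias(None)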

Umesh Deshpande added 2 commits November 11, 2024 15:08
njhill (Member) left a comment

Thanks @followumesh!

njhill added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Nov 12, 2024
njhill (Member) commented Nov 12, 2024

@followumesh there are a few failures in the existing LoRA tests which look related.

Umesh Deshpande added 2 commits November 11, 2024 22:02
followumesh (Contributor, Author)

@njhill All LoRA tests are successful now.

njhill merged commit 8a06428 into vllm-project:main on Nov 12, 2024
55 checks passed
dsikka pushed a commit to neuralmagic/vllm that referenced this pull request Nov 13, 2024
jeejeelee (Collaborator) commented Nov 13, 2024

Thanks for completing this feature. I have two questions about it:

  • Is this feature compatible with PEFT?
  • Have you done any benchmarking? Adding --enable-lora-bias seems to inevitably impact performance.

rickyyx pushed a commit to rickyyx/vllm that referenced this pull request Nov 13, 2024
sumitd2 pushed a commit to sumitd2/vllm that referenced this pull request Nov 14, 2024
KuntaiDu pushed a commit to KuntaiDu/vllm that referenced this pull request Nov 20, 2024
mfournioux pushed a commit to mfournioux/vllm that referenced this pull request Nov 20, 2024
tlrmchlsmth pushed a commit to neuralmagic/vllm that referenced this pull request Nov 23, 2024
sleepwalker2017 pushed a commit to sleepwalker2017/vllm that referenced this pull request Dec 13, 2024