Lora seems to be invalid when using vsft_llava.py #1786
Comments
cc @qgallouedec |
Thanks for reporting @shijian2001. I've encountered this error too. I will provide a fix asap. Feel free to open a PR if you manage to fix it. |
@qgallouedec Sorry, I haven't located the specific bug yet. |
@shijian2001 can you double-check your command? When running it I get another error:
python examples/scripts/vsft_llava.py \
--dataset_name="HuggingFaceH4/llava-instruct-mix-vsft" \
--model_name_or_path="llava-hf/llava-1.5-7b-hf" \
--per_device_train_batch_size=8 \
--gradient_accumulation_steps=1 \
--output_dir="../logs/checkpoints/aug-vsft-llava-1.5-7b-hf" \
--gradient_checkpointing \
--remove_unused_columns=False \
--torch_dtype=float16 \
--fp16=True \
--use_peft=True \
--lora_r=64 \
--lora_alpha=16 \
--lora_target_modules="all-linear"
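For reference, the --use_peft / --lora_* flags above are roughly equivalent to building a peft LoraConfig like the one below. This is only a sketch of what the script derives from those arguments, not its exact code; the "all-linear" target requires a recent peft release.

```python
from peft import LoraConfig

# Rough equivalent of --use_peft --lora_r=64 --lora_alpha=16 --lora_target_modules="all-linear".
# "all-linear" tells peft to attach LoRA adapters to every linear layer of the model.
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
```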
Related: #1785 (comment). Removing --fp16=True:
python examples/scripts/vsft_llava.py \
--dataset_name="HuggingFaceH4/llava-instruct-mix-vsft" \
--model_name_or_path="llava-hf/llava-1.5-7b-hf" \
--per_device_train_batch_size=8 \
--gradient_accumulation_steps=1 \
--output_dir="../logs/checkpoints/aug-vsft-llava-1.5-7b-hf" \
--gradient_checkpointing \
--remove_unused_columns=False \
--torch_dtype=float16 \
--use_peft=True \
--lora_r=64 \
--lora_alpha=16 \
--lora_target_modules="all-linear"
It requires around 48 GB of VRAM. If you get an OOM error, try reducing the batch size. |
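For context on the dropped flag: --torch_dtype=float16 already loads the model weights in half precision, so additionally enabling --fp16 mixed-precision training can fail when the Trainer tries to unscale fp16 gradients, which may be what the related issue hit. A minimal sketch of loading the model directly in fp16 with the standard transformers API:

```python
import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration

# Load the LLaVA checkpoint with its weights directly in half precision
# (the effect of --torch_dtype=float16). With the parameters already in fp16,
# there is no need to also turn on --fp16 autocast/grad-scaling during training.
model = LlavaForConditionalGeneration.from_pretrained(
    "llava-hf/llava-1.5-7b-hf",
    torch_dtype=torch.float16,
)
processor = AutoProcessor.from_pretrained("llava-hf/llava-1.5-7b-hf")
```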
@qgallouedec Thank you! However, when I followed your command and tried to set |
I used two A100 40GB GPUs to fine-tune llava-7b with LoRA. When I ran the LoRA vsft command you provided, I still got a CUDA out of memory error, so it seems that LoRA did not take effect.
My command is as follows; I have only modified the dataset path:
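One way to confirm whether LoRA is actually attached is to wrap the model with peft and print the trainable-parameter counts; with LoRA active, only a small fraction of parameters should be trainable. The sketch below reuses the model name and LoRA settings from the commands above and is illustrative only, not the script's internals:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import LlavaForConditionalGeneration

model = LlavaForConditionalGeneration.from_pretrained(
    "llava-hf/llava-1.5-7b-hf",
    torch_dtype=torch.float16,
)

# Same settings as the command line above; "all-linear" targets every linear layer.
peft_config = LoraConfig(r=64, lora_alpha=16, target_modules="all-linear")
model = get_peft_model(model, peft_config)

# With LoRA applied correctly, the trainable parameters should be a small
# percentage of the total; the frozen base weights dominate the printed count.
model.print_trainable_parameters()
```

Note that even when LoRA is working, the frozen base weights and the activations still have to fit in memory, so an OOM on a 40 GB card does not by itself mean the adapters were not applied.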