mplug-owl3 doing full model sft bug report #2158
Comments
Thank you for your reply!
Are you using the model from ModelScope or from Hugging Face?
My base model was downloaded from Hugging Face.
Could it be that I'm using the Hugging Face model but didn't set hf=1?
I'll try it again this afternoon. Notably, LoRA training works fine.
For the first issue, could you please pull the latest repository from Hugging Face and try again? I have added a function named
When I updated image_processing_mplugowl3.py from the latest Hugging Face repo and set USE_HF=1, both problems were solved! Thank you so much for your quick and valuable help!
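For anyone hitting the same problem later, here is a minimal sketch of the workaround described above. The model repository id and the training flags are illustrative assumptions, not taken from this thread; substitute your own values.

# Update the processor code from the latest Hugging Face repo
# (replace <model_repo> with the actual mPLUG-Owl3 repository id).
huggingface-cli download <model_repo> image_processing_mplugowl3.py --local-dir ./mplug-owl3-7b-chat
# Tell ms-swift to load the Hugging Face model and run full-parameter SFT.
USE_HF=1 swift sft \
  --model_type mplug-owl3-7b-chat \
  --sft_type full \
  --dataset <your_dataset>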
Describe the bug
There are two bugs, but they only seem to occur when doing full-parameter fine-tuning; with LoRA there appear to be no such issues.
1. If I train for 1 epoch with evaluation during training, the training step that follows a completed validation pass fails. The error message is as follows:
2. If I resume training from a checkpoint, i.e. I use the argument:
--resume_from_checkpoint output/mplug-owl3-7b-chat/v34-20240929-110829/checkpoint-2 \
The error occurs as follows:
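For context, a minimal sketch of how such a resume run might be launched. Only --resume_from_checkpoint is taken from this report; the other flags are illustrative assumptions.

USE_HF=1 swift sft \
  --model_type mplug-owl3-7b-chat \
  --sft_type full \
  --dataset <your_dataset> \
  --resume_from_checkpoint output/mplug-owl3-7b-chat/v34-20240929-110829/checkpoint-2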