GPU memory usage during training? #45
Comments
I remember that 24 GB of GPU memory might not be enough for training, and that 32 GB of GPU memory would be fine.
Hello, the config file uses a batch_size of 12, which you said is trainable on 32 GB. I reduced batch_size to 4 on a 24 GB 4090 and still got an out-of-memory error. May I ask what the reason is?
Me too.
Did you figure it out?
No, it still ran out of memory even at batch_size 1, haha. I gave up.
@lewandofskee Hoping you can reply (☆▽☆)
With 24 GB and the original configuration, memory is simply not enough. It's not a batch_size issue; the model has too many parameters. If you don't have a larger GPU and want to keep training on 24 GB, the only option is to change the optimizer: switching from AdamW to SGD avoids the out-of-memory error, but training will be much slower.
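To see why the optimizer matters more than batch_size here, a back-of-the-envelope estimate helps: AdamW keeps two extra fp32 state buffers per parameter (exp_avg and exp_avg_sq), while plain SGD without momentum keeps none. The sketch below is illustrative only; it assumes fp32 training, ignores activations and framework overhead, and the 1.5B parameter count is a made-up example, not this model's actual size.

```python
def optimizer_state_bytes(n_params: int, optimizer: str, dtype_bytes: int = 4) -> int:
    """Rough memory for weights + gradients + optimizer state, in bytes.

    Assumes fp32 (4 bytes) everywhere; activation memory is NOT included.
    """
    base = 2 * n_params * dtype_bytes  # weights + gradients
    # Extra per-parameter buffers kept by each optimizer:
    extra_buffers = {"adamw": 2, "sgd": 0, "sgd_momentum": 1}[optimizer]
    return base + extra_buffers * n_params * dtype_bytes


# Hypothetical 1.5B-parameter model (for illustration only):
n = 1_500_000_000
adamw_gib = optimizer_state_bytes(n, "adamw") / 1024**3
sgd_gib = optimizer_state_bytes(n, "sgd") / 1024**3
print(f"AdamW: {adamw_gib:.1f} GiB, SGD: {sgd_gib:.1f} GiB")
```

Under these assumptions, AdamW doubles the per-parameter footprint relative to plain SGD, which is why shrinking batch_size alone cannot fit a large model into 24 GB.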
Got it, thanks. I hadn't considered the optimizer at all. Thanks again!
Thanks for @zengyanjia's comment. Indeed, the issue is not the batch size but rather the large number of model parameters.
I'd like to ask: did you run into this problem before training started?
[screenshot]
From your screenshots so far, it's not a problem with the dataset; it's a problem with CLIP.
How long does training take after switching to SGD? After nearly 200 epochs the results were still poor, and after the full 1000 epochs SGD still hadn't produced a usable model; it was worse than AdamW after just 5 epochs.
Hello, with the default parameters, roughly how much GPU memory do training and inference of this model require?