Single-node multi-GPU training on V100: all processes end up on the same GPU #2010
Comments
@yuekaizhang Have you run into this?
For reference, the git commit is 9804821.
@dahu1 Hi, is there a fix for this problem? I ran into the same situation today.
In
@ziyu123 Thanks! It works correctly on my side now too.
Thanks.
Alternatively, you can pull the latest code and run parallel training with torchrun; see https://github.com/wenet-e2e/wenet/pull/2020.
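For readers unfamiliar with torchrun: it exports a `LOCAL_RANK` environment variable to each worker it spawns, and a training script typically uses that value to pin the worker to its own GPU. If no worker selects a device, they can all default to GPU 0, which may resemble the symptom reported here. The sketch below only illustrates that general pattern; it is not the code from the linked PR, and the helper name `local_device_index` and the script name `train_sketch.py` are hypothetical.

```python
import os


def local_device_index(env=None, num_visible_gpus=None):
    """Return the GPU index this worker should pin itself to.

    Hypothetical helper for illustration; not part of wenet.
    """
    env = os.environ if env is None else env
    # torchrun sets LOCAL_RANK per worker; fall back to 0 for a plain
    # single-process run where the variable is absent.
    local_rank = int(env.get("LOCAL_RANK", "0"))
    if num_visible_gpus is not None and local_rank >= num_visible_gpus:
        raise ValueError(
            f"LOCAL_RANK={local_rank} but only {num_visible_gpus} GPU(s) visible"
        )
    return local_rank


if __name__ == "__main__":
    idx = local_device_index()
    try:
        import torch
    except ImportError:  # keep the sketch runnable without PyTorch installed
        torch = None
    if torch is not None and torch.cuda.is_available():
        # Pin this process to its own GPU before creating any CUDA tensors.
        torch.cuda.set_device(idx)
    print(f"LOCAL_RANK={idx} -> cuda:{idx}")
```

Assuming a machine with 4 GPUs, such a script could be launched with `torchrun --standalone --nproc_per_node=4 train_sketch.py`; each worker should then report a different index.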
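When training on multiple V100 GPUs, I found that all processes are assigned to the same GPU. How should I handle this? I previously trained on 2080 and 3090 cards and never ran into this.
Machine info:
Image: nvidia/cuda:11.7.1-devel-ubuntu20.04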
python3.8
torch==1.13.0+cu117
torchaudio==0.13.0