Error during validation in the seventh epoch when training on DTU data #31

Closed
zhoouzhe opened this issue Apr 25, 2024 · 0 comments

Comments

@zhoouzhe

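Like the authors, I am training on two GPUs, but both runs failed during validation in the seventh epoch with the following error:

OMP: Error #100: Fatal system error detected.
OMP: System error #22: Invalid argument

The problem went away after I adjusted the validation frequency (dividing it by 3). My guess is that the validate function does not do anything like clearing the cached gradients. I would appreciate it if the authors could clarify.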
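To make the guess concrete, here is a minimal sketch of a validation loop that avoids retaining gradients and releases cached GPU memory afterwards. It assumes the project trains with PyTorch; the `validate` signature, the `batches` argument, and the toy model are hypothetical and are not taken from the repository. Whether this is related to the OMP error above is not confirmed.

```python
import torch
from torch import nn


def validate(model, batches, loss_fn):
    """Hypothetical validation loop; not the repository's actual `validate`."""
    was_training = model.training
    model.eval()  # switch layers such as dropout and BatchNorm to inference mode
    total, count = 0.0, 0
    with torch.no_grad():  # no autograd graph is built inside this block
        for inputs, targets in batches:
            loss = loss_fn(model(inputs), targets)
            total += loss.item()  # store a Python float instead of a tensor
            count += 1
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # release cached, currently unused GPU memory
    model.train(was_training)  # restore the mode the model was in before
    return total / max(count, 1)


# Toy usage on CPU with random data (assumed shapes, for illustration only).
model = nn.Linear(4, 1)
batches = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(3)]
val_loss = validate(model, batches, nn.MSELoss())
```

Accumulating `loss.item()` rather than the loss tensor itself keeps references to intermediate tensors from piling up across validation steps.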

@zhoouzhe zhoouzhe closed this as completed May 8, 2024