Update examples to show how to deal with extra validation copies #319
Conversation
The documentation is not available anymore as the PR was closed or merged.
Note: This is just an initial example to make sure the format and whatnot looks right, and then all the other examples will follow suit :)
Thanks for the PR! I'd just put this in a specific feature example instead of the base one.
Thanks! I'm not sure what your question about tests is. To test this, we would need to know the exact value of the metric in advance and check that we get it again, but that's very finicky since the metric computed in the base script is roughly the same anyway.
Update examples to show how to truncate the validation set for metrics
What does this add?
Based on this issue, this PR updates all examples to show how to drop the extra samples that get added to the validation set during distributed evaluation (the distributed sampler pads the dataset so every process receives the same number of batches); see the sketch below.
Testing on a multi-GPU system will happen tomorrow, but @sgugger, I'm fairly sure the way I have it set up ensures this only runs in distributed setups, which is where the problem arises.
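For reference, here is a minimal sketch of the truncation pattern, assuming an Accelerate-style evaluation loop; `model`, `eval_dataloader`, and `metric` are illustrative placeholders for objects defined elsewhere in the example scripts:

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()
# Assumed to be prepared earlier via accelerator.prepare(...):
#   model, eval_dataloader, and a `metric` object with an add_batch API.

samples_seen = 0
model.eval()
for step, batch in enumerate(eval_dataloader):
    with torch.no_grad():
        outputs = model(**batch)
    predictions = outputs.logits.argmax(dim=-1)
    # Gather predictions and references from every process.
    predictions, references = accelerator.gather((predictions, batch["labels"]))
    # Only distributed runs pad the dataset, so only then do we need to
    # truncate the duplicated samples in the final batch.
    if accelerator.num_processes > 1:
        if step == len(eval_dataloader) - 1:
            predictions = predictions[: len(eval_dataloader.dataset) - samples_seen]
            references = references[: len(eval_dataloader.dataset) - samples_seen]
        else:
            samples_seen += references.shape[0]
    metric.add_batch(predictions=predictions, references=references)
```

The `num_processes > 1` guard keeps single-GPU runs untouched, since the padding only happens when the sampler has to split the dataset evenly across processes.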
Who is it for?
Should close #287
Why is it needed?
It's unclear from the scripts how to avoid this behavior, and it isn't documented anywhere. With this PR, it now is.