Grad accumulation and memory bugfix #220

Merged: 12 commits merged into main from grad-accu-memory-bugfix on Mar 16, 2023


edbeeching
Collaborator

  • Adds command line arg passing to sentiment and toxicity examples
  • Adds gradient accumulation as a command line arg (see "Add gradient accumulation" #218)
  • Fixes tensors being stored in stats without being detached from the graph, which caused excessive memory use.
  • Updates forward_batch_size -> minibatch size in a number of examples
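To make the gradient accumulation item concrete, here is a minimal sketch in plain Python rather than the code from this PR. The flag names `--mini_batch_size` and `--gradient_accumulation_steps`, and the helpers `grad_mse` and `train_step`, are assumptions made for illustration. It shows that accumulating scaled gradients over several mini-batches gives the same update as one pass over the full batch.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical flag names, chosen for illustration only.
    parser = argparse.ArgumentParser()
    parser.add_argument("--mini_batch_size", type=int, default=2)
    parser.add_argument("--gradient_accumulation_steps", type=int, default=1)
    return parser.parse_args(argv)


def grad_mse(w, batch):
    # Gradient of mean((w * x - y) ** 2) with respect to w over one mini-batch.
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def train_step(w, data, mini_batch_size, accumulation_steps, lr=0.01):
    # Accumulate scaled mini-batch gradients, then apply a single update.
    accumulated = 0.0
    for step in range(accumulation_steps):
        start = step * mini_batch_size
        batch = data[start:start + mini_batch_size]
        accumulated += grad_mse(w, batch) / accumulation_steps
    return w - lr * accumulated


args = parse_args(["--mini_batch_size", "2", "--gradient_accumulation_steps", "2"])
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_accumulated = train_step(
    0.0, data, args.mini_batch_size, args.gradient_accumulation_steps
)
# Reference: one update computed over the whole batch at once.
w_full_batch = 0.0 - 0.01 * grad_mse(0.0, data)
```

With equal-sized mini-batches, dividing each mini-batch gradient by the number of accumulation steps keeps the effective step size independent of how the batch is split.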

By the way, I think there may be other places we use excessive memory due to storing attached tensors for too long. I will investigate further.
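For context, a minimal PyTorch sketch of the pattern described above; it is not the code changed in this PR, and the model and list names are made up for illustration. A tensor stored with its `grad_fn` keeps a reference to the autograd graph that produced it, while a detached copy holds only the value.

```python
import torch

# Hypothetical toy model; stands in for whatever produces the logged values.
model = torch.nn.Linear(4, 1)

stats_attached = []  # problematic: each entry still references its graph
stats_detached = []  # safer: each entry is cut off from the graph

for _ in range(3):
    loss = model(torch.randn(8, 4)).pow(2).mean()
    stats_attached.append(loss)
    stats_detached.append(loss.detach())

# Attached entries carry a grad_fn, so the graph cannot be freed while
# the list is alive; detached entries do not.
attached_have_graph = all(t.grad_fn is not None for t in stats_attached)
detached_have_graph = any(t.grad_fn is not None for t in stats_detached)
```

For scalar statistics, storing `loss.item()` instead is another option, at the cost of a device-to-host synchronisation on each call.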

@edbeeching edbeeching requested a review from lvwerra March 14, 2023 12:32
@edbeeching
Collaborator Author

I forgot to run style / quality. I am not on my dev machine at the moment. I will run this in an hour.

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Mar 14, 2023

The documentation is not available anymore as the PR was closed or merged.

Member

@lvwerra lvwerra left a comment

Thanks @edbeeching, looks very clean to me. Just one small comment.

@edbeeching edbeeching force-pushed the grad-accu-memory-bugfix branch from 7a19275 to 89c02aa Compare March 15, 2023 10:33
Contributor

@younesbelkada younesbelkada left a comment

Thanks a lot for this!
I left a single comment; otherwise, I believe the fix proposed in #216 will fail.

edbeeching and others added 2 commits March 15, 2023 21:15
@edbeeching edbeeching merged commit 768c389 into main Mar 16, 2023
@edbeeching edbeeching deleted the grad-accu-memory-bugfix branch March 16, 2023 08:56
4 participants