Auto grad accum example #742

muellerzr · 2022-10-05T14:24:37Z

Showcases how to perform automatic gradient accumulation, similar to a popular feature that MosaicML touted a while back.

The initial batch size starts at 256 to guarantee a CUDA OOM, so the user can observe the print statement showing the new number of steps and batch size.
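
For readers who want the idea without opening the example file, here is a rough, library-free sketch of the retry logic described above. It is not the code from this PR: `SimulatedOOM`, `retry_with_smaller_batch`, `MAX_BATCH_THAT_FITS`, and the halving strategy are all assumptions made for illustration, and the OOM is faked with a plain exception rather than a real CUDA error.

```python
import functools


class SimulatedOOM(RuntimeError):
    """Hypothetical stand-in for a CUDA out-of-memory error."""


def retry_with_smaller_batch(starting_batch_size):
    """Call the wrapped function, halving batch_size after each simulated OOM.

    Halving is an assumption of this sketch, not a statement about any library.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            batch_size = starting_batch_size
            while batch_size > 0:
                try:
                    return fn(batch_size, *args, **kwargs)
                except SimulatedOOM:
                    batch_size //= 2  # shrink and try again
            raise RuntimeError("No executable batch size found.")
        return wrapper
    return decorator


OBSERVED_BATCH_SIZE = 256  # the batch size the user asked for
MAX_BATCH_THAT_FITS = 48   # made-up memory limit for this sketch


@retry_with_smaller_batch(starting_batch_size=OBSERVED_BATCH_SIZE)
def train(batch_size):
    # Pretend anything above the limit runs out of memory.
    if batch_size > MAX_BATCH_THAT_FITS:
        raise SimulatedOOM(f"batch size {batch_size} does not fit")
    # Accumulate enough steps to keep the effective batch size unchanged.
    accumulation_steps = OBSERVED_BATCH_SIZE // batch_size
    print(f"batch size: {batch_size}, accumulation steps: {accumulation_steps}")
    return batch_size, accumulation_steps


result = train()
```

With these made-up numbers, the attempts at 256, 128, and 64 fail, and the run settles on a batch size of 32 with 8 accumulation steps, so the effective batch size stays at 256.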

HuggingFaceDocBuilderDev · 2022-10-05T14:28:06Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

sgugger

LGTM, thanks a lot!

examples/by_feature/automatic_gradient_accumulation.py

Co-authored-by: Sylvain Gugger <[email protected]>

Auto grad accum example

d602078

muellerzr added the enhancement New feature or request label Oct 5, 2022

muellerzr requested a review from sgugger October 5, 2022 14:24

Include auto grad accum in exclusion list

5894872

sgugger approved these changes Oct 5, 2022

examples/by_feature/automatic_gradient_accumulation.py

Typo fix calculate -> calculate

84265c7

Co-authored-by: Sylvain Gugger <[email protected]>

muellerzr merged commit 5fff81b into main Oct 5, 2022

muellerzr deleted the auto-grad-accum-v2 branch October 5, 2022 15:42
