How does GradientAccumulationScheduler behave if the number of steps in an epoch is not evenly divisible by the accumulation steps? #19605
Unanswered

jeffwillette asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule

I have been trying to answer this question by looking through the Trainer implementation (https://lightning.ai/docs/pytorch/stable/_modules/lightning/pytorch/trainer/trainer.html#Trainer.__init__), but I have not found the answer yet. Specifically, say that...

I think there are two ways this could be handled, possibly: (1) the trainer forces an optimizer step at the end of the epoch with whatever gradients have been accumulated so far, or (2) the accumulation counter is not reset, so the leftover batches carry over into the next epoch's accumulation window.

I would assume that option 1 would be the best choice, but I have not been able to verify that this is the behavior. Does anyone know where to verify this?
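One way to settle this is empirically rather than by reading the loop code. Below is a minimal sketch (not from the thread; the StepCounter module and the 10-sample toy dataset are invented for illustration, while GradientAccumulationScheduler and the on_before_optimizer_step hook are standard Lightning API) that trains with 5 batches per epoch and an accumulation factor of 4, and counts actual optimizer updates per epoch:

```python
# Minimal sketch: count real optimizer steps per epoch when batches-per-epoch
# (5) is not divisible by the accumulation factor (4).
import torch
from torch.utils.data import DataLoader, TensorDataset
import lightning.pytorch as pl
from lightning.pytorch.callbacks import GradientAccumulationScheduler


class StepCounter(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 1)
        self.steps_this_epoch = 0

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def on_before_optimizer_step(self, optimizer):
        # Fires once per *actual* optimizer update, not once per micro-batch.
        self.steps_this_epoch += 1

    def on_train_epoch_end(self):
        print(f"epoch {self.current_epoch}: {self.steps_this_epoch} optimizer steps")
        self.steps_this_epoch = 0

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


# 10 samples with batch_size=2 -> 5 batches per epoch; accumulate 4 from epoch 0.
dataset = TensorDataset(torch.randn(10, 4), torch.randn(10, 1))
loader = DataLoader(dataset, batch_size=2)
trainer = pl.Trainer(
    max_epochs=4,
    callbacks=[GradientAccumulationScheduler(scheduling={0: 4})],
    logger=False,
    enable_checkpointing=False,
    enable_progress_bar=False,
)
trainer.fit(StepCounter(), loader)
```

With 5 batches per epoch, behavior (1) would print 2 steps every epoch (8 in total over four epochs), while behavior (2) would print counts that drift across epoch boundaries (1, 1, 1, 2, for 5 in total), so the logged counts distinguish the two.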
Replies: 1 comment · 1 reply

-
My particular task only has 3-5 steps in each epoch, and I set the batch size to 2 and the accumulation steps to 4. So far, the performance seems correct and sometimes better than with no accumulation, so I think we can assume the accumulation step counter is not reset between epochs.
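For reference, here is a small back-of-the-envelope sketch of the arithmetic behind that observation (the helper names and the 5-batches-per-epoch figure are hypothetical): each candidate behavior predicts a different total number of optimizer steps, so counts logged by a run like the one above can be checked against either column.

```python
# Hypothetical arithmetic: predicted optimizer-step counts under each behavior,
# for 5 batches/epoch and an accumulation factor of 4.
def steps_if_flushed_per_epoch(batches_per_epoch: int, accum: int, epochs: int) -> int:
    # Behavior 1: any partial window is flushed at each epoch boundary.
    return -(-batches_per_epoch // accum) * epochs  # ceil-divide per epoch

def steps_if_counter_carries(batches_per_epoch: int, accum: int, epochs: int) -> int:
    # Behavior 2: the counter is never reset, so windows span epoch boundaries.
    return (batches_per_epoch * epochs) // accum

for epochs in (1, 2, 4):
    print(epochs,
          steps_if_flushed_per_epoch(5, 4, epochs),   # 2, 4, 8
          steps_if_counter_carries(5, 4, epochs))     # 1, 2, 5
```

If the observed totals match steps_if_counter_carries, the counter carries across epochs, which is consistent with this reply.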