🐛 Bug
It appears that the Trainer flag truncated_bptt_steps doesn’t affect the validation phase. Should it? The problem I’m running into is that I need truncated_bptt_steps to virtually increase the length of the sequence I can fit into my GPU memory, but this purpose is defeated when the validation step doesn’t also make use of truncated_bptt_steps. In that case my memory is limited by my validation step, which attempts to process the full sequence in memory.
Is there a reason why the validation step shouldn’t also make use of the truncated_bptt_steps flag? Or is this just not finished yet? I could probably manage to adapt the evaluation_loop code to copy the training_loop code regarding truncated_bptt_steps. Would this be an acceptable addition?
See also: #6483
Please reproduce using the BoringModel
https://colab.research.google.com/drive/1JYAVdMy-UPg0Rzwpo3T9A-FmwrOdi_aR?usp=sharing
To Reproduce
Expected behavior
I expected the validation loop to also split the batch along the time dimension according to truncated_bptt_steps.
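To make the expectation concrete, here is a rough sketch of the kind of splitting I have in mind, written as a manual workaround that could live inside validation_step. It is not how Lightning implements truncated_bptt_steps. It assumes the batch is laid out as (batch, time, features), that the model is a GRU followed by a linear head, and that the loss is a plain MSE; the function name chunked_validation_loss and its arguments are hypothetical.

```python
import torch
from torch import nn
import torch.nn.functional as F


@torch.no_grad()
def chunked_validation_loss(
    rnn: nn.GRU, head: nn.Linear, x: torch.Tensor, y: torch.Tensor, steps: int
) -> float:
    """Evaluate a long sequence in chunks of at most `steps` time steps.

    Assumes x is (batch, time, features), y is (batch, time, targets) and
    the GRU was built with batch_first=True. Hypothetical helper, not part
    of the Lightning API.
    """
    hiddens = None  # nn.GRU treats None as an all-zero initial hidden state
    total, count = 0.0, 0
    # torch.split yields chunks of size `steps` along dim=1; the last chunk
    # may be shorter when the sequence length is not a multiple of `steps`.
    for x_chunk, y_chunk in zip(
        torch.split(x, steps, dim=1), torch.split(y, steps, dim=1)
    ):
        # Carry the hidden state across chunks so the result matches a
        # single pass over the full sequence.
        out, hiddens = rnn(x_chunk, hiddens)
        pred = head(out)
        # Accumulate a summed loss so only one chunk's activations are
        # held in memory at a time.
        total += F.mse_loss(pred, y_chunk, reduction="sum").item()
        count += y_chunk.numel()
    return total / count
```

Because the hidden state is passed from one chunk to the next, the returned value should agree with the full-sequence mean loss up to floating-point error, while peak activation memory scales with steps rather than with the full sequence length.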