
Make swa_lrs as required inside SWACallback #11822

Closed
rohitgr7 opened this issue Feb 9, 2022 · 1 comment · Fixed by #12556
Labels: callback: swa, feature
Milestone: 1.7
Comments

rohitgr7 (Contributor) commented on Feb 9, 2022

Proposed Enhancement

Currently when swa_lrs is not set here:
https://github.com/PyTorchLightning/pytorch-lightning/blob/e3820da28a0cd0982dd1c65d7da1a0e2180454c1/pytorch_lightning/callbacks/stochastic_weight_avg.py#L34-L38

we initialize it to the optimizer's learning rates here:
https://github.com/PyTorchLightning/pytorch-lightning/blob/e3820da28a0cd0982dd1c65d7da1a0e2180454c1/pytorch_lightning/callbacks/stochastic_weight_avg.py#L167-L168

but during the SWALR scheduler update the learning rates never actually change, because the annealing factor `alpha` cancels out: SWALR anneals each parameter group's learning rate as `swa_lr * (1 - alpha) + lr * alpha`, which is simply `lr` whenever `swa_lr == lr`.

https://github.com/pytorch/pytorch/blob/bf233aa049c4b479fd6cb19f9b8672bb2d42b0e2/torch/optim/swa_utils.py#L281-L286
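The cancellation is easy to reproduce with plain PyTorch. Here is a minimal sketch (not the Lightning callback code) that seeds `SWALR` with the optimizer's own learning rate, mirroring the current default, and shows that the learning rate never moves:

```python
import torch
from torch.optim.swa_utils import SWALR

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# swa_lr deliberately set equal to the optimizer's lr, as the current default does
swa_scheduler = SWALR(optimizer, swa_lr=0.1, anneal_epochs=5, anneal_strategy="cos")

for epoch in range(5):
    optimizer.step()
    swa_scheduler.step()
    # swa_lr * (1 - alpha) + lr * alpha == lr when swa_lr == lr, so this prints
    # 0.1 at every step regardless of the annealing progress.
    print(epoch, optimizer.param_groups[0]["lr"])
```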

Motivation

If we keep it as it is, it will lead to issues like this: #9453

Pitch

Make `swa_lrs` a required argument of the callback, since it is a required parameter in SWALR as well, and don't initialize it with any default:
https://github.com/pytorch/pytorch/blob/bf233aa049c4b479fd6cb19f9b8672bb2d42b0e2/torch/optim/swa_utils.py#L231
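For illustration, a rough, hypothetical sketch of what the pitched signature could look like (an illustrative stub, not the actual callback, and the real change in #12556 may differ): `swa_lrs` loses its default and is validated eagerly, mirroring SWALR where `swa_lr` is required.

```python
from typing import List, Union


class StochasticWeightAveraging:  # illustrative stub, not the real Lightning callback
    def __init__(
        self,
        swa_lrs: Union[float, List[float]],  # required: no default anymore
        swa_epoch_start: Union[int, float] = 0.8,
        annealing_epochs: int = 10,
        annealing_strategy: str = "cos",
    ) -> None:
        wrong_type = not isinstance(swa_lrs, (float, list))
        wrong_float = isinstance(swa_lrs, float) and swa_lrs <= 0
        wrong_list = isinstance(swa_lrs, list) and not all(
            isinstance(lr, float) and lr > 0 for lr in swa_lrs
        )
        if wrong_type or wrong_float or wrong_list:
            raise ValueError("swa_lrs must be a positive float or a list of positive floats.")
        self._swa_lrs = swa_lrs
```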


cc @Borda @justusschock @awaelchli @akihironitta @rohitgr7 @carmocca

rohitgr7 added the feature, refactor, and callback: swa labels on Feb 9, 2022
rohitgr7 added this to the future milestone on Feb 9, 2022
rohitgr7 removed the refactor label on Feb 9, 2022
rohitgr7 modified the milestones: future → 1.7 on Feb 23, 2022
felipemello1 commented

Hi @rohitgr7, I think the solution could be improved and simplified. As a user, I already have my own learning rate scheduler. I don't want to have to figure out what my learning rate will be at epoch * 0.8 and use that LR as the starting point. Instead, if a scheduler is already configured, just don't override it with the SWA scheduler.
