Tensorboard fails to log found lr when auto_lr_find is enabled. #3219

Closed
LeeJZh opened this issue Aug 27, 2020 · 4 comments
Labels
docs Documentation related help wanted Open to be worked on
Milestone

Comments

@LeeJZh
Contributor

LeeJZh commented Aug 27, 2020

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

  1. Run a job with --auto_lr_find enabled.
  2. Note the found lr.
  3. Open the TensorBoard hparams tab.
  4. Note the logged lr.
  5. The two values differ: the logged lr is the manually assigned default lr, not the found one.

Code sample

Expected behavior

The logged lr should be the found lr.

Environment

Please copy and paste the output from our
environment collection script
(or fill out the checklist below manually).

You can get the script and run it with:

wget https://raw.githubusercontent.com/PyTorchLightning/pytorch-lightning/master/tests/collect_env_details.py
# For security purposes, please check the contents of collect_env_details.py before running it.
python collect_env_details.py
  • PyTorch Version (e.g., 1.0): 1.6.0
  • OS (e.g., Linux): ubuntu 16.04
  • How you installed PyTorch (conda, pip, source): pip
  • Build command you used (if compiling from source):
  • Python version: 3.7.7
  • CUDA/cuDNN version: 10.2
  • GPU models and configuration: V100
  • Any other relevant information:

Additional context

@LeeJZh LeeJZh added bug Something isn't working help wanted Open to be worked on labels Aug 27, 2020
@awaelchli
Contributor

Hi, thanks for reporting this. I noticed you did not specify the PL version. Could you check that you are on 0.9, because we recently fixed a bug regarding the lr_find setting the learning rate attribute on hparams. #2821

@awaelchli awaelchli self-assigned this Aug 27, 2020
@LeeJZh
Contributor Author

LeeJZh commented Aug 28, 2020

> Hi, thanks for reporting this. I noticed you did not specify the PL version. Could you check that you are on 0.9, because we recently fixed a bug regarding the lr_find setting the learning rate attribute on hparams. #2821

Yes, I am on PL version 0.9.

@ddrevicky
Contributor

I've looked at this and this actually has nothing to do with TensorBoard or any other logger. PR #3293 added a tune() method which extracted the learning rate finder out of fit(). Looking at it, it seems that William intended it to be used separately from fit().

Anyway, the learning rate finder is not called in fit() at all, so the learning rate the user sets is the one that is used (and logged by TensorBoard). The learning rate finder therefore no longer works through fit() alone, and neither does auto_scale_batch_size, since it was also moved into the tune() method.
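
To make the control flow concrete, here is a toy sketch in plain Python. It is not Lightning code: ToyModel, ToyTrainer, the fixed "found" value 0.02 and the default 0.001 are all made up for illustration. It only mimics the split described above, where tune() runs the finder and fit() just trains and logs whatever lr the model currently holds.

```python
class ToyModel:
    """Hypothetical stand-in for a model that stores its learning rate."""

    def __init__(self, lr):
        self.lr = lr


class ToyTrainer:
    """Hypothetical stand-in for a trainer; NOT the Lightning Trainer."""

    def __init__(self, auto_lr_find=False):
        self.auto_lr_find = auto_lr_find
        self.logged_hparams = {}

    def _find_lr(self, model):
        # Placeholder for a real lr search; returns a fixed, made-up value.
        return 0.02

    def tune(self, model):
        # The finder only runs here, and only when the flag is set.
        if self.auto_lr_find:
            model.lr = self._find_lr(model)

    def fit(self, model):
        # fit() never calls the finder; it logs the lr the model has right now.
        self.logged_hparams = {"lr": model.lr}


def logged_lr(call_tune):
    """Return the lr that ends up in the logged hparams."""
    model = ToyModel(lr=0.001)  # manually assigned default
    trainer = ToyTrainer(auto_lr_find=True)
    if call_tune:
        trainer.tune(model)
    trainer.fit(model)
    return trainer.logged_hparams["lr"]


print(logged_lr(call_tune=False))  # default lr is logged
print(logged_lr(call_tune=True))   # found lr is logged
```

With the flag set but tune() skipped, the logged value stays at the default, which matches what the issue reports.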

@edenlightning edenlightning added docs Documentation related and removed bug Something isn't working good first issue Good for newcomers Hacktoberfest labels Oct 2, 2020
@edenlightning edenlightning modified the milestones: 0.9.x, 1.0 Oct 4, 2020
@vedal

vedal commented Apr 16, 2021

In version 1.2.8, I had to both set Trainer(..., auto_lr_find=True) and explicitly call trainer.tune(model, datamodule=datamodule) before calling trainer.fit() to make this work.
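
A minimal sketch of that call order, assuming a 1.2.x-era pytorch_lightning install and a LightningModule that exposes its learning rate as `self.learning_rate` or `self.lr` (the attribute the finder overwrites). The function name `fit_with_lr_finder` and `max_epochs=1` are illustrative choices, not part of the thread, and other Lightning versions may expose the finder differently.

```python
def fit_with_lr_finder(model, datamodule):
    """Run the lr finder via tune(), then train with the found lr.

    `model` is assumed to be a LightningModule with a `learning_rate`
    (or `lr`) attribute; `datamodule` is assumed to be a LightningDataModule.
    """
    # Imported lazily so this sketch can be loaded without Lightning installed.
    import pytorch_lightning as pl

    # The flag alone is not enough: it only tells tune() what to do.
    trainer = pl.Trainer(auto_lr_find=True, max_epochs=1)

    # tune() runs the finder and writes the suggestion back onto the model.
    trainer.tune(model, datamodule=datamodule)

    # fit() then trains with, and logs, the updated learning rate.
    trainer.fit(model, datamodule=datamodule)

    # Return whichever attribute the model uses, for inspection.
    return getattr(model, "learning_rate", getattr(model, "lr", None))
```

Comparing the returned value with the one in the TensorBoard hparams tab is a quick way to confirm the found lr was actually logged.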
