auto_scale_batch_size fails with datamodule in pl==1.2.*, succeeds in pl==1.1.8 #6335

Closed
colllin opened this issue Mar 3, 2021 · 2 comments
Labels
bug · help wanted · priority: 1 (medium) · tuner

Comments

colllin commented Mar 3, 2021

🐛 Bug

Running `trainer = pl.Trainer(auto_scale_batch_size=True); trainer.tune(model, datamodule=dm)` succeeds in pl==1.1.8, but fails in pl==1.2.* (tested both 1.2.0 and 1.2.1) with the following error:

```
Traceback (most recent call last):
  File "train.py", line 42, in <module>
    trainer.tune(model, datamodule=dm)
  File "/home/ubuntu/.local/share/virtualenvs/__project__/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1062, in tune
    self.tuner.tune(model, train_dataloader, val_dataloaders, datamodule)
  File "/home/ubuntu/.local/share/virtualenvs/__project__/lib/python3.8/site-packages/pytorch_lightning/tuner/tuning.py", line 46, in tune
    self.scale_batch_size(
  File "/home/ubuntu/.local/share/virtualenvs/__project__/lib/python3.8/site-packages/pytorch_lightning/tuner/tuning.py", line 104, in scale_batch_size
    return scale_batch_size(
  File "/home/ubuntu/.local/share/virtualenvs/__project__/lib/python3.8/site-packages/pytorch_lightning/tuner/batch_size_scaling.py", line 79, in scale_batch_size
    raise MisconfigurationException(f'Field {batch_arg_name} not found in both `model` and `model.hparams`')
pytorch_lightning.utilities.exceptions.MisconfigurationException: Field batch_size not found in both `model` and `model.hparams`
```

Please reproduce using the BoringModel

https://colab.research.google.com/drive/1vgPLCwLg7uACtb3fxVp-t-__NtZ3onsD?usp=sharing
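
For reference (in case the Colab link goes stale), here is a minimal sketch of the setup, assuming a BoringModel-style module and a DataModule that owns `batch_size`; the names are illustrative, not the exact notebook code:

```python
# Minimal sketch (illustrative, not the exact Colab code): batch_size lives only on
# the DataModule, not on the model or in model.hparams.
import torch
from torch.utils.data import DataLoader, Dataset
import pytorch_lightning as pl


class RandomDataset(Dataset):
    def __init__(self, size=32, length=64):
        self.data = torch.randn(length, size)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]


class BoringDataModule(pl.LightningDataModule):
    def __init__(self, batch_size=2):
        super().__init__()
        self.batch_size = batch_size  # the field the tuner should detect and scale

    def train_dataloader(self):
        return DataLoader(RandomDataset(), batch_size=self.batch_size)


class BoringModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        return self.layer(batch).sum()

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


model = BoringModel()
dm = BoringDataModule()
trainer = pl.Trainer(auto_scale_batch_size=True)
# Works on pl==1.1.8; raises the MisconfigurationException above on 1.2.0/1.2.1
trainer.tune(model, datamodule=dm)
```

The key point is that `batch_size` is defined only on the DataModule, which is what the 1.2.* tuner no longer finds.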

Expected behavior

Prior to pl==1.2.0, the tuner successfully detected and tuned the `dm.batch_size` property.
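
For comparison, with the sketch above on pl==1.1.8 the tune call completes and updates the DataModule's attribute in place (a rough illustration of the expected outcome, not verified output):

```python
trainer = pl.Trainer(auto_scale_batch_size=True)
trainer.tune(model, datamodule=dm)
print(dm.batch_size)  # overwritten by the batch size scaler instead of raising
```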

Environment

  • CUDA:
    • GPU:
      • Tesla T4
    • available: True
    • version: 10.1
  • Packages:
    • numpy: 1.19.5
    • pyTorch_debug: False
    • pyTorch_version: 1.7.1+cu101
    • pytorch-lightning: 1.2.1
    • tqdm: 4.41.1
  • System:
    • OS: Linux
    • architecture:
      • 64bit
    • processor: x86_64
    • python: 3.7.10
      • version: #1 SMP Thu Jul 23 08:00:38 PDT 2020
colllin added the bug and help wanted labels on Mar 3, 2021
akihironitta (Contributor) commented:

@colllin Thank you for reporting the issue. I confirmed the bug in 1.2.*, which did not occur in <1.2. I think this issue should be addressed by #5968 from @awaelchli.

rohitgr7 (Contributor) commented Mar 5, 2021

Closing this then. Duplicate of #5967.

rohitgr7 closed this as completed on Mar 5, 2021