
Remove training_step returned None user warning when automatic_optimization=False #6339

Closed
timothybrooks opened this issue Mar 3, 2021 · 7 comments · Fixed by #6139
Labels: bug (Something isn't working), help wanted (Open to be worked on)

Comments

@timothybrooks

🐛 Bug

When using manual optimization by setting the property automatic_optimization=False, it is not necessary (and sometimes undesirable) for training_step() to return a loss output. For example, for GANs or other models with complex multi-optimizer setups, manual optimization can be preferable or even necessary for correct behavior. The user warning should therefore be omitted in this case.

Since the user calls the update step themselves (self.manual_backward(); optimizer.step()), the training step does not need to return a loss: PL does not use the returned value to update weights. Furthermore, logging the returned loss is unhelpful for certain complex multi-optimizer setups, since aggregating losses from different optimizers is not desired.

In short, since the updates and logging when using automatic_optimization=False are non-standard and do not always involve returning a loss, I believe the UserWarning("Your training_step returned None. Did you forget to return an output?") should be omitted in the case of manual optimization.

Basic example for GAN:

@property
def automatic_optimization(self) -> bool:
  return False

def training_step(...):
  loss_d = self.step_and_optimize_d(...)
  self.log("loss_d", loss_d)

  loss_g = self.step_and_optimize_g(...)
  self.log("loss_g", loss_g)

  # Note that the training step does not return a value.
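
For context, here is a fuller, self-contained sketch of what the helpers above might do under manual optimization. The toy generator/discriminator, the batch format (data, label), and the loss terms are illustrative assumptions and not part of the original report; only the manual-optimization calls (self.optimizers(), self.manual_backward(), optimizer.step()) are the point:

import torch
import torch.nn.functional as F
from torch import nn
import pytorch_lightning as pl


class GAN(pl.LightningModule):
  def __init__(self, latent_dim: int = 100, data_dim: int = 784):
    super().__init__()
    # Toy networks just to keep the sketch self-contained.
    self.generator = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim))
    self.discriminator = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(), nn.Linear(256, 1))
    self.latent_dim = latent_dim

  @property
  def automatic_optimization(self) -> bool:
    return False

  def configure_optimizers(self):
    opt_d = torch.optim.Adam(self.discriminator.parameters(), lr=2e-4)
    opt_g = torch.optim.Adam(self.generator.parameters(), lr=2e-4)
    return opt_d, opt_g

  def training_step(self, batch, batch_idx):
    real, _ = batch  # assumes a (data, label) batch of flattened samples
    opt_d, opt_g = self.optimizers()
    z = torch.randn(real.size(0), self.latent_dim, device=self.device)
    ones = torch.ones(real.size(0), 1, device=self.device)
    zeros = torch.zeros(real.size(0), 1, device=self.device)

    # Discriminator update: backward/step are called manually, so PL never
    # needs a loss to be returned from training_step.
    fake = self.generator(z).detach()
    loss_d = F.binary_cross_entropy_with_logits(self.discriminator(real), ones) \
        + F.binary_cross_entropy_with_logits(self.discriminator(fake), zeros)
    opt_d.zero_grad()
    self.manual_backward(loss_d)
    opt_d.step()
    self.log("loss_d", loss_d)

    # Generator update.
    loss_g = F.binary_cross_entropy_with_logits(self.discriminator(self.generator(z)), ones)
    opt_g.zero_grad()
    self.manual_backward(loss_g)
    opt_g.step()
    self.log("loss_g", loss_g)

    # Intentionally returns None; the warning should not fire here.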
timothybrooks added the bug (Something isn't working) and help wanted (Open to be worked on) labels on Mar 3, 2021
@carmocca
Contributor

carmocca commented Mar 4, 2021

@timothybrooks
Author

I am using the stable release 1.2, so if it has since been addressed on master, that is great. In my version this block is nearly the same, but line 743 uses self.trainer.train_loop.automatic_optimization instead of self.automatic_optimization. Perhaps this is the cause of the issue, since automatic_optimization migrated from a trainer flag to a module property? In any case, if the warnings are correctly omitted on master during manual optimization, then it is no longer an issue.
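
For illustration only, the guard being discussed has roughly this shape. This is a simplified sketch keyed off the module property, not the actual Lightning source, and the function name is hypothetical:

import warnings

def check_training_step_output(lightning_module, output):
  # Simplified sketch (not the actual Lightning source): the "returned None"
  # warning should only fire when PL itself consumes the returned loss, i.e.
  # when the LightningModule property automatic_optimization is True.
  if output is None and lightning_module.automatic_optimization:
    warnings.warn(
      "Your training_step returned None. Did you forget to return an output?",
      UserWarning,
    )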

@rohitgr7
Contributor

rohitgr7 commented Mar 5, 2021

@timothybrooks are you training in a distributed environment (DDP, DP, ...)?

@timothybrooks
Author

timothybrooks commented Mar 5, 2021 via email

rohitgr7 changed the title from "Remove straining_step returned None user warning when automatic_optimization=False" to "Remove training_step returned None user warning when automatic_optimization=False" on Mar 6, 2021
rohitgr7 linked a pull request on Mar 6, 2021 that will close this issue
@timothybrooks
Author

Hello, I wanted to check in since I am still experiencing this bug in 1.2.6 when automatic_optimization=False and the training step returns None. Since the issue has been closed for a few weeks, I wanted to confirm that it is still planned to be addressed. Thanks!

@akihironitta
Contributor

akihironitta commented Apr 3, 2021

@timothybrooks I think we somehow missed the fix #6139. It seems the fix is blocked by #4945, so it should be included in the next release, 1.3.

@timothybrooks
Author

Got it, thanks for the update. Looking forward to it!
