Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

There should be a warning or an exception if using amp_backend="native" and setting an amp_level (it doesn't use it) #9739

Closed
JulesGM opened this issue Sep 28, 2021 · 1 comment · Fixed by #9755
Labels
feature Is an improvement or enhancement

Comments

@JulesGM
Copy link
Contributor

JulesGM commented Sep 28, 2021

🚀 Feature

There should be a warning or an exception if using Trainer(amp_backend="native") and setting a value to the amp_level argument (of Trainer).

Motivation

It's not obvious for someone who hasn't looked into the code or who knows a lot about Apex that using Pytorch's native amp backend that the amp_level argument will have no effect, giving a false understanding of what is actually happening. Knowing how much of the model was made fp16 is really important to understand what is happening in the training, to make training reproducible.

With amp_type="apex" (amp_level used):
https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/trainer/connectors/accelerator_connector.py#L586
With amp_type="native" (not used):
https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/trainer/connectors/accelerator_connector.py#L573

Pitch

There should be an exception or a warning added, maybe around https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/trainer/connectors/accelerator_connector.py#L573, when Trainer(amp_level=...) is set and Trainer(amp_backend="native"). Again, Knowing how much of the model was made fp16 is really important to understand what is happening in the training, to make training reproducible.

Alternatives

Rename amp_level to apex_level to make it more obvious that it only works with the apex backend. Breaks backwards compatibility.

@JulesGM JulesGM added the feature Is an improvement or enhancement label Sep 28, 2021
@JulesGM JulesGM changed the title There should be a warning or an exception if using the native amp backend and setting an amp_level (it doesn't use it) There should be a warning or an exception if using the amp_backend="native" and setting an amp_level (it doesn't use it) Sep 28, 2021
@JulesGM JulesGM changed the title There should be a warning or an exception if using the amp_backend="native" and setting an amp_level (it doesn't use it) There should be a warning or an exception if using amp_backend="native" and setting an amp_level (it doesn't use it) Sep 28, 2021
@JulesGM
Copy link
Contributor Author

JulesGM commented Sep 28, 2021

To keep the default amp_level behavior, the default of amp_level could be None in the constructor and set to O2 after amp_backend has been verified to be apex. Afaik this is a pretty standard way of doing things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Is an improvement or enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant