
Mixed precision errors for GPT2 and other models #959

Closed

mattdangerw opened this issue Apr 4, 2023 · 2 comments
Assignees: mattdangerw
Labels: type:Bug Something isn't working

Comments

@mattdangerw
Member

Already have a pull request, just opening this to track! #958

@mattdangerw mattdangerw added the type:Bug Something isn't working label Apr 4, 2023
@mattdangerw mattdangerw self-assigned this Apr 4, 2023
@ADITYADAS1999
Contributor

> Already have a pull request, just opening this to track! #958

Hey @mattdangerw, can you explain this a bit?

@mattdangerw
Member Author

Sure thing! Basically, Keras supports a mixed precision mode that gives a substantial performance boost for GPU training: https://www.tensorflow.org/guide/mixed_precision

But supporting it requires making sure our TensorFlow ops do not assume a float32 computation dtype (under mixed precision, computations use 16-bit floating point numbers instead). The PR that just landed removed a few places where we were making too many assumptions about dtype and breaking the "mixed precision" workflow.
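To illustrate the general shape of the problem (a minimal sketch, not the actual change from #958): under the `mixed_float16` policy, Keras layers compute in float16 while keeping variables in float32, so any op that hard-codes a float32 tensor can fail when it meets float16 inputs. Casting to the input's dtype is the usual fix. The `apply_mask` helper below is hypothetical, just for illustration:

```python
import tensorflow as tf
from tensorflow import keras

# Keras layers now compute in float16 but keep variables in float32.
keras.mixed_precision.set_global_policy("mixed_float16")

# With float16 inputs, mixing in a hard-coded float32 tensor fails:
#   x * tf.constant(1.0, dtype=tf.float32)  # error: half * float
#
# Casting to the computation dtype of the inputs keeps the op working
# under both float32 and mixed precision:
def apply_mask(x, mask):
    return x * tf.cast(mask, x.dtype)

x = tf.random.uniform((2, 4), dtype=tf.float16)
mask = tf.constant([1.0, 0.0, 1.0, 1.0])  # defaults to float32
print(apply_mask(x, mask).dtype)  # float16
```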
