
Mixed precision errors for GPT2 and other models #959

Closed

mattdangerw opened this issue Apr 4, 2023 · 2 comments
Assignees: mattdangerw
Labels: type:Bug Something isn't working

Comments

@mattdangerw
Member

Already have a pull request, just opening this to track! #958

@mattdangerw mattdangerw added the type:Bug Something isn't working label Apr 4, 2023
@mattdangerw mattdangerw self-assigned this Apr 4, 2023
@ADITYADAS1999
Contributor

> Already have a pull request, just opening this to track! #958

Hey @mattdangerw, can you explain this a bit?

@mattdangerw
Member Author

Sure thing! Basically, Keras supports a mixed precision mode that gives a substantial performance boost for GPU training: https://www.tensorflow.org/guide/mixed_precision

But supporting it requires making sure our TensorFlow ops do not assume a float32 computation dtype (under mixed precision, computations use 16-bit floating point numbers instead). The PR that just landed removed a few places where we were making too many assumptions about dtype and breaking the "mixed precision" workflow.
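To illustrate the general shape of the problem (a minimal sketch, not the actual change from #958): under the `mixed_float16` policy, Keras layers compute in float16 while keeping variables in float32, so any op that hard-codes a float32 tensor can fail when it meets float16 inputs. Casting to the input's dtype is the usual fix. The `apply_mask` helper below is hypothetical, just for illustration:

```python
import tensorflow as tf
from tensorflow import keras

# Keras layers now compute in float16 but keep variables in float32.
keras.mixed_precision.set_global_policy("mixed_float16")

# With float16 inputs, mixing in a hard-coded float32 tensor fails:
#   x * tf.constant(1.0, dtype=tf.float32)  # error: half * float
#
# Casting to the computation dtype of the inputs keeps the op working
# under both float32 and mixed precision:
def apply_mask(x, mask):
    return x * tf.cast(mask, x.dtype)

x = tf.random.uniform((2, 4), dtype=tf.float16)
mask = tf.constant([1.0, 0.0, 1.0, 1.0])  # defaults to float32
print(apply_mask(x, mask).dtype)  # float16
```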
