Add soft capping to reversible embedding layer #1718
Merged
Forgetting the final output soft-cap is an easy mistake to make, and worse, generations without the soft-cap will still look plausible, just with worse actual results.
Adding soft-capping to our reversible embedding layer is much more robust: as long as you use the layer to compute logits over the vocab, you can no longer forget the soft-cap.
Before this fix, we were missing it from our actual CausalLM functional model output, meaning soft-capping was not applied during training!
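For context, the soft-cap transform squashes logits into (-cap, cap) via `cap * tanh(logits / cap)`. Below is a minimal sketch of how a reversible embedding layer can apply it on the reverse (logit-computing) pass, assuming Keras 3 ops; the class and the `logit_soft_cap` argument name here are illustrative, not the exact library API.

```python
import keras
from keras import ops


class ReversibleEmbeddingSketch(keras.layers.Layer):
    """Sketch of a reversible embedding that soft-caps its output logits."""

    def __init__(self, vocabulary_size, hidden_dim, logit_soft_cap=None, **kwargs):
        super().__init__(**kwargs)
        self.vocabulary_size = vocabulary_size
        self.hidden_dim = hidden_dim
        # Hypothetical argument name for this example.
        self.logit_soft_cap = logit_soft_cap

    def build(self, inputs_shape):
        self.embeddings = self.add_weight(
            shape=(self.vocabulary_size, self.hidden_dim),
            initializer="uniform",
            name="embeddings",
        )

    def call(self, inputs, reverse=False):
        if not reverse:
            # Forward pass: token ids -> embedding vectors.
            return ops.take(self.embeddings, inputs, axis=0)
        # Reverse pass: hidden states -> logits over the vocabulary.
        logits = ops.matmul(inputs, ops.transpose(self.embeddings))
        if self.logit_soft_cap is not None:
            # Soft capping lives inside the layer, so any caller computing
            # vocab logits through the reverse pass gets it automatically.
            logits = self.logit_soft_cap * ops.tanh(logits / self.logit_soft_cap)
        return logits
```

Because the cap is applied inside the reverse pass itself, a CausalLM head built on this layer cannot silently skip it.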