RMSNorm Implementation #101
Comments
@gdevos010 Hi Greg, it is a bit subtle, but the only difference is that ScaleNorm has one shared gamma multiplier across the entire feature dimension, while RMSNorm has a gamma in the same dimension as the model dimension: https://github.com/lucidrains/x-transformers/blob/main/x_transformers/x_transformers.py#L352 vs https://github.com/lucidrains/x-transformers/blob/main/x_transformers/x_transformers.py#L363
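To make the distinction concrete, here is a minimal NumPy sketch (not the repository's exact code, which is PyTorch) of the two norms; the function names and the `g` parameter shapes are illustrative. Both normalize by the root mean square of the features, and the only difference is that ScaleNorm's learned gain is a single scalar while RMSNorm's is a vector with one entry per feature:

```python
import numpy as np

def scale_norm(x, g, eps=1e-8):
    """ScaleNorm: normalize by the RMS of the features, then scale by
    ONE learned scalar g shared across the entire feature dimension."""
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True))
    return x / np.maximum(rms, eps) * g  # g is a scalar

def rms_norm(x, g, eps=1e-8):
    """RMSNorm: identical normalization, but g is a learned VECTOR
    with one entry per feature (the model dimension)."""
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True))
    return x / np.maximum(rms, eps) * g  # g has shape (dim,)

x = np.array([[3.0, 4.0]])
# With g = 1 (scalar) and g = ones(dim) the two coincide;
# RMSNorm's extra capacity is the per-feature scale.
print(np.allclose(scale_norm(x, 1.0), rms_norm(x, np.ones(2))))
```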
@gdevos010 I would recommend RMSNorm, as it has been proven in a number of large language models out of DeepMind
Thanks, that makes sense. However, looking at the ScaleNorm paper, I'm wondering whether this scaling is needed; it seems to be 1 in the paper (referring to Eq. (5) there), but I might be missing something, of course.
@hrzn ohh, actually yes, that appears to be an error on my part! Thank you for catching it!
@hrzn @gdevos010 here is a paper that does some head-to-head runs of the different types of normalization: https://arxiv.org/abs/2102.11972. It may be informative for you two.
Oh nice, thanks. That's a very welcome paper!
Hi lucidrains,
I was looking at adding ScaleNorm and RMSNorm to another repo, and the implementations look almost identical. I have linked to the official implementation below. Am I missing something about the implementation? Thanks for all the great work.
https://github.com/bzhangGo/rmsnorm