Add initial how_to_scale_op notebook #65

DouglasOrr · 2024-08-14T08:11:11Z

My thoughts for the blog post were to keep it pretty short, with some key figures & results, rather than repeating all of this there.

Feedback of any sort is very welcome - thanks in advance!

thecharlieblake

This looks great - thanks so much! A few things came to mind, though I appreciate the blog version might address some of it anyway:

A bit more context for the uninitiated might be nice up-front - i.e. a paragraph on what u-µP is for someone who's never heard of it
I had never heard of hardtanh before this example - a sentence introducing it and why you chose it might help the reader
Also a sentence explaining that our library has these lovely scale_fwd and scale_bwd functions
Generally I'm finding the maths a bit hard to follow - particularly the jump to E[y^2]. I think we might benefit from spelling this out quite a bit more explicitly, even if it is more verbose. I'm guessing most ML people won't have done much of this kind of analysis before.
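
To illustrate the kind of spelling-out meant in the last point, here is a minimal sketch of one way the E[y^2] step could be made explicit for hardtanh. It is not taken from the notebook: it assumes the input is unit normal, x ~ N(0, 1), that hardtanh clamps to its default range [-1, 1], and that the aim is a multiplier giving unit RMS. All function names below are hypothetical, and only the Python standard library is used. Splitting the expectation into the linear region and the clamped region gives E[y^2] = 1 - 2φ(1), where φ is the standard normal density, and the gradient is 1 inside (-1, 1) and 0 outside, so E[(dy/dx)^2] = P(|x| < 1).

```python
import math
import random


def hardtanh(x: float) -> float:
    """Clamp x to [-1, 1] (assumed default hardtanh range)."""
    return max(-1.0, min(1.0, x))


def hardtanh_second_moment() -> float:
    """Closed-form E[y^2] for y = hardtanh(x), assuming x ~ N(0, 1).

    E[y^2] = E[x^2; |x| < 1] + P(|x| >= 1)
           = (erf(1/sqrt(2)) - 2*phi(1)) + (1 - erf(1/sqrt(2)))
           = 1 - 2*phi(1)
    """
    phi_1 = math.exp(-0.5) / math.sqrt(2.0 * math.pi)  # N(0, 1) density at 1
    return 1.0 - 2.0 * phi_1


def hardtanh_grad_second_moment() -> float:
    """E[(dy/dx)^2]: the gradient is 1 for |x| < 1 and 0 otherwise."""
    return math.erf(1.0 / math.sqrt(2.0))  # P(|x| < 1) for x ~ N(0, 1)


def monte_carlo_second_moment(n: int, seed: int = 0) -> float:
    """Empirical check of E[y^2] by sampling x ~ N(0, 1)."""
    rng = random.Random(seed)
    return sum(hardtanh(rng.gauss(0.0, 1.0)) ** 2 for _ in range(n)) / n


# Multipliers that would restore unit RMS under the assumptions above.
fwd_scale = 1.0 / math.sqrt(hardtanh_second_moment())
bwd_scale = 1.0 / math.sqrt(hardtanh_grad_second_moment())

if __name__ == "__main__":
    print(f"E[y^2] closed form : {hardtanh_second_moment():.4f}")
    print(f"E[y^2] Monte Carlo : {monte_carlo_second_moment(200_000):.4f}")
    print(f"forward scale      : {fwd_scale:.4f}")
    print(f"backward scale     : {bwd_scale:.4f}")
```

The two constants are the sort of values one might then hand to scale_fwd and scale_bwd, though how the notebook actually applies them is not shown in this thread.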

Add initial how_to_scale_op notebook

352778d

DouglasOrr requested a review from thecharlieblake August 14, 2024 08:11

DouglasOrr self-assigned this Aug 14, 2024

Add copyright header

ef6c80c

thecharlieblake reviewed Aug 14, 2024

Address PR feedback

a91e833

DouglasOrr force-pushed the how-to-scale-notebook branch from 3452219 to a91e833 August 15, 2024 14:59

Tidy-up

ffe9092

thecharlieblake approved these changes Aug 19, 2024

thecharlieblake merged commit a85d806 into main Aug 19, 2024
1 check passed
