⚡️ A Cheaper Sqrt Function #212

Gaussian-Process · 2022-04-18T20:34:16Z

Description

This modifies the existing sqrt implementation to remove the final three iterations from the binary search and add a linear estimate given the intermediate value. The estimate should be within a factor of ~3 of the true sqrt, which is still enough that 7 Babylonian method iterations suffice.
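For readers who want to experiment with the idea, here is a rough Python model of the approach described above: a shortened binary search, a linear estimate, then seven Babylonian iterations. It is a sketch, not the contract code. The four branch thresholds (2^136, 2^72, 2^40, 2^24), the helper `evm_div`, and the final round-down step are assumptions chosen to be consistent with the snippets quoted in the review below.

```python
import math


def evm_div(a: int, b: int) -> int:
    """EVM-style division: dividing by zero yields 0 instead of raising."""
    return 0 if b == 0 else a // b


def sqrt_model(x: int) -> int:
    """Floor square root of a 256-bit unsigned integer (illustrative model)."""
    assert 0 <= x < 1 << 256
    y = x
    z = 181  # starting at 181 folds the estimate's numerator into z

    # Shortened binary search: four branches (assumed thresholds 2^(shift + 8)).
    for shift in (128, 64, 32, 16):
        if y >= 1 << (shift + 8):
            y >>= shift
            z <<= shift // 2

    # Linear estimate, mirroring z := shr(18, mul(z, add(y, 65536))).
    z = (z * (y + 65536)) >> 18

    # Seven Babylonian (Newton) iterations.
    for _ in range(7):
        z = (z + evm_div(x, z)) >> 1

    # Round down if the iteration settled one above the floor (assumed final step).
    return z - (1 if evm_div(x, z) < z else 0)


if __name__ == "__main__":
    for x in (0, 1, 255, 256, 10**18, 2**256 - 1):
        print(x, sqrt_model(x), math.isqrt(x))
```

Comparing against `math.isqrt` on edge values such as powers of two and perfect squares plus or minus one is a quick way to sanity-check any change to the estimate or the iteration count.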

Checklist

Ensure you completed all of the steps below before submitting your pull request:

Ran forge snapshot?
Ran npm run lint?
Ran forge test?

Pull requests with an incomplete checklist will be thrown out.

transmissions11 · 2022-04-18T21:05:59Z

.gas-snapshot

@@ -141,7 +141,7 @@ FixedPointMathLibTest:testMulWadDownEdgeCases() (gas: 886)
 FixedPointMathLibTest:testMulWadUp() (gas: 959)
 FixedPointMathLibTest:testMulWadUpEdgeCases() (gas: 1002)
 FixedPointMathLibTest:testRPow() (gas: 2142)
-FixedPointMathLibTest:testSqrt() (gas: 2537)
+FixedPointMathLibTest:testSqrt() (gas: 2156)


transmissions11 · 2022-04-19T00:17:46Z

src/utils/FixedPointMathLib.sol

-            z := shr(12, mul(z, add(y, 65536))) // A multiply by 3 is saved from the initial z := 3
+            // The estimate sqrt(x) = (181/1024) * (x+1) is off by a factor of ~2.83 both when x=1
+            // and when x = 256 or 1/256. In the worst case, this needs seven Babylonian iterations.
+            z := shr(18, mul(z, add(y, 65536))) // A multiply is saved from the initial z := 181


is there overflow risk here?

Nope, since y is guaranteed to be smaller than 2^136 after the first branch above. I've made this clearer in a comment
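The overflow argument can be checked with plain integer arithmetic. The sketch below assumes the binary search shifts z left by 64, 32, 16, and 8 bits (so z is at most 181 * 2^120) and leaves y below 2^24; those shift amounts are assumptions, not quoted from the diff. The first bound deliberately pairs the largest possible z and y after the first branch, so it is loose but safe.

```python
UINT256_MAX = (1 << 256) - 1

# Loose bound right after the first branch: y < 2^136 and z <= 181 * 2^64.
after_first = (181 << 64) * ((1 << 136) - 1 + 65536)

# Bound after all four assumed branches: y < 2^24 and z <= 181 * 2^120.
after_all = (181 << 120) * ((1 << 24) - 1 + 65536)

# Both products fit comfortably in a 256-bit word.
print(after_first <= UINT256_MAX, after_all <= UINT256_MAX)
print(after_first.bit_length(), after_all.bit_length())
```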

transmissions11 · 2022-04-19T01:53:22Z

src/utils/FixedPointMathLib.sol

+            // The estimate sqrt(x) = (181/1024) * (x+1) is off by a factor of ~2.83 both when x=1
+            // and when x = 256 or 1/256. In the worst case, this needs seven Babylonian iterations.
+            // There is no overflow risk here since y < 2^136 after the first branch above.
+            z := shr(18, mul(z, add(y, 65536))) // A multiply is saved from the initial z := 181


forgive me for the stupid questions, but where do 18 and 65536 come from? where's the 1024 that's mentioned in the comment?

I'll make this clearer in a comment too

awesome ty!
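One way to see where the constants come from, assuming the binary search leaves y in [2^8, 2^24), which is consistent with the [1/256, 256) range mentioned later in this review: write a = y / 65536, so that a lies in [1/256, 256), and apply the estimate sqrt(a) ≈ (181/1024) * (a + 1). The symbol a is introduced here only for illustration.

```latex
\sqrt{y} \;=\; \sqrt{65536}\,\sqrt{a}
\;\approx\; 256 \cdot \frac{181}{1024}\,(a + 1)
\;=\; \frac{181}{4} \cdot \frac{y + 65536}{65536}
\;=\; \frac{181\,(y + 65536)}{2^{18}}
```

So 65536 = 2^16 is the rescaling constant, the shift of 18 is 16 + 2 (the 2 comes from 1024 / 256 = 4), and the factor 181 is already carried by z because z starts at 181.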

transmissions11 · 2022-04-19T01:56:37Z

src/utils/FixedPointMathLib.sol

+            // Correctness can be checked exhaustively for x < 256, so we assume y >= 256.
+            // Then z*sqrt(y) is within sqrt(257)/sqrt(256) of sqrt(x), or about 20bps.
+
+            // The estimate sqrt(x) = (181/1024) * (x+1) is off by a factor of ~2.83 both when x=1


am i missing something or is it only off by ~0.646 at 1?

https://www.desmos.com/calculator/k63qkvmg4m

and a lot more at 256

so e.g. at x=1, the estimate is 0.3535, which is ~1/2.83 of the correct value, i.e. off by a multiplicative factor of 2.83
I'll make this a bit clearer in a comment

ahhhh gotcha my bad not familiar with that terminology
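The difference between additive and multiplicative error is easy to check numerically. The following sketch evaluates the multiplicative error factor of the estimate c * (s + 1) against sqrt(s); the function names and the log-spaced sampling grid are illustrative choices, not part of the PR.

```python
import math


def error_factor(c: float, s: float) -> float:
    """Multiplicative error of the estimate c * (s + 1) relative to sqrt(s)."""
    est = c * (s + 1)
    true = math.sqrt(s)
    return max(est / true, true / est)


def worst_factor(c: float, lo: float, hi: float, samples: int = 20001) -> float:
    """Largest error factor on a log-spaced grid covering [lo, hi]."""
    ratio = hi / lo
    return max(
        error_factor(c, lo * ratio ** (i / (samples - 1)))
        for i in range(samples)
    )


if __name__ == "__main__":
    c = 181 / 1024
    print(error_factor(c, 1.0))          # roughly 2.83 (an underestimate at s = 1)
    print(error_factor(c, 256.0))        # roughly 2.84 (an overestimate at s = 256)
    print(worst_factor(c, 1 / 256, 256)) # worst case over the whole range
```

At s = 1 the additive error is about 0.646, but the estimate is about 1/2.83 of the true value, which is the quantity that matters for how fast the Babylonian method converges.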

transmissions11 · 2022-04-19T02:06:08Z

src/utils/FixedPointMathLib.sol

+            z := shr(18, mul(z, add(y, 65536))) // A multiply is saved from the initial z := 181
+
+            // Run the Babylonian method seven times. This should be enough given initial estimate.
+            // Possibly with a quadratic/cubic polynomial above we could get 4-6.


what's the reason not to do this haha? esp. if we could keep the iterations and remove more parts of the binary search (the jumps for the ifs are probably the most expensive part here)

so I think removing even one more binary search branch is a bit tricky - that would increase the range we need to estimate sqrt on from [1/256, 256) to [1/65536, 65536).

In that range, the best linear estimate is then off by a factor of ~12 or so, and would need 10 Babylonian iterations.
If a quadratic estimate were within a factor of ~5.5 across the whole range, 8 Babylonian iterations would be enough, and maybe that's worth it?

I don't know a good way to find the best quadratic/cubic polynomials, so I'm not sure what their error bounds would look like

ah yeahh that bounds increase sounds like a PITA, this is already great can explore that at another time hehe

I tried one fewer binary search branch, using the linear estimate with coefficient 45/1024, whose max error factor on [1/65536, 65536) is smaller than 11.4. I think that is just enough for 9 Babylonian iterations to suffice, but it looks like it uses slightly more gas, unfortunately.

ah interesting, thanks for investigating!
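The relationship between the initial error factor and the number of Babylonian iterations discussed in this thread can be explored with a small exact-arithmetic model. This is a sketch under stated assumptions: it works with real (rational) numbers rather than EVM integer division, and it treats the result as converged once the relative error drops below 2^-128, so that the absolute error is below 1 for any 256-bit input. The function name and the tolerance are illustrative choices.

```python
from fractions import Fraction


def iterations_needed(factor: str, bits: int = 128) -> int:
    """Newton steps until z / sqrt(x) is within 2^-bits of 1.

    `factor` is the initial multiplicative error z0 / sqrt(x). An estimate that
    is too small by the same factor behaves identically after the first step,
    because r and 1/r map to the same value.
    """
    r = Fraction(factor)           # r = z / sqrt(x), tracked exactly
    tol = Fraction(1, 1 << bits)
    steps = 0
    while abs(r - 1) >= tol:
        r = (r + 1 / r) / 2        # one Babylonian step, expressed in terms of r
        steps += 1
    return steps


if __name__ == "__main__":
    for factor in ("2.84", "5.5", "11.4", "12"):
        print(factor, iterations_needed(factor))
```

Under this model the counts come out as 7, 8, 9, and 10 for factors 2.84, 5.5, 11.4, and 12, in line with the numbers quoted in the thread. The 11.4 case clears the tolerance only narrowly, which matches the "just enough" remark.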

transmissions11 · 2022-04-19T02:29:32Z

amazing work thank you so much @Gaussian-Process, will get this merged asap!

Gaussian-Process added 3 commits April 18, 2022 13:26

Reduce sqrt gas by stopping binary search early and using a linear estimate

ec028fc

npm run lint

5809217

update .gas-snapshot

69f2748

transmissions11 reviewed Apr 18, 2022


Gaussian-Process added 2 commits April 18, 2022 14:59

use a more symmetric linear estimate

258f0ca

fix typo

4eaefdb

transmissions11 reviewed Apr 19, 2022


add comment explaining that overflow is impossible

837de01

transmissions11 reviewed Apr 19, 2022


improve the comment explaining the linear estimate

359bc92

transmissions11 changed the base branch from main to v7 May 12, 2022 22:05

transmissions11 changed the title ~~A Cheaper Sqrt Function~~ ⚡️ A Cheaper Sqrt Function May 12, 2022

transmissions11 merged commit e1677c9 into transmissions11:v7 May 12, 2022
