-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add small precision fast-path to Quo #98
Comments
So looking at the I benchmarked some decimal128 code based on https://software.intel.com/content/www/us/en/develop/articles/intel-decimal-floating-point-math-library.html by the way and division is comparable to multiplication there. So it must be possible for apd to have a more efficient |
Hi @aaronc, thanks for the report. |
Fixes cockroachdb#98. This commit speeds up `Context.Quo`, replacing its per-digit long division with a call to `BigInt.QuoRem`. This avoids per-digit iteration. It also allows `Context.Quo` to benefit further from the inline fast-path of `BigInt.QuoRem` for small coefficient values that fit in a uint64. This is a partial revert of 1eddda3, at least in spirit. Before that commit, `Context.Quo` did try to use `big.Int.Quo`. Unfortunately, it was inaccurate for certain values because it was not scaling the coefficients correctly before dividing. It tried to compensate for this by using a very large precision (`c.Precision*2 + 8`), but this was insufficient on its own because it was not taking the size of the values into account. So the commit abandoned that approach. However, that commit, which was based on the description given on the GDA site, did demonstrate how to scale the coefficients in a way that would permit this kind of division. Specifically, it began adjusting the operands so that the coefficient of the dividend was greater than or equal to the coefficient of the divisor and was also less than ten times the coefficient of the divisor. This commit uses the coefficient adjustment introduced in 1eddda3 to revive the call into `BigInt.QuoRem`. With the two operands adjusted, it now is possible to scale the coefficient of the dividend by the desired precision and then perform a direct division on the operands.
I've been doing some benchmarks to compare APD's performance vs other approaches we're considering (cosmos/cosmos-sdk#7772).
It seems that
Quo
operations perform pretty poorly compare toAdd
,Mul
, etc.:This benchmark is basically just perform a single operation on two random floating point numbers (different ones each round). I'm using a
Context
configured withdecimal128
parameters (i.e.Precision = 34
, etc.)Is
Quo
inherently expected to be 40x slower than other operations or could there be a performance bottleneck?The text was updated successfully, but these errors were encountered: