-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-45876: Correctly rounded stdev() and pstdev() for the Decimal case #29828
Conversation
Update to 15 March
🤖 New build scheduled with the buildbot fleet by @rhettinger for commit 1a6c58d 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
🤖 New build scheduled with the buildbot fleet by @rhettinger for commit 152ed3f 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
Lib/statistics.py
Outdated
denominator = 1 << -q | ||
return numerator / denominator # Convert to float | ||
|
||
|
||
def _decimal_sqrt_of_frac(n: int, m: int) -> Decimal: | ||
"""Square root of n/m as a float, correctly rounded.""" | ||
# Premise: For decimal, computing (n/m).sqrt() can be off by 1 ulp. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, while the corresponding result is true for binary floating-point, it's not true for decimal floating-point: for decimal, the error can be more than 1 ulp.
Example using 3-digit precision:
>>> from decimal import Decimal, getcontext
>>> n, m = 209, 20
>>> getcontext().prec = 3
>>> (Decimal(n) / Decimal(m)).sqrt()
Decimal('3.22')
>>> getcontext().prec = 10
>>> (Decimal(n) / Decimal(m)).sqrt()
Decimal('3.232645975')
The correctly-rounded value here is around 1.265 ulps away, at 3.232645975...
However, I believe that one can show that the worst case error is at most 1.3ulp (actually 0.5 + sqrt(10)/4 = 1.290569... ulps), in which case it remains true that the correctly-rounded result will always be one of root
, root.next_plus()
or root.next_minus()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self-correction: I guess "off by <= 1ulp" is accurate, so long as it's clear that we're talking about the maximum difference between the computed value and the correctly-rounded value. I was thinking in terms of the max difference between the computed value and the true mathematical value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updating the comment.
"""Square root of n/m as a float, correctly rounded.""" | ||
# Premise: For decimal, computing (n/m).sqrt() can be off by 1 ulp. | ||
# Method: Check the result, moving up or down a step if needed. | ||
if n <= 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be checking m
here rather than n
? It looks as though we multiply through by m
below, so we need m
to be positive for the inequalities to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The case where m is zero is handled before the inequality test. The Decimal(n) / Decimal(m)
step raises DivisionByZero which is a subclass of ZeroDivisionError. There is a test for this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I was really more worried about the case m negative, not m zero. It seems more natural to have the normalization step n, m = -n, -m
ensure that the denominator is positive than that the numerator is positive: right now, we're effectively converting -2 / 5
to 2 / (-5)
, and what we care about below for the maths to work is that m
is positive, not that n
is positive: you're implicitly converting the inequality n / m > ((root + plus) / 2) ** 2
to the inequality n > ((root + plus) / 2)**2 * m
, and that conversion is only valid if m
is positive.
It doesn't matter, because if just one of n
and m
is negative then the (Decimal(n) / Decimal(m)).sqrt()
step will fail (except in the case of really extreme m
where the division rounds from something tiny and negative to negative zero); it just seems like a surprising choice and for me at least it made the code harder to read and reason about.
Lib/statistics.py
Outdated
denominator = 1 << -q | ||
return numerator / denominator # Convert to float | ||
|
||
|
||
def _decimal_sqrt_of_frac(n: int, m: int) -> Decimal: | ||
"""Square root of n/m as a float, correctly rounded.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copypasta: "as a float"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
🤖 New build scheduled with the buildbot fleet by @rhettinger for commit 309cb0a 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
* main: (21 commits) bpo-45876: Have stdev() also use decimal specific square root. (pythonGH-29869) bpo-45876: Correctly rounded stdev() and pstdev() for the Decimal case (pythonGH-29828) bpo-45711: Change exc_info related APIs to derive type and traceback from the exception instance (pythonGH-29780) bpo-30533:Add function inspect.getmembers_static that does not call properties or dynamic properties. (python#20911) bpo-45476: Disallow using asdl_seq_GET() as l-value (pythonGH-29866) bpo-45476: Add _Py_RVALUE() macro (pythonGH-29860) bpo-33381: [doc] strftime's %f option may pad zeros on the left or the right (pythonGH-29801) Fix EncodingWarning in Tools/freeze/test/freeze.py (pythonGH-29742) no-issue: remove unused import from test_graphlib.py (pythonGH-29853) bpo-45931: Prevent Directory.Build.props/targets from leaking from directories above the repo when building on Windows (pythonGH-29854) bpo-45653: fix test_embed on windows (pythonGH-29814) bpo-45917: Add math.exp2() method - return 2 raised to the power of x (pythonGH-29829) bpo-43905: Expand dataclasses.astuple() and asdict() docs (pythonGH-26154) bpo-44391: Remove unused argument from a varargs call. (pythonGH-29843) bpo-45881: configure --with-freeze-module --with-build-python (pythonGH-29835) bpo-45847: PY_STDLIB_MOD_SIMPLE now checks py_stdlib_not_available (pythonGH-29844) bpo-45828: Use unraisable exceptions within sqlite3 callbacks (FH-29591) bpo-40280: Emscripten systems use .wasm suffix by default (pythonGH-29842) bpo-45723: Sort the grand AC_CHECK_HEADERS check (pythonGH-29846) bpo-45847: Make socket module conditional (pythonGH-29769) ...
* main: (21 commits) bpo-45876: Have stdev() also use decimal specific square root. (pythonGH-29869) bpo-45876: Correctly rounded stdev() and pstdev() for the Decimal case (pythonGH-29828) bpo-45711: Change exc_info related APIs to derive type and traceback from the exception instance (pythonGH-29780) bpo-30533:Add function inspect.getmembers_static that does not call properties or dynamic properties. (python#20911) bpo-45476: Disallow using asdl_seq_GET() as l-value (pythonGH-29866) bpo-45476: Add _Py_RVALUE() macro (pythonGH-29860) bpo-33381: [doc] strftime's %f option may pad zeros on the left or the right (pythonGH-29801) Fix EncodingWarning in Tools/freeze/test/freeze.py (pythonGH-29742) no-issue: remove unused import from test_graphlib.py (pythonGH-29853) bpo-45931: Prevent Directory.Build.props/targets from leaking from directories above the repo when building on Windows (pythonGH-29854) bpo-45653: fix test_embed on windows (pythonGH-29814) bpo-45917: Add math.exp2() method - return 2 raised to the power of x (pythonGH-29829) bpo-43905: Expand dataclasses.astuple() and asdict() docs (pythonGH-26154) bpo-44391: Remove unused argument from a varargs call. (pythonGH-29843) bpo-45881: configure --with-freeze-module --with-build-python (pythonGH-29835) bpo-45847: PY_STDLIB_MOD_SIMPLE now checks py_stdlib_not_available (pythonGH-29844) bpo-45828: Use unraisable exceptions within sqlite3 callbacks (FH-29591) bpo-40280: Emscripten systems use .wasm suffix by default (pythonGH-29842) bpo-45723: Sort the grand AC_CHECK_HEADERS check (pythonGH-29846) bpo-45847: Make socket module conditional (pythonGH-29769) ...
https://bugs.python.org/issue45876