Update variance/stddev to work with single values #4847
Following Postgres:
- var/stddev of single element is NULL
- var_pop/stddev_pop of single element is 0
LGTM, just one minor inline comment.
    Ok(ScalarValue::Float64(match self.count {
        0 => None,
        1 => {
            if matches!(self.stats_type, StatsType::Population) {
Nit: In this case I think `if let` would be slightly more idiomatic.
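For reference, the two spellings being compared could look roughly like the sketch below. It is a standalone illustration, not the PR's code: the `StatsType` enum here is a local stand-in named after the one in the diff, and both helper functions are hypothetical.

```rust
// Local stand-in for the enum referenced in the diff (an assumption for this sketch).
#[derive(Clone, Copy)]
enum StatsType {
    Population,
    Sample,
}

// Hypothetical helper: the single-value case written with `matches!`.
fn single_value_with_matches(stats_type: StatsType) -> Option<f64> {
    if matches!(stats_type, StatsType::Population) {
        Some(0.0)
    } else {
        None
    }
}

// Hypothetical helper: the same branch written with `if let`.
fn single_value_with_if_let(stats_type: StatsType) -> Option<f64> {
    if let StatsType::Population = stats_type {
        Some(0.0)
    } else {
        None
    }
}
```

Both forms behave identically here; the choice is purely stylistic.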
done in bc65eeb
Hmm, some tests seem to be failing -- needs some investigation
Ok, the tests are sorted out now. Thanks for taking a look!
Thanks @jonmmease -- looks great to me
@ozankabak -- thank you as well for your help reviewing. It is really appreciated.
Benchmark runs are scheduled for baseline = dcd52ee and contender = 3d75bb8. 3d75bb8 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
* Update variance/stddev to work with single values
  Following Postgres:
  - var/stddev of single element is NULL
  - var_pop/stddev_pop of single element is 0
* Fix tests
* matches! to if let
* fix test_stddev_1_input test
(cherry picked from commit 3d75bb8)
Which issue does this PR close?
Closes #4843.
Rationale for this change
Rather than crash when performing a variance-based aggregation on a single value, follow Postgres' semantics:
- `var`/`stddev` of single element is `NULL`
- `var_pop`/`stddev_pop` of single element is 0

There's documentation of this behavior in Snowflake (https://docs.snowflake.com/en/sql-reference/functions/stddev.html), and it's what Postgres does, but in searching briefly I haven't found this behavior described in the Postgres docs.
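The semantics described above can be sketched as a small standalone function. This is an illustration under stated assumptions, not the accumulator code from this PR: `StatsType` is a local stand-in, `variance` and `stddev` are hypothetical helpers, `None` stands in for SQL `NULL`, and a plain two-pass computation is used for clarity.

```rust
// Local stand-in for the population/sample distinction (an assumption for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
enum StatsType {
    Population,
    Sample,
}

// Hypothetical helper: returns None where SQL would return NULL.
fn variance(values: &[f64], stats_type: StatsType) -> Option<f64> {
    match (values.len(), stats_type) {
        // No input rows: NULL for both flavours.
        (0, _) => None,
        // Single value: population variance is 0, sample variance is NULL.
        (1, StatsType::Population) => Some(0.0),
        (1, StatsType::Sample) => None,
        // Two or more values: simple two-pass computation.
        (count, _) => {
            let n = count as f64;
            let mean = values.iter().sum::<f64>() / n;
            let sum_sq: f64 = values.iter().map(|v| (v - mean).powi(2)).sum();
            let divisor = match stats_type {
                StatsType::Population => n,
                StatsType::Sample => n - 1.0,
            };
            Some(sum_sq / divisor)
        }
    }
}

// Hypothetical helper: standard deviation inherits the NULL handling from `variance`.
fn stddev(values: &[f64], stats_type: StatsType) -> Option<f64> {
    variance(values, stats_type).map(f64::sqrt)
}
```

With this sketch, `variance(&[1.0], StatsType::Sample)` yields `None` and `variance(&[1.0], StatsType::Population)` yields `Some(0.0)`, matching the one-element cases listed above.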
Are these changes tested?
I added a pair of sqllogictest tests for the one and two element cases.
Are there any user-facing changes?
Queries that used to crash when aggregating single values with variance-based aggregations will no longer crash.