Fix `Count_Distinct` on a Boolean column in Snowflake #10611
Adding a special test to

    group_builder.specify "should work correctly with Boolean columns" <|
        table = table_builder [["A", [True, True, True]], ["B", [False, False, False]], ["C", [True, False, True]], ["D", [Nothing, False, True]]]

        t_with_nulls = table.aggregate columns=[..Count_Distinct "A", ..Count_Distinct "B", ..Count_Distinct "C", ..Count_Distinct "D"]
        m1 = materialize t_with_nulls
        m1.column_count . should_equal 4
        m1.at "Count Distinct A" . to_vector . should_equal [1]
        m1.at "Count Distinct B" . to_vector . should_equal [1]
        m1.at "Count Distinct C" . to_vector . should_equal [2]
        m1.at "Count Distinct D" . to_vector . should_equal [3]

        t_without_nulls = table.aggregate columns=[..Count_Distinct "A" ignore_nothing=True, ..Count_Distinct "B" ignore_nothing=True, ..Count_Distinct "C" ignore_nothing=True, ..Count_Distinct "D" ignore_nothing=True]
        m2 = materialize t_without_nulls
        m2.column_count . should_equal 4
        m2.at "Count Distinct A" . to_vector . should_equal [1]
        m2.at "Count Distinct B" . to_vector . should_equal [1]
        m2.at "Count Distinct C" . to_vector . should_equal [2]
        # The NULL is ignored, and not counted towards the total
        m2.at "Count Distinct D" . to_vector . should_equal [2]
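For readers who do not use Enso, the counts this test expects can be modelled in a few lines of plain Python. This is only a sketch of the expected semantics, not the Enso or Snowflake implementation: `count_distinct` is a hypothetical helper, and `None` stands in for Enso's `Nothing`.

```python
from typing import Optional, Sequence


def count_distinct(values: Sequence[Optional[bool]], ignore_nothing: bool = False) -> int:
    """Hypothetical model of the expected counts; None plays the role of Nothing."""
    if ignore_nothing:
        # Drop missing values before counting, as ignore_nothing=True does in the test.
        values = [v for v in values if v is not None]
    # Otherwise a missing value counts as one distinct value of its own.
    return len(set(values))


# Same data as the table built in the test above.
columns = {
    "A": [True, True, True],
    "B": [False, False, False],
    "C": [True, False, True],
    "D": [None, False, True],
}

with_nulls = {name: count_distinct(vals) for name, vals in columns.items()}
without_nulls = {
    name: count_distinct(vals, ignore_nothing=True) for name, vals in columns.items()
}
```

Only column `D` differs between the two modes: it counts 3 distinct values when the missing value is included and 2 when it is ignored.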
radeusgd added a commit that referenced this issue on Jul 19, 2024
Additionally a test in |
Closed
radeusgd added a commit that referenced this issue on Jul 22, 2024
mergify bot pushed a commit that referenced this issue on Jul 23, 2024
- Closes #9486
- All tests are succeeding or marked pending
- Created follow up tickets for things that still need to be addressed, including:
  - Fixing upload / table update #10609
  - Fixing `Count_Distinct` on Boolean columns #10611
- Running the tests on CI is not part of this PR - to be addressed separately
Currently, when trying to perform a `Count_Distinct` aggregate on a Boolean column in the Snowflake backend, it fails with:

Apparently the 'special null replacement value' that we use does not play well with the Boolean type.
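The issue shows neither the failing query nor the replacement value, so the following is only a hypothetical model of the general pitfall, not the actual implementation. SQL's `COUNT(DISTINCT ...)` skips NULLs, so counting NULL as a value of its own typically means substituting a sentinel first. A Boolean column has only two non-null values, so no sentinel of the same type is free. The helper `count_distinct_with_sentinel` and the marker string are invented for illustration.

```python
from typing import Hashable, Optional, Sequence


def count_distinct_with_sentinel(values: Sequence[Optional[bool]], sentinel: Hashable) -> int:
    """Hypothetical model: replace missing values with a sentinel, then count distinct."""
    replaced = [sentinel if v is None else v for v in values]
    return len(set(replaced))


column_d = [None, False, True]

# A Boolean sentinel collides with a real value, so the missing value is undercounted.
in_type = count_distinct_with_sentinel(column_d, sentinel=False)

# A sentinel outside the Boolean domain counts correctly in this untyped model.
# Assumption: a typed SQL engine may instead reject the mixed types or try to coerce them.
out_of_type = count_distinct_with_sentinel(column_d, sentinel="<hypothetical-null-marker>")
```

In this model the in-type sentinel yields 2 instead of the expected 3, while the out-of-type sentinel yields 3 but relies on mixing types, which is the kind of operation a typed backend can refuse.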