-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mark appropriate window functions as nullable: false in logical plan #7638
Conversation
…ne experession and thus column nullablity when constructing window plan
@@ -114,19 +114,35 @@ pub enum BuiltInWindowFunction { | |||
|
|||
impl BuiltInWindowFunction { | |||
fn name(&self) -> &str { | |||
use BuiltInWindowFunction::*; | |||
use BuiltInWindowFunction as F; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flyby fix. Using wildcard here means that if a variant were deleted from the enum, the bottom-most variant in the match statement would become a variable catch-all.
use BuiltInWindowFunction as F; | ||
match self { | ||
F::RowNumber | F::Ntile | F::Rank | F::CumeDist => false, | ||
// the rest are assumed to be nullable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if there is a way that we can derive this from the definitions of the physical plans or elsewhere? Kinda sucks to have this information hardcoded this information in two places (right now, it exists also in the field
method definition on the BuildInWindowFunctionExpr
trait (example)
@@ -3315,3 +3315,23 @@ SELECT | |||
window1 AS (ORDER BY C3) | |||
ORDER BY C3 | |||
LIMIT 5 | |||
|
|||
# Create table with window functions that have nullable: false columns should result in correct schema during cross join |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these tests fail on main
but pass on this branch as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I couldn't include a test for the NTILE
window function because it failed. I filed a separate bug here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that starting with version 47.0.0 of Arrow, the tests referenced will no longer fail (without the fixes provided in this PR) because Arrow has stopped checking for schema nullability. For more context, visit issue comment.
Therefore, I plan to introduce tests that explicitly check for nullable: false in the logical schema where it is expected, such as in columns created via row_number(), rank(), and similar functions. I will address this tonight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution @matthewgapp -- I started to look at this PR today, and it looks in general good to me, but I want to look into a few more things and I ran out of time; I plan to review it more carefully tomorrow
Related comment: #7636 (comment) |
See #7636 (comment) |
Marking as draft as I think #7694 may be fixing the root cause instead |
Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days. |
add nullability information to window function that is used to determine expression and thus column nullability when constructing window plan
Which issue does this PR close?
Closes #7636
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?