-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Evaluate expressions after type coercion #3444
Conversation
Thanks @Dandandan. This looks like a great improvement. |
Codecov Report
@@ Coverage Diff @@
## master #3444 +/- ##
==========================================
+ Coverage 85.68% 85.69% +0.01%
==========================================
Files 298 298
Lines 54645 54667 +22
==========================================
+ Hits 46820 46846 +26
+ Misses 7825 7821 -4
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
"+--------------+-------------------------+-------------------------+-------------------------+", | ||
"| 1.5 | 2.5 | 3.5 | 2.5 |", | ||
"+--------------+-------------------------+-------------------------+-------------------------+", | ||
"+--------------+---------------------------+---------------------------+---------------------------+", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just have the comments about the header of the expr.
The input sql is AGG(C1) + 1
, 1
is the int64 data type, but the header is convert to float after casted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have the method to make the header consistent, and it can be changed with the changes of the optimizer plan.
cc @andygrove @alamb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a concern I also have for a longer time and had a PR open once.
One approach would be to add an alias for every unnamed expression based on the original query SQL or expression.
This would avoid having the column names changed by the optimizers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the idea of adding an alias once (maybe as the initial optimizer pass?)
I am not sure how valuable adding the types in the column names is in general, to be honest. I wouldn't mind if rather than Int(1)
this was simply rendered 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the idea of adding an alias once (maybe as the initial optimizer pass?)
I am not sure how valuable adding the types in the column names is in general, to be honest. I wouldn't mind if rather than
Int(1)
this was simply rendered1
Do you have plan or a draft pr for that? @Dandandan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps @Dandandan was referring to #280 / #279
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes indeed, we can give those a second life 🎉
I had some concerns with the PR, but I believe it is still a big improvement over the current state of things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it looks like a good improvement to me. Perhaps we can file a follow on ticket for the column renaming?
"+--------------+-------------------------+-------------------------+-------------------------+", | ||
"| 1.5 | 2.5 | 3.5 | 2.5 |", | ||
"+--------------+-------------------------+-------------------------+-------------------------+", | ||
"+--------------+---------------------------+---------------------------+---------------------------+", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the idea of adding an alias once (maybe as the initial optimizer pass?)
I am not sure how valuable adding the types in the column names is in general, to be honest. I wouldn't mind if rather than Int(1)
this was simply rendered 1
@@ -653,7 +653,7 @@ order by | |||
let expected = "\ | |||
Sort: #revenue DESC NULLS FIRST\ | |||
\n Projection: #customer.c_custkey, #customer.c_name, #SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount) AS revenue, #customer.c_acctbal, #nation.n_name, #customer.c_address, #customer.c_phone, #customer.c_comment\ | |||
\n Aggregate: groupBy=[[#customer.c_custkey, #customer.c_name, #customer.c_acctbal, #customer.c_phone, #nation.n_name, #customer.c_address, #customer.c_comment]], aggr=[[SUM(#lineitem.l_extendedprice * CAST(Int64(1) AS Float64) - #lineitem.l_discount)]]\ | |||
\n Aggregate: groupBy=[[#customer.c_custkey, #customer.c_name, #customer.c_acctbal, #customer.c_phone, #nation.n_name, #customer.c_address, #customer.c_comment]], aggr=[[SUM(#lineitem.l_extendedprice * Float64(1) - #lineitem.l_discount)]]\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very nice
This PR appears to have some conflicts now |
@Dandandan This now needs a rebase |
Benchmark runs are scheduled for baseline = 97b3a4b and contender = f48a997. f48a997 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #3431
Rationale for this change
See issue
What changes are included in this PR?
Are there any user-facing changes?