fix: correct count(*) alias #7081

Merged
merged 6 commits into from
Jul 30, 2023

@jackwener jackwener commented Jul 25, 2023

Which issue does this PR close?

Closes #6447.

Rationale for this change

What changes are included in this PR?

After replacing COUNT(*), we should add an alias to keep the name unchanged.
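
As a rough illustration of the idea, here is a minimal sketch of the rewrite on a toy expression tree. The `Expr` enum, `display_name`, and `rewrite_count_star` are hypothetical names made up for this example and are not DataFusion's actual types or API; only the `COUNT_STAR` constant and the `COUNT(UInt8(1)) AS COUNT(*)` shape come from this PR's diff and plans.

```rust
/// Output name that the alias restores (same value as the constant in this PR's diff).
pub const COUNT_STAR: &str = "COUNT(*)";

/// Toy expression tree. Hypothetical: this is NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Wildcard,
    LiteralU8(u8),
    Count(Box<Expr>),
    Alias(Box<Expr>, String),
}

impl Expr {
    /// Column name this expression would produce in an output schema.
    pub fn display_name(&self) -> String {
        match self {
            Expr::Wildcard => "*".to_string(),
            Expr::LiteralU8(v) => format!("UInt8({v})"),
            Expr::Count(arg) => format!("COUNT({})", arg.display_name()),
            // An alias hides the inner expression's name.
            Expr::Alias(_, name) => name.clone(),
        }
    }
}

/// Rewrite `COUNT(*)` into `COUNT(UInt8(1)) AS "COUNT(*)"`.
/// Without the alias, the output column would be renamed to `COUNT(UInt8(1))`.
/// Only the top-level expression is inspected in this sketch.
pub fn rewrite_count_star(expr: Expr) -> Expr {
    match expr {
        Expr::Count(arg) if *arg == Expr::Wildcard => Expr::Alias(
            Box::new(Expr::Count(Box::new(Expr::LiteralU8(1)))),
            COUNT_STAR.to_string(),
        ),
        other => other,
    }
}
```

Calling `rewrite_count_star(Expr::Count(Box::new(Expr::Wildcard)))` in this sketch yields an aliased expression whose `display_name()` is still `COUNT(*)`, so anything that refers to the column by name keeps resolving after the rewrite.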

Are these changes tested?

Yes

Are there any user-facing changes?

@github-actions github-actions bot added optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Jul 25, 2023
Comment on lines 849 to 852
------Projection: COUNT(*) + Int64(2) AS cnt_plus_2, t2.t2_int
--------Filter: COUNT(*) = Int64(0)
----------Aggregate: groupBy=[[t2.t2_int]], aggr=[[COUNT(UInt8(1)) AS COUNT(*)]]
------------TableScan: t2 projection=[t2_int]
Member Author

@jackwener jackwener Jul 25, 2023

cc @mingmwang
It looks like the old plan is wrong?
I'm not sure whether the new plan is right.

Contributor

If you are referring to COUNT(*) I think it should always be 0

@jackwener jackwener force-pushed the count_name branch 2 times, most recently from 162b178 to 4501968 Compare July 25, 2023 11:47
@jackwener jackwener marked this pull request as ready for review July 25, 2023 11:47
@jackwener jackwener force-pushed the count_name branch 3 times, most recently from 66a006f to 4994970 Compare July 26, 2023 10:49
Contributor

@alamb alamb left a comment

This looks great @jackwener except for the subquery plan changes.

Perhaps this code needs some adjustment: https://github.com/apache/arrow-datafusion/blob/main/datafusion/optimizer/src/decorrelate.rs#L371-L406

"+--------------+--------------+-----------------+",
"| 10 | 110 | 20 |",
"+--------------+--------------+-----------------+",
"+--------------+--------------+----------+",
Contributor

that is certainly much nicer ❤️

@jackwener
Member Author

> This looks great @jackwener except for the subquery plan changes.

Agree with it.

> Perhaps this code needs some adjustment: https://github.com/apache/arrow-datafusion/blob/main/datafusion/optimizer/src/decorrelate.rs#L371-L406

Thanks!

@jackwener jackwener marked this pull request as draft July 27, 2023 04:32
@jackwener jackwener marked this pull request as ready for review July 30, 2023 08:55
@apache apache deleted a comment from alamb Jul 30, 2023
@jackwener jackwener requested a review from alamb July 30, 2023 08:56
@jackwener
Member Author

cc @jiangzhx

Contributor

@alamb alamb left a comment

This looks awesome @jackwener -- thank you.

I think it is the mark of an excellent engineer when bugs are fixed by removing code. Very well done 🏆

Contributor

The plans in this file look much better

use std::sync::Arc;

use crate::analyzer::AnalyzerRule;

pub const COUNT_STAR: &str = "COUNT(*)";
Contributor

🎉

@alamb alamb merged commit 01d7dba into apache:main Jul 30, 2023
@jackwener jackwener deleted the count_name branch July 30, 2023 12:14
@jiangzhx
Contributor

> cc @jiangzhx

Thanks for your help, @jackwener. Great job! I'm your big fan!!

@alamb
Contributor

alamb commented Jul 31, 2023

> Great job! I'm your big fan!!

Me too!

Labels
core Core DataFusion crate optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt)
Development

Successfully merging this pull request may close these issues.

Cannot have column named "COUNT(*)"
3 participants