Avoid adding redundant SubqueryAlias. #4412

Closed
wants to merge 3 commits into from


jackwener
Member

Which issue does this PR close?

Closes #4383.

Rationale for this change

What changes are included in this PR?

Merge nested SubqueryAlias nodes.

Are these changes tested?

Yes, with tests that:

  • merge two aliases.
  • merge three aliases (optimize again after the first merge).
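
The merge these tests describe can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the `Plan` enum, the `scan` and `alias` helpers, and `merge_aliases` are hypothetical stand-ins and do not use DataFusion's `LogicalPlan` API.

```rust
// Toy plan type for illustration only; this is NOT DataFusion's LogicalPlan.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String },
    SubqueryAlias { alias: String, input: Box<Plan> },
}

// Hypothetical helpers that keep the example short.
fn scan(table: &str) -> Plan {
    Plan::Scan { table: table.to_string() }
}

fn alias(name: &str, input: Plan) -> Plan {
    Plan::SubqueryAlias { alias: name.to_string(), input: Box::new(input) }
}

/// Collapse directly nested aliases, keeping only the outermost name.
fn merge_aliases(plan: Plan) -> Plan {
    match plan {
        Plan::SubqueryAlias { alias, input } => match *input {
            // Alias over alias: drop the inner one, then check the result again.
            // Re-checking is what lets a chain of three or more collapse fully.
            Plan::SubqueryAlias { input: inner, .. } => {
                merge_aliases(Plan::SubqueryAlias { alias, input: inner })
            }
            // Alias over anything else: nothing to merge.
            other => Plan::SubqueryAlias { alias, input: Box::new(other) },
        },
        other => other,
    }
}
```

In this sketch, `merge_aliases(alias("c", alias("b", alias("a", scan("t")))))` yields a single alias `c` over the scan of `t`, which corresponds to the three-alias case.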

Are there any user-facing changes?

@github-actions bot added the core (Core DataFusion crate) and optimizer (Optimizer rules) labels Nov 29, 2022
@mingmwang
Contributor

mingmwang commented Nov 29, 2022

I think SubqueryAlias is just a temporary struct in the plan tree for scoping names. The SubqueryAlias should be removed entirely from the plan tree at an early phase of logical planning by modifying the inner plan's qualified names.
Then the rest of the logical rules do not need to deal with SubqueryAlias anymore.
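
One way to picture this suggestion is the sketch below, which removes alias nodes by pushing the alias name into the inner plan's column qualifiers. It is only an illustration under stated assumptions: `Column`, `Plan`, `requalify`, and `remove_aliases` are hypothetical names, and real plans have many more node types and expressions that would also need their column references rewritten.

```rust
// Toy types for illustration only; NOT DataFusion's LogicalPlan or schema types.
#[derive(Debug, Clone, PartialEq)]
struct Column {
    qualifier: String,
    name: String,
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String, schema: Vec<Column> },
    SubqueryAlias { alias: String, input: Box<Plan> },
}

/// Push `alias` into the output column qualifiers of `plan`,
/// dropping any alias nodes encountered on the way down.
fn requalify(plan: Plan, alias: &str) -> Plan {
    match plan {
        Plan::Scan { table, schema } => Plan::Scan {
            table,
            schema: schema
                .into_iter()
                .map(|c| Column { qualifier: alias.to_string(), name: c.name })
                .collect(),
        },
        // An inner alias is hidden by the outer one, so it is simply skipped.
        Plan::SubqueryAlias { input, .. } => requalify(*input, alias),
    }
}

/// Remove a SubqueryAlias at the root by requalifying the plan beneath it.
fn remove_aliases(plan: Plan) -> Plan {
    match plan {
        Plan::SubqueryAlias { alias, input } => requalify(*input, &alias),
        other => other,
    }
}
```

In this toy model, an alias `a` over a scan of `t` with column `t.id` becomes a bare scan whose column is qualified as `a.id`, so later rules never see an alias node.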

@mingmwang
Contributor

In the physical plan, there is no SubqueryAlias either. I think SubqueryAlias can be removed earlier to simplify the other logical rules.

@jackwener
Member Author

I think SubqueryAlias is just a temporary struct in the plan tree for scoping names. The SubqueryAlias should be removed entirely from the plan tree at an early phase of logical planning by modifying the inner plan's qualified names.
Then the rest of the logical rules do not need to deal with SubqueryAlias anymore.

Looks like we can do this in the planner.

When we add a SubqueryAlias, if the child is already a SubqueryAlias, we just need to change its name.

I don't have a strong preference between these two options.
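
The planner-side option can be sketched roughly as below: instead of stacking a second alias node, the constructor reuses the existing node and only replaces its name. This is a hedged illustration; `Plan` and `with_alias` are hypothetical names chosen for the example and are not DataFusion's actual planner or builder API.

```rust
// Toy plan type for illustration only; this is NOT DataFusion's LogicalPlan.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String },
    SubqueryAlias { alias: String, input: Box<Plan> },
}

/// Wrap `input` in an alias, but never stack one alias directly on another.
fn with_alias(input: Plan, alias: &str) -> Plan {
    match input {
        // The child is already an alias: keep its input and just change the name.
        Plan::SubqueryAlias { input: child, .. } => Plan::SubqueryAlias {
            alias: alias.to_string(),
            input: child,
        },
        // Any other child gets a fresh alias node.
        other => Plan::SubqueryAlias {
            alias: alias.to_string(),
            input: Box::new(other),
        },
    }
}
```

With this shape, calling `with_alias` twice in a row leaves exactly one alias node carrying the most recent name, so no separate optimizer rule is needed for that case.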

@github-actions bot added the logical-expr (Logical plan and expressions) label and removed the optimizer (Optimizer rules) label Nov 29, 2022
@jackwener jackwener changed the title from "add a rule to merge SubqueryAlias." to "Avoid adding redundant SubqueryAlias." Nov 29, 2022
Contributor

@alamb alamb left a comment

This PR seems like an improvement to me -- we can further improve things in follow on PRs -- thanks @jackwener

@jackwener
Member Author

jackwener commented Dec 1, 2022

Please wait before merging it.

I am curious why we produce the redundant alias in the first place. If we apply this optimization, it will be harder to find the reason.

@alamb alamb marked this pull request as draft December 1, 2022 11:30
@alamb
Contributor

alamb commented Dec 1, 2022

Converting to a draft so we don't accidentally merge it

@jackwener
Member Author

More optimizations: #4484 (comment)

I will try to include #4484 (comment).

It comes from @mingmwang's comment in #4484.

@alamb
Contributor

alamb commented Dec 11, 2022

I wonder if this PR is still relevant

@jackwener
Member Author

I wonder if this PR is still relevant

I should close it😂.
This optimization is more complicated than I thought at the beginning. I plan to do a more complete optimization.

@jackwener jackwener closed this Dec 11, 2022
@jackwener jackwener deleted the merge_subquery branch December 11, 2022 14:00
Labels
core Core DataFusion crate logical-expr Logical plan and expressions
Development

Successfully merging this pull request may close these issues.

Add MergeSubqueryAlias rule
3 participants