Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: skip EliminateCrossJoin rule if inner join with filter is found #7529

Merged
merged 2 commits into from
Sep 13, 2023

Conversation

epsio-banay
Copy link
Contributor

@epsio-banay epsio-banay commented Sep 12, 2023

Which issue does this PR close?

Closes #7530

Rationale for this change

Explained in issue

What changes are included in this PR?

EliminateCrossJoin optimization rule should be skipped for InnerJoin with filter, this PR fixes a bug were nested InnerJoins with filter do not cause the rule to be skipped.

Are these changes tested?

yes see eliminate_cross_not_possible_nested_inner_join_with_filter

Are there any user-facing changes?

There are no API changes but query results might differ.

@github-actions github-actions bot added the optimizer Optimizer rules label Sep 12, 2023
@epsio-banay epsio-banay force-pushed the eliminate-cross-join-skip-on-inner-join branch from 3ff466a to 542c10d Compare September 12, 2023 14:42
@epsio-banay epsio-banay marked this pull request as draft September 12, 2023 14:50
@epsio-banay epsio-banay marked this pull request as ready for review September 12, 2023 15:44
@epsio-banay epsio-banay marked this pull request as draft September 12, 2023 16:06
@epsio-banay epsio-banay force-pushed the eliminate-cross-join-skip-on-inner-join branch from 542c10d to 9e1fb13 Compare September 13, 2023 13:26
@epsio-banay epsio-banay force-pushed the eliminate-cross-join-skip-on-inner-join branch from 9e1fb13 to aa15cfb Compare September 13, 2023 13:29
@epsio-banay epsio-banay marked this pull request as ready for review September 13, 2023 14:20
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution @epsio-banay -- @jackwener or @liukun4515 would you have time to review this PR?

@jackwener jackwener self-requested a review September 13, 2023 16:17
Copy link
Member

@jackwener jackwener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @epsio-banay , good catch!

LGTM.

BTW, I'm thinking about this whole method. I think we don't need differentiate between handling equal conditions and non-equal conditions, because multi join is a Query Graph and both equal conditions and non-equal conditions are edges in Graph. But I also think this is a longer-term problem. I prepare to do this job in 1-2 weeks.

@jackwener jackwener changed the title Skip EliminateCrossJoin rule if inner join with filter is found - che… fix: skip EliminateCrossJoin rule if inner join with filter is found Sep 13, 2023
@alamb
Copy link
Contributor

alamb commented Sep 13, 2023

because multi join is a Query Graph and both equal conditions and non-equal conditions are edges in Graph. But I also think this is a longer-term problem. I prepare to do this job in 1-2 weeks.

Representing the joins in a query as a JoinGraph would be very interesting. I wonder if you have an idea of where would it be done?

It could be as a pass prior to the LogicalPlan:

sql text --> Sql AST --> JoinGraph --> LogicalPlan

or you could do it as part of an optimization pass, like this:

sql text --> Sql AST  --> LogicalPlan --(optimize with JoinGraph) --> LogicalPlan

Or something else I haven't thought of :)

@alamb alamb merged commit a758270 into apache:main Sep 13, 2023
@alamb
Copy link
Contributor

alamb commented Sep 13, 2023

Thanks @epsio-banay and @jackwener

@jackwener
Copy link
Member

#7549 to trace it

@jackwener
Copy link
Member

jackwener commented Sep 13, 2023

Representing the joins in a query as a JoinGraph would be very interesting. I wonder if you have an idea of where would it be done?

Because we just want to rearrange Inner/Outer join, Use MultiJoin is enough.

          MultiJoin
       /    |    |    \
child   child   child   child .....

Collect All Inner/Cross Join into a multijoin, and then split it to become multiple join


If you are interested in JoinGraph, there is a good example in MySQL-8.0. It use JoinGraph to present Join Tree, and use it to finish JoinReorder with a pretty good algorithm DPHyp.

https://github.com/mysql/mysql-server/blob/trunk/sql/join_optimizer/hypergraph.h

its process is

SQL --> LogicalPlan --(optimize with JoinGraph) --> LogicalPlan

@alamb
Copy link
Contributor

alamb commented Sep 14, 2023

🤔 would a multi-join work for a query like

SELECT ..
  FROM A JOIN B ON (a.x=b.x)
     JOIN C ON (a.y = c.y AND b.z = c.z)

A join graph might look like this

(A) -- y -- (B)
   \       /
    x     z
     \   /
      (C)

Would a multi-join look like

JOIN (a.x=b.x, a.y=c.y, b.z=c.z)
  (A, B, C)

I guess then any reordering could happen as a translation from Multi-join to binary join 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimizer Optimizer rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect results for nested joins after applying EliminateCrossJoin rule
3 participants