Adjustment of HashJoinExec APIs to Preserve Probe Side Order #6858

Merged 4 commits on Jul 7, 2023

metesynnada (Contributor)

Which issue does this PR close?

Closes #6857

Rationale for this change

A hash join can, in principle, preserve the order of records on its probe side. This order preservation applies to Inner, RightSemi, and RightAnti joins.

The changes in this pull request take advantage of this property to eliminate unnecessary sort operations. This should improve performance, particularly when the probe side of the hash join is already sorted.
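
To illustrate why the probe side can keep its order, here is a minimal, self-contained sketch of a hash join. It is not the HashJoinExec implementation: the `hash_join` function, the reduced `JoinType` enum, and the `(key, payload)` row shape are all assumptions made for illustration. The build side is hashed in full, and the probe side is then scanned once from front to back, so output rows are emitted in probe order.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced set of join types: only the three named in this PR.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum JoinType {
    Inner,
    RightSemi,
    RightAnti,
}

/// Toy hash join over `(key, payload)` rows (illustrative only).
/// `build` is hashed in full; `probe` is scanned once in input order,
/// so the output follows the order of the probe rows.
pub fn hash_join(
    build: &[(i32, &str)],
    probe: &[(i32, &str)],
    join_type: JoinType,
) -> Vec<String> {
    // Build phase: map each key to the payloads of its build rows.
    let mut table: HashMap<i32, Vec<&str>> = HashMap::new();
    for &(key, payload) in build {
        table.entry(key).or_default().push(payload);
    }

    // Probe phase: rows are visited in probe order and never reordered.
    let mut out = Vec::new();
    for &(key, payload) in probe {
        match (join_type, table.get(&key)) {
            // Inner: one output row per matching build row.
            (JoinType::Inner, Some(rows)) => {
                for b in rows {
                    out.push(format!("{b}|{payload}"));
                }
            }
            // RightSemi: emit the probe row once if any match exists.
            (JoinType::RightSemi, Some(_)) => out.push(payload.to_string()),
            // RightAnti: emit the probe row only if no match exists.
            (JoinType::RightAnti, None) => out.push(payload.to_string()),
            _ => {}
        }
    }
    out
}
```

With a probe input of `[(2, "p1"), (3, "p2"), (1, "p3")]`, each of the three join types emits its surviving rows in the order p1, p2, p3, regardless of how the build side is ordered.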

What changes are included in this PR?

  1. Adjustment of HashJoinExec APIs: We have modified the HashJoinExec operator to preserve the order of its probe side in Inner, RightSemi, and RightAnti joins, thereby eliminating unnecessary sort operations.

  2. Bug Fix on Sort Pushdown Rule: We have fixed a minor bug in the Sort Pushdown rule.
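
As a rough sketch of how an optimizer could use this property, the check below decides whether a sort placed above a join is redundant. The names `preserves_probe_order` and `sort_is_redundant`, the column-name representation of an ordering, and the reduced `JoinType` enum are hypothetical and do not reflect DataFusion's actual API. Join types outside the three listed in this PR are conservatively treated as not order-preserving.

```rust
/// Hypothetical join-type enum for this sketch (not DataFusion's type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum JoinType {
    Inner,
    Left,
    Full,
    RightSemi,
    RightAnti,
}

/// Join types this sketch treats as preserving probe-side order,
/// following the list given in the PR description.
pub fn preserves_probe_order(join_type: JoinType) -> bool {
    matches!(
        join_type,
        JoinType::Inner | JoinType::RightSemi | JoinType::RightAnti
    )
}

/// A sort above the join is redundant when the join keeps probe order
/// and the probe input is already ordered by a prefix-compatible key list.
/// Orderings are modeled as column names, all ascending (an assumption).
pub fn sort_is_redundant(
    join_type: JoinType,
    probe_ordering: &[&str],
    required_ordering: &[&str],
) -> bool {
    preserves_probe_order(join_type) && probe_ordering.starts_with(required_ordering)
}
```

For example, `sort_is_redundant(JoinType::Inner, &["a", "b"], &["a"])` is true, while the same call with `JoinType::Full` is false, so the sort would have to stay in the plan.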

Are these changes tested?

The new tests are included in the join_disable_repartition.slt file.

Are there any user-facing changes?

No.

cc @Dandandan

github-actions bot added the core (Core DataFusion crate) and sqllogictest (SQL Logic Tests (.slt)) labels on Jul 6, 2023
let physical_plan = sort_exec(vec![sort_expr("a", &join.schema())], join);

let expected_input = vec![
"SortExec: expr=[a@2 ASC]",

😎

@mustafasrepo mustafasrepo merged commit 7c25bd0 into apache:main Jul 7, 2023
Linked issue: HashJoinExec can preserve probe side order