Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MINOR] Code refactor on hash join utils #6999

Merged
merged 4 commits into from
Jul 18, 2023
Merged

[MINOR] Code refactor on hash join utils #6999

merged 4 commits into from
Jul 18, 2023

Conversation

metesynnada
Copy link
Contributor

Which issue does this PR close?

Closes #.

Rationale for this change

We are currently in the process of refactoring the join codebase to make it more amenable to adding new kinds of joins while maintaining code reuse. This involves moving some functions in between files to accommodate such new join executors. In this context, this PR makes some structural improvements to the codebase without introducing any substantial new functionality. As part of this reorganization, we are introducing an enumeration to control the required input distribution for SymmetricHashJoinExec.

What changes are included in this PR?

Refactoring the join codebase, focusing on making changes to the utilities and conducting some refactoring on the SymmetricHashJoinExec code.

To reduce ambiguity between the indices calculations between HashJoinExec and SHJ,

  • get_anti_indices -> get_pruning_anti_indices
  • get_semi_indices -> get_pruning_semi_indices

Also, we moved some functionality into functions named

  • build_filter_expression_graph

Are these changes tested?

With existing tests.

Are there any user-facing changes?

No

@github-actions github-actions bot added the core Core DataFusion crate label Jul 17, 2023
Copy link
Contributor

@mustafasrepo mustafasrepo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is PR for moving common code for different join operators to common location. It is LGTM!.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review this code carefully, but moving code around makes sense to me

cc @Dandandan in case you would like to review

@alamb alamb merged commit 58ad10c into apache:main Jul 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants