Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MINOR]: Do not add unnecessary hash repartition to the physical plan #7667

Merged
merged 3 commits into from
Sep 28, 2023
Merged

[MINOR]: Do not add unnecessary hash repartition to the physical plan #7667

merged 3 commits into from
Sep 28, 2023

Conversation

mustafasrepo
Copy link
Contributor

Which issue does this PR close?

Closes #.

Rationale for this change

Sometimes in the physical plan I see RepartitionHash(input_partition=1, output_partition=1). In these cases, hash repartitioning is unnecessary. This PR adds a check for this case.

What changes are included in this PR?

Are these changes tested?

Yes, a new test is added to show that unnecessary hash repartition is not added to the physical plan. Even if operator requires hash repartitioning.

Are there any user-facing changes?

@mustafasrepo mustafasrepo changed the title Do not add unnecessary hash repartition to the physical plan [MINOR]: Do not add unnecessary hash repartition to the physical plan Sep 27, 2023
@github-actions github-actions bot added the core Core DataFusion crate label Sep 27, 2023
Copy link
Contributor

@ozankabak ozankabak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ozankabak
Copy link
Contributor

We did one more review round and added some better commenting as well.

I will go ahead and merge this, we can address any concerns in a follow-on PR should there be any.

@ozankabak ozankabak merged commit 22d03c1 into apache:main Sep 28, 2023
Ted-Jiang pushed a commit to Ted-Jiang/arrow-datafusion that referenced this pull request Oct 7, 2023
…apache#7667)

* Do not add unnecessary hash repartition

* Add new test, move satisfy check to inside add_hash method

* Improve comments

---------

Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants