HashJoin order fixing #7155
👍
@@ -757,7 +757,7 @@ pub fn build_equal_condition_join_indices(
     let mut build_indices = UInt64BufferBuilder::new(0);
     let mut probe_indices = UInt32BufferBuilder::new(0);
     // Visit all of the probe rows
-    for (row, hash_value) in hash_values.iter().enumerate() {
+    for (row, hash_value) in hash_values.iter().enumerate().rev() {
Why does this need to be reversed as well?
Ah, because otherwise the order is wrong. I get it. Could we add some comments for both changes?
Sure, I am on it.
Sorry, a question about issue #7113:
I don't understand why this is a bug.
Does the hash join produce an incorrect result?
It seems there was no test checking the ordering of the hash join results (only that the matching rows are correct). @metesynnada changed one to test this. This is important if the execution plan depends on the output ordering (and doesn't add an extra sort).
The matching rows themselves are correct even before this PR.
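To illustrate the point above, here is a minimal sketch of how a chained hash map can return the right matches in the wrong order, and how reversing the probe loop and then reversing the collected indices restores the input order. `ChainedMap`, `build`, `join_indices`, and the `fix_order` flag are hypothetical names for this illustration, not the actual DataFusion types, and the final reversal of both index vectors is an assumption about how the two changes in this PR fit together.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified chained hash map (not the DataFusion structure).
/// `head[key]` stores the most recently inserted build row + 1, and
/// `next[row]` stores the previous build row with the same key + 1 (0 ends the chain).
struct ChainedMap {
    head: HashMap<u64, u64>,
    next: Vec<u64>,
}

fn build(build_keys: &[u64]) -> ChainedMap {
    let mut head = HashMap::new();
    let mut next = vec![0u64; build_keys.len()];
    for (row, key) in build_keys.iter().enumerate() {
        // The new row becomes the chain head; the old head becomes its successor.
        if let Some(prev) = head.insert(*key, row as u64 + 1) {
            next[row] = prev;
        }
    }
    ChainedMap { head, next }
}

/// Returns (build_indices, probe_indices) for all equal-key pairs.
/// With `fix_order == true`, probe rows are visited in reverse and both
/// outputs are reversed at the end (assumed shape of the fix).
fn join_indices(
    map: &ChainedMap,
    probe_keys: &[u64],
    fix_order: bool,
) -> (Vec<u64>, Vec<u32>) {
    let mut build_indices = Vec::new();
    let mut probe_indices = Vec::new();
    let rows: Vec<usize> = if fix_order {
        (0..probe_keys.len()).rev().collect()
    } else {
        (0..probe_keys.len()).collect()
    };
    for row in rows {
        // Walking the chain yields build rows in descending (last-in, first-out) order.
        let mut cursor = map.head.get(&probe_keys[row]).copied().unwrap_or(0);
        while cursor != 0 {
            build_indices.push(cursor - 1);
            probe_indices.push(row as u32);
            cursor = map.next[(cursor - 1) as usize];
        }
    }
    if fix_order {
        // Descending probe rows and descending build rows, reversed together,
        // become ascending probe rows with ascending build rows inside each.
        build_indices.reverse();
        probe_indices.reverse();
    }
    (build_indices, probe_indices)
}
```

With build keys `[1, 1, 2]` and probe keys `[1, 2, 1]`, both variants produce the same set of matching pairs. Only the fixed variant lists the build rows in ascending order within each probe row.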
Please don't merge this PR yet; I have some questions about this fix.
👌 excellent
@liukun4515 Is there anything I can clarify for you?
@metesynnada and I discussed this in detail and this PR LGTM as well. @liukun4515, do you still have any questions or worries?
Thanks @metesynnada and @ozankabak for the extra review. I felt certain that this is the correct change, so I went ahead with the merge. @liukun4515, let us know if you have any worries.
In issue #7113, @metesynnada describes the issue.
But I reviewed the code of From the implementation of
In the JoinType::Inner match arm, we also preserve the left ordering.

    JoinType::Inner => {
        // We modify the indices of the right order columns because their
        // columns are appended to the right side of the left schema.
        let mut adjusted_right_order =
            adjust_right_order(right_order, left_len)?;
        if let Some(left_order) = maybe_left_order {
            adjusted_right_order.extend_from_slice(left_order);
        }
        Some(adjusted_right_order)
    }

The function has since been changed into:

    (false, true) => {
        // Special case, we can prefix ordering of left side with the ordering of right side.
        if join_type == JoinType::Inner && probe_side == Some(JoinSide::Right) {
            merge_vectors(&right_ordering, left_ordering)
        } else {
            right_ordering
        }
    }
I know about this change. If we don't consider the special case,
can we still get the right result?
Yes, the result will be correct, but it would be sub-optimal. Consider the hash join where the left table ordering is
Thanks for your explanation.
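The special case discussed above can be sketched as follows. This is a simplified illustration, not the DataFusion implementation: orderings are modelled as lists of strings, the `JoinType` and `JoinSide` enums are cut-down stand-ins, `output_ordering` is a hypothetical wrapper name, and `merge_vectors` is assumed to simply concatenate its two arguments. The column names `a` and `b` are made up for the example.

```rust
#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq)]
enum JoinType {
    Inner,
    Left,
}

#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq)]
enum JoinSide {
    Left,
    Right,
}

/// Assumed behaviour: concatenate `first` followed by `second`.
fn merge_vectors(first: &[String], second: &[String]) -> Vec<String> {
    first.iter().chain(second.iter()).cloned().collect()
}

/// Output ordering when only the right side's ordering is maintained
/// (the `(false, true)` arm quoted in the discussion).
fn output_ordering(
    join_type: JoinType,
    probe_side: Option<JoinSide>,
    left_ordering: &[String],
    right_ordering: &[String],
) -> Vec<String> {
    if join_type == JoinType::Inner && probe_side == Some(JoinSide::Right) {
        // Special case: the right ordering is kept, and ties on it are
        // broken by the left ordering, so the left ordering is appended.
        merge_vectors(right_ordering, left_ordering)
    } else {
        // General case: only the right ordering can be reported.
        right_ordering.to_vec()
    }
}
```

Dropping the special case would still report a valid ordering (the right side's alone), which matches the answer in the thread: the result stays correct, but a plan that needs the longer ordering might then have to add a sort.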
Which issue does this PR close?
Closes #7113.
Rationale for this change
Fixing the HashJoin order.
What changes are included in this PR?
Reversing indices.
Are these changes tested?
Yes, with existing tests.
Are there any user-facing changes?
No