Ballista: Finish implementing shuffle mechanism [DRAFT] #709
Which issue does this PR close?
Closes #707, though the scope is much larger than that one issue: it turned out that the shuffle mechanism wasn't fully implemented and wasn't really being used.
With this PR, I now see the executors using hash partitioning in the shuffle writes.
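To illustrate the difference between a single-partition shuffle write and a hash-partitioned one, here is a minimal standalone sketch. It does not use Ballista's or DataFusion's actual types: `OutputPartitioning` and `shuffle_write` are hypothetical names invented for this example, and the standard library's `DefaultHasher` stands in for whatever hash function the real shuffle writer uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a shuffle writer's output partitioning
/// (not the real Ballista/DataFusion type).
#[derive(Debug, Clone, Copy)]
pub enum OutputPartitioning {
    /// No repartitioning: every row lands in one output partition.
    Single,
    /// Rows are routed by hashing their key into `n` output partitions.
    Hash(usize),
}

/// Route each key to an output partition according to `partitioning`.
/// Returns one Vec of keys per output partition.
pub fn shuffle_write(keys: &[&str], partitioning: OutputPartitioning) -> Vec<Vec<String>> {
    // Guard against a zero partition count so the modulo below is safe.
    let n = match partitioning {
        OutputPartitioning::Single => 1,
        OutputPartitioning::Hash(n) => n.max(1),
    };
    let mut out: Vec<Vec<String>> = vec![Vec::new(); n];
    for key in keys {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        // Equal keys always hash to the same partition index.
        let idx = (hasher.finish() % n as u64) as usize;
        out[idx].push(key.to_string());
    }
    out
}
```

With `OutputPartitioning::Single`, all keys end up in one output partition, which mirrors the behavior described for the broken shuffle. With `OutputPartitioning::Hash(4)`, the keys are spread across four output partitions, and every occurrence of the same key lands in the same one.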
Other changes:
Remaining work:
Rationale for this change
Shuffles were broken. The executor always ran the shuffle writes with a partitioning of None instead of the hash partitioning they were supposed to use. This information was never sent as part of the protobuf. Queries still returned correct results, but this did not scale because there was always a single partition.
What changes are included in this PR?
Are there any user-facing changes?
No