Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spark: Fix stats in rewrite metadata action #5691

Merged
merged 3 commits into from
Sep 2, 2022

Conversation

rdblue
Copy link
Contributor

@rdblue rdblue commented Sep 1, 2022

This is an update to #5665. There was discussion on that PR about how to project the partition tuple to the correct final type, and this PR implements the update by passing the desired data file type into StructDataFile. I think this is the cleanest way to fix the problem.

Because StructDataFile does not implement StructLike and is instead passed into a ManifestWriter that uses the DataFile API, the projection can't happen in writeManifest. Inside StructDataFile, only the partition tuple's type can differ, so only that field needs to be projected.

@github-actions github-actions bot added the spark label Sep 1, 2022
@rdblue
Copy link
Contributor Author

rdblue commented Sep 1, 2022

@Fokko, I took a look at #5665 and got it working. I'm opening this so that you have the option to either cherry-pick my changes into your PR or just go ahead and merge this.

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great, much cleaner than my solution. Thanks for picking this up @rdblue

@rdblue
Copy link
Contributor Author

rdblue commented Sep 2, 2022

Thanks for the review, @Fokko! And nice work finding the problem.

@rdblue rdblue added this to the Iceberg 0.14.1 Release milestone Sep 2, 2022
rdblue added a commit to rdblue/iceberg that referenced this pull request Sep 2, 2022
* Core: Don't show dropped fields from the partition spec

* Use projection instead

* Use StructProjection in SparkDataFile.

Co-authored-by: Fokko Driesprong <[email protected]>
rdblue added a commit to rdblue/iceberg that referenced this pull request Sep 2, 2022
* Core: Don't show dropped fields from the partition spec

* Use projection instead

* Use StructProjection in SparkDataFile.

Co-authored-by: Fokko Driesprong <[email protected]>
rdblue added a commit to rdblue/iceberg that referenced this pull request Sep 2, 2022
* Core: Don't show dropped fields from the partition spec

* Use projection instead

* Use StructProjection in SparkDataFile.

Co-authored-by: Fokko Driesprong <[email protected]>
rdblue added a commit to rdblue/iceberg that referenced this pull request Sep 2, 2022
* Core: Don't show dropped fields from the partition spec

* Use projection instead

* Use StructProjection in SparkDataFile.

Co-authored-by: Fokko Driesprong <[email protected]>
rdblue added a commit to rdblue/iceberg that referenced this pull request Sep 2, 2022
* Core: Don't show dropped fields from the partition spec

* Use projection instead

* Use StructProjection in SparkDataFile.

Co-authored-by: Fokko Driesprong <[email protected]>
rdblue added a commit to rdblue/iceberg that referenced this pull request Sep 2, 2022
* Core: Don't show dropped fields from the partition spec

* Use projection instead

* Use StructProjection in SparkDataFile.

Co-authored-by: Fokko Driesprong <[email protected]>
rdblue added a commit that referenced this pull request Sep 3, 2022
* Core: Don't show dropped fields from the partition spec

* Use projection instead

* Use StructProjection in SparkDataFile.

Co-authored-by: Fokko Driesprong <[email protected]>
sunchao pushed a commit to sunchao/iceberg that referenced this pull request May 9, 2023
sunchao pushed a commit to sunchao/iceberg that referenced this pull request May 10, 2023
* Core: Don't show dropped fields from the partition spec

* Use projection instead

* Use StructProjection in SparkDataFile.

Co-authored-by: Fokko Driesprong <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants