PARQUET-2478: Update README with link to parquet website #1355

Merged
merged 1 commit into from
May 22, 2024

Conversation

@alamb (Contributor) commented May 19, 2024

You can see the rendered version of this README here:

https://github.com/alamb/parquet-mr/tree/alamb/website_link?tab=readme-ov-file#parquet-mr-

Jira

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason: No code is changed

Commits

There is a single commit with a self-explanatory description.

  • My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines
    from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Style

There are no code changes

  • My contribution adheres to the code style guidelines and Spotless passes.
    • To apply the necessary changes, run mvn spotless:apply -Pvector-plugins

Documentation

This PR has no Java code changes, only Markdown.

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and classes in the PR contain Javadoc that explains what they do

Parquet uses the [record shredding and assembly algorithm](https://github.com/julienledem/redelm/wiki/The-striping-and-assembly-algorithms-from-the-Dremel-paper) described in the Dremel paper to represent nested structures.
This repository contains a Java implementation of [Apache Parquet](https://parquet.apache.org/)

Apache Parquet is an open source, column-oriented data file format

Contributor Author

This is the same wording from apache/parquet-site#59

Update the introductory content to reduce confusion about parquet in general.
@alamb force-pushed the alamb/website_link branch from a1b68d4 to 9412061 on May 19, 2024, 10:31
Parquet-MR contains the java implementation of the [Parquet format](https://github.com/apache/parquet-format).
Parquet is a columnar storage format for Hadoop; it provides efficient storage and encoding of data.
Parquet uses the [record shredding and assembly algorithm](https://github.com/julienledem/redelm/wiki/The-striping-and-assembly-algorithms-from-the-Dremel-paper) described in the Dremel paper to represent nested structures.
This repository contains a Java implementation of [Apache Parquet](https://parquet.apache.org/)

@vinooganesh (Contributor), May 19, 2024

I think "a" makes more sense here too, especially given the discussion around what constitutes a reference implementation

@vinooganesh (Contributor) left a comment

Looks good to me!

@wgtmac (Member) left a comment

LGTM. Thanks!

@julienledem (Member) left a comment

Lgtm!

@wgtmac merged commit 6809a18 into apache:master on May 22, 2024
9 checks passed
4 participants