Add support for Parquet files similar to CSV/Arrow #3589
Comments
Here's some inspiration: michael-simons/neo4j-load-parquet@0d87e79. Feel free to copy what you need. I used the least invasive library I could find for reading Parquet in Java. If you use the default Avro, you get Hadoop, Spark, a banana, the monkey holding the banana and a bit of the jungle where it lives… Ping me if you have questions. @conker84 this was done for a PoC of how fast we can get data into the database.
@michael-simons thank you so much!
Sorry, I had forgotten to create the issue for this.
Parquet files already carry schema information, which should help smooth the import.
We should also add support for predicate pushdown and column selection to minimize the amount of data loaded from the Parquet infrastructure.
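To make the last point concrete, here is a minimal sketch of what predicate pushdown and column selection buy us. It does not use any real Parquet library or APOC API: `RowGroup`, `scan` and `sampleFile` are hypothetical names, and the in-memory row groups with min/max statistics only stand in for what a Parquet file footer provides. The assumption is a single `>=` filter on a numeric column.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ParquetPushdownSketch {

    // Hypothetical stand-in for a Parquet row group: column-oriented values
    // plus min/max statistics for one numeric column.
    static final class RowGroup {
        final Map<String, List<?>> columns;
        final String statsColumn;
        final long min;
        final long max;

        RowGroup(Map<String, List<?>> columns, String statsColumn, long min, long max) {
            this.columns = columns;
            this.statsColumn = statsColumn;
            this.min = min;
            this.max = max;
        }
    }

    // Returns the rows where filterColumn >= threshold, keeping only the projected columns.
    static List<Map<String, Object>> scan(List<RowGroup> file, List<String> projection,
                                          String filterColumn, long threshold) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (RowGroup group : file) {
            // Predicate pushdown: if the statistics rule out any match, skip the whole row group.
            if (group.statsColumn.equals(filterColumn) && group.max < threshold) {
                continue;
            }
            List<?> filterValues = group.columns.get(filterColumn);
            for (int i = 0; i < filterValues.size(); i++) {
                if (((Number) filterValues.get(i)).longValue() < threshold) {
                    continue;
                }
                // Column selection: only the requested columns are materialized.
                Map<String, Object> row = new LinkedHashMap<>();
                for (String column : projection) {
                    row.put(column, group.columns.get(column).get(i));
                }
                result.add(row);
            }
        }
        return result;
    }

    // Two small row groups with made-up data, for illustration only.
    static List<RowGroup> sampleFile() {
        return List.of(
            new RowGroup(Map.of(
                "name", List.of("a", "b"),
                "age", List.of(20L, 25L),
                "city", List.of("x", "y")), "age", 20, 25),
            new RowGroup(Map.of(
                "name", List.of("c", "d"),
                "age", List.of(28L, 41L),
                "city", List.of("z", "w")), "age", 28, 41));
    }

    public static void main(String[] args) {
        // The first row group is skipped via its statistics (max 25 < 30).
        System.out.println(scan(sampleFile(), List.of("name"), "age", 30)); // prints [{name=d}]
    }
}
```

In a real reader the skipped row group would never be fetched or decoded, and the unselected `city` column would never be read, which is where the savings come from.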