Fixes #3589: Add support for Parquet files similar to CSV/Arrow #3711
Fixes #3589
This PR adds support for Apache Parquet export/import/load.

Added 4 export procedures that stream a list of `byte[]`, one per batch:
- `apoc.export.parquet.all.stream`
- `apoc.export.parquet.graph.stream`
- `apoc.export.parquet.query.stream`
- `apoc.export.parquet.data.stream`
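A minimal sketch of calling one of the streaming procedures. The `batchSize` config key and the `value` YIELD column are assumptions modeled on the analogous Arrow streaming procedures, not confirmed by this PR:

```cypher
// Stream the whole graph as Parquet; each returned row is assumed to
// carry one byte[] batch in `value` (mirroring the Arrow stream procedures).
CALL apoc.export.parquet.all.stream({batchSize: 10000})
YIELD value
RETURN value
```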
Added 4 export procedures which create a Parquet file and return a `ProgressInfo` result, like the CSV ones:
- `apoc.export.parquet.all`
- `apoc.export.parquet.graph`
- `apoc.export.parquet.query`
- `apoc.export.parquet.data`
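A hedged example of the file-based variant. The `(query, file, config)` signature and the YIELD columns are assumptions carried over from the CSV exporters that this PR says it mirrors:

```cypher
// Export a query result to a Parquet file; the YIELD columns shown here
// (file, nodes, relationships, properties) are assumed from the CSV ProgressInfo.
CALL apoc.export.parquet.query(
  'MATCH (p:Person) RETURN p.name AS name, p.born AS born',
  'people.parquet',
  {})
YIELD file, nodes, relationships, properties
RETURN file, nodes, relationships, properties
```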
Added one load procedure, `apoc.load.parquet`, that reads a Parquet `byte[]` or a Parquet file and returns a map for each row.

Added one import procedure, `apoc.import.parquet`, that imports data from a Parquet `byte[]` or a Parquet file.

In order to load/import complex data types not recognized by Parquet, like `Duration`, `Point`, lists of `Duration`, etc. (these are stringified during export), we can use the `mapping: {keyToConvert: valueTypeName}` config to convert them back. For example, `apoc.import.parquet(fileName, {mapping: {foo: "DurationArray"}})` converts the key `foo` to a list of `Duration`.

Created a follow-up card to create doc files and any other additions.
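A sketch of loading rows back with the mapping config. The property key `foo` is the hypothetical key from the example above, and the `value` YIELD column is an assumption based on the other `apoc.load.*` procedures:

```cypher
// Read rows from the Parquet file; stringified Duration values stored under
// the hypothetical key `foo` are converted back to a list of Duration
// via the mapping config described above.
CALL apoc.load.parquet('people.parquet', {mapping: {foo: "DurationArray"}})
YIELD value
RETURN value.name, value.foo
```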