Native worker not writing Parquet data files for WriterVersion v1 (PARQUET_1_0)
Expected behavior: after running the statement below, Parquet data should be written with format_version 1.
set session hive.parquet_writer_version='PARQUET_1_0';
Current behavior: even with this session property set, Parquet data is written with format_version: 2.6.
presto:reetika_testdb> set session hive.parquet_writer_version='PARQUET_1_0';
SET SESSION
presto:reetika_testdb> create table hive.reetika_testdb.test_insert (id int) with (format = 'Parquet');
CREATE TABLE
presto:reetika_testdb> insert into hive.reetika_testdb.test_insert values(1);
INSERT: 1 row
Sample Output of Parquet File -
############ file meta data ############
created_by: parquet-cpp-velox
num_columns: 1
num_rows: 1
num_row_groups: 1
format_version: 2.6
serialized_size: 146
############ Columns ############
id
############ Column(id) ############
name: id
path: id
max_definition_level: 1
max_repetition_level: 0
physical_type: INT32
logical_type: None
converted_type (legacy): NONE
compression: GZIP (space_saved: -56%)
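For anyone who wants to check a written file without a Parquet tool at hand, here is a minimal stdlib-only sketch that reads the integer `version` field from the file footer. It is not part of Presto or Velox; the names `footer_version` and `_read_zigzag_varint` are made up for illustration. It assumes a plaintext (unencrypted) footer and that `version` is the first field serialized in the Thrift compact-encoded `FileMetaData`. As far as I understand, that integer is 1 for format version 1.0 and 2 for the 2.x versions, which a metadata dump like the one above reports as 2.6.

```python
import struct

MAGIC = b"PAR1"  # magic bytes for a plaintext footer (encrypted footers differ)


def _read_zigzag_varint(buf: bytes, pos: int) -> int:
    """Decode one Thrift compact-protocol zigzag varint starting at buf[pos]."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        result |= (byte & 0x7F) << shift
        pos += 1
        if not byte & 0x80:  # high bit clear: last byte of the varint
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)  # undo zigzag encoding


def footer_version(data: bytes) -> int:
    """Return the integer `version` field from a Parquet file's footer."""
    if len(data) < 12 or data[:4] != MAGIC or data[-4:] != MAGIC:
        raise ValueError("not a Parquet file with a plaintext footer")
    # The last 8 bytes are: 4-byte little-endian footer length, then the magic.
    (footer_len,) = struct.unpack("<I", data[-8:-4])
    if footer_len > len(data) - 12:
        raise ValueError("footer length exceeds file size")
    footer = data[-8 - footer_len:-8]
    # Assumption: the footer starts with field 1 (`version`, i32), whose
    # compact-protocol header byte is 0x15 (field-id delta 1, type 5).
    if not footer or footer[0] != 0x15:
        raise ValueError("unexpected footer layout")
    return _read_zigzag_varint(footer, 1)
```

Usage would be something like `footer_version(open(path, "rb").read())`, where `path` is one of the data files of the test table. Reading the whole file is only reasonable for small test files like this one. With pyarrow installed, `pyarrow.parquet.ParquetFile(path).metadata.format_version` should give the same information as a string.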
It looks like the parquet_writer_version session property is not honored in Prestissimo. The same setting works fine with the Java Parquet writer.
Velox uses the Arrow Parquet writer. I see that there is an option to specify V1 (see https://github.com/apache/arrow/blob/main/cpp/src/parquet/properties.h). Let's add it to Velox. Can you point me to a test for V1 vs V2?
Fix in progress - facebookincubator/velox#9700
majetideepak
svm1
yingsu00
aditi-pandit