exportFlattenedVector does not support nested encoding and non-scalar types #9821
Comments
@mbasmanova I think creating Parquet for tables with complex types fails only when the vector is dictionary-encoded. Otherwise, complex types are supported in Bridge; see velox/velox/vector/arrow/Bridge.cpp, lines 985 to 1001 in e2c0014.
@rui-mo If the Parquet writer can handle all types, but only flat encodings, then we can simply flatten the data in the Fuzzer before writing to Parquet.
@mbasmanova Got it. I will try as you suggested. Thanks.
This issue could be resolved by flattening data before writing into Parquet.
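To illustrate what "flattening" means here, the following is a minimal, self-contained sketch. It does not use Velox's classes or the Bridge API: `ToyVector`, `makeFlat`, `wrapInDictionary`, `valueAt`, and `flatten` are hypothetical names invented for this example, and the model covers only `int` values, not the non-scalar types mentioned in this issue. It only shows the idea that a nested dictionary is a chain of index lookups, and that flattening materializes the final values so a consumer that accepts only flat data can read them.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical toy model (NOT a Velox class): a column is either flat
// (base == nullptr, data in `values`) or a dictionary over another column
// (base != nullptr, row i refers to base row indices[i]).
struct ToyVector {
  std::vector<int> values;
  std::vector<std::size_t> indices;
  std::shared_ptr<const ToyVector> base;

  std::size_t size() const { return base ? indices.size() : values.size(); }
};

// Build a flat column that owns its values.
std::shared_ptr<const ToyVector> makeFlat(std::vector<int> values) {
  auto v = std::make_shared<ToyVector>();
  v->values = std::move(values);
  return v;
}

// Wrap an existing column (flat or dictionary) in one more dictionary layer.
std::shared_ptr<const ToyVector> wrapInDictionary(
    std::vector<std::size_t> indices,
    std::shared_ptr<const ToyVector> base) {
  auto v = std::make_shared<ToyVector>();
  v->indices = std::move(indices);
  v->base = std::move(base);
  return v;
}

// Resolve one row through any number of dictionary layers.
int valueAt(const ToyVector& v, std::size_t row) {
  const ToyVector* cur = &v;
  while (cur->base) {
    assert(row < cur->indices.size());
    row = cur->indices[row];
    cur = cur->base.get();
  }
  assert(row < cur->values.size());
  return cur->values[row];
}

// Materialize a flat copy: the result has no dictionary layers left.
ToyVector flatten(const ToyVector& v) {
  ToyVector flat;
  flat.values.reserve(v.size());
  for (std::size_t i = 0; i < v.size(); ++i) {
    flat.values.push_back(valueAt(v, i));
  }
  return flat;
}
```

In this sketch, wrapping a flat column twice with `wrapInDictionary` and then calling `flatten` on the outer layer yields a column whose `base` is null and whose `values` hold the fully resolved rows. The trade-off is an extra copy of the data in exchange for a representation that a flat-only writer can accept.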
Description
In 9830814, flattenDictionary and flattenConstant are set to true for Parquet write, which relies on Bridge to convert a Velox vector to an Arrow array. When VectorFuzzer generates nested dictionary-encoded vectors or non-scalar types, exporting to Arrow fails at the checks referenced below.
velox/velox/vector/arrow/Bridge.cpp
Lines 884 to 889 in dc561a3