[Improve][File-Connector] Support write fixed/timestamp as int96 of parquet #6971

Merged 1 commit on Jun 13, 2024
68 changes: 39 additions & 29 deletions docs/en/connector-v2/sink/CosFile.md
@@ -34,35 +34,37 @@ By default, we use 2PC commit to ensure `exactly-once`

## Options

| Name | Type | Required | Default | Description |
|---------------------------------------|---------|----------|--------------------------------------------|-------------------------------------------------------------------------------------------------------------------|
| path | string | yes | - | |
| tmp_path | string | no | /tmp/seatunnel | Result files are written to a tmp path first, then `mv` commits the tmp dir to the target dir. Must be a COS dir. |
| bucket | string | yes | - | |
| secret_id | string | yes | - | |
| secret_key | string | yes | - | |
| region | string | yes | - | |
| custom_filename | boolean | no | false | Whether to customize the filename |
| file_name_expression | string | no | "${transactionId}" | Only used when custom_filename is true |
| filename_time_format | string | no | "yyyy.MM.dd" | Only used when custom_filename is true |
| file_format_type | string | no | "csv" | |
| field_delimiter | string | no | '\001' | Only used when file_format_type is text |
| row_delimiter | string | no | "\n" | Only used when file_format_type is text |
| have_partition | boolean | no | false | Whether to process partitions. |
| partition_by | array | no | - | Only used when have_partition is true |
| partition_dir_expression | string | no | "${k0}=${v0}/${k1}=${v1}/.../${kn}=${vn}/" | Only used when have_partition is true |
| is_partition_field_write_in_file | boolean | no | false | Only used when have_partition is true |
| sink_columns | array | no | | When this parameter is empty, all fields are sink columns |
| is_enable_transaction | boolean | no | true | |
| batch_size | int | no | 1000000 | |
| compress_codec | string | no | none | |
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format_type is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format_type is excel. |
| xml_root_tag | string | no | RECORDS | Only used when file_format_type is xml. |
| xml_row_tag | string | no | RECORD | Only used when file_format_type is xml. |
| xml_use_attr_format | boolean | no | - | Only used when file_format_type is xml. |
| parquet_avro_write_timestamp_as_int96 | boolean | no | false | Only used when file_format_type is parquet. |
| parquet_avro_write_fixed_as_int96 | array | no | - | Only used when file_format_type is parquet. |
| encoding | string | no | "UTF-8" | Only used when file_format_type is json,text,csv,xml. |

### path [string]

@@ -207,6 +209,14 @@ Specifies the tag name of the data rows within the XML file.

Specifies whether to process data using the tag attribute format.

### parquet_avro_write_timestamp_as_int96 [boolean]

Whether to write timestamp fields as Parquet INT96. Only valid for Parquet files.

### parquet_avro_write_fixed_as_int96 [array]

Supports writing 12-byte fields as Parquet INT96. Only valid for Parquet files.
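
For readers unfamiliar with why a 12-byte field is involved, the sketch below shows the INT96 timestamp layout commonly used by Hive, Impala, and Spark: 8 bytes of nanoseconds within the day followed by 4 bytes of Julian day number, both little-endian. It is an illustration only. The function names `timestamp_to_int96` and `int96_to_timestamp` are hypothetical, and the sketch assumes, without confirmation from this change, that the connector's output follows the same convention.

```python
import struct
from datetime import datetime, timedelta, timezone

# Julian day number of 1970-01-01, the Unix epoch.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588
NANOS_PER_DAY = 86_400 * 1_000_000_000
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_to_int96(ts: datetime) -> bytes:
    """Encode a timezone-aware datetime as a 12-byte INT96 value (hypothetical helper)."""
    delta = ts - EPOCH
    total_nanos = (
        (delta.days * 86_400 + delta.seconds) * 1_000_000_000
        + delta.microseconds * 1_000
    )
    days, nanos_of_day = divmod(total_nanos, NANOS_PER_DAY)
    # "<qi": little-endian, 8-byte nanoseconds of day, then 4-byte Julian day.
    return struct.pack("<qi", nanos_of_day, JULIAN_DAY_OF_UNIX_EPOCH + days)


def int96_to_timestamp(raw: bytes) -> datetime:
    """Decode a 12-byte INT96 value back to a UTC datetime (microsecond precision)."""
    if len(raw) != 12:
        raise ValueError("INT96 values must be exactly 12 bytes")
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    return EPOCH + timedelta(days=days, microseconds=nanos_of_day // 1_000)
```

Python's `datetime` only carries microseconds, so sub-microsecond digits are dropped on decode. INT96 is a legacy representation, which is presumably why both options are opt-in.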

### encoding [string]

Only used when file_format_type is json,text,csv,xml.