Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature][Connector-V2][Doris] Add Doris StreamLoad sink connector #3631

Merged
merged 53 commits into from
Dec 14, 2022
Merged
Show file tree
Hide file tree
Changes from 12 commits
Commits
Show all changes
53 commits
Select commit Hold shift + click to select a range
7dfea77
[Feature][Connector-V2][Doris] Add Doris StreamLoad sink connector
TaoZex Dec 2, 2022
d8fe461
[Feature][Connector-V2][Doris] Update code style
TaoZex Dec 2, 2022
4050d53
[Feature][Connector-V2][Doris] Use seatunnel format to serilize
TaoZex Dec 2, 2022
9a08512
[Feature][Connector-V2][Doris] Add delimiterParse util
TaoZex Dec 2, 2022
454f1b5
[Feature][Connector-V2][Doris] Unify exception
TaoZex Dec 2, 2022
0094168
[Feature][Connector-V2][Doris] Update error code
TaoZex Dec 2, 2022
45d8760
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 2, 2022
230ebfc
Merge branch 'apache:dev' into doris_streamload_sink
TaoZex Dec 3, 2022
cd41985
Merge branch 'apache:dev' into doris_streamload_sink
TaoZex Dec 4, 2022
422519f
[Feature][Connector-V2][Doris] Update e2e code
TaoZex Dec 4, 2022
ecf85d7
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 4, 2022
cccf497
[Feature][Connector-V2][Doris] Use String format
TaoZex Dec 4, 2022
afee218
Merge branch 'apache:dev' into doris_streamload_sink
TaoZex Dec 5, 2022
60a1150
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 5, 2022
ed55d36
[Feature][Connector-V2][Doris] update code
TaoZex Dec 5, 2022
c7f29cd
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 5, 2022
f613bc3
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 5, 2022
54febcf
Merge branch 'dev' into doris_streamload_sink
TaoZex Dec 5, 2022
9e11ab9
Merge branch 'apache:dev' into doris_streamload_sink
TaoZex Dec 5, 2022
a5dcd93
Update README.md
TaoZex Dec 5, 2022
ef3376b
Update quick-start-flink.md
TaoZex Dec 5, 2022
c3b82d0
Update quick-start-seatunnel-engine.md
TaoZex Dec 5, 2022
76b12cf
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 5, 2022
9e17c78
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 5, 2022
02268c4
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 5, 2022
5275611
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 5, 2022
e78d309
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 5, 2022
7208e29
Merge branch 'dev' into doris_streamload_sink
TaoZex Dec 5, 2022
752e280
Merge branch 'dev' into doris_streamload_sink
TaoZex Dec 6, 2022
b489b7e
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 6, 2022
2df8265
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 6, 2022
b1986a0
[Feature][Connector-V2][Doris] Update doc
TaoZex Dec 6, 2022
9f02274
Merge branch 'doris_streamload_sink' of https://github.com/TaoZex/inc…
TaoZex Dec 6, 2022
c6a58bc
Update DorisStreamLoadVisitor.java
TaoZex Dec 6, 2022
1c5c0a2
Update pom.xml
TaoZex Dec 6, 2022
c7715b4
Update DorisIT.java
TaoZex Dec 6, 2022
38831e7
Merge branch 'apache:dev' into doris_streamload_sink
TaoZex Dec 6, 2022
712a021
Merge branch 'apache:dev' into doris_streamload_sink
TaoZex Dec 7, 2022
d82604f
[Feature][Connector-V2][Doris] Update use ReadonlyConfig
TaoZex Dec 7, 2022
0a9b208
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 7, 2022
50db8ef
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 7, 2022
79e98f8
Merge branch 'dev' into doris_streamload_sink
TaoZex Dec 7, 2022
8d56dab
[Feature][Connector-V2][Doris] fix ci
TaoZex Dec 7, 2022
fd48d60
Merge branch 'apache:dev' into doris_streamload_sink
TaoZex Dec 7, 2022
80db7aa
Merge branch 'doris_streamload_sink' of https://github.com/TaoZex/inc…
TaoZex Dec 7, 2022
e76720e
Merge branch 'dev' into doris_streamload_sink
TaoZex Dec 7, 2022
1001d47
Update DorisIT.java
TaoZex Dec 8, 2022
7dfa10c
[Feature][Connector-V2][Doris] Update e2e test
TaoZex Dec 8, 2022
2be7506
[Feature][Connector-V2][Doris] Update e2e test
TaoZex Dec 8, 2022
c5046e7
[Feature][Connector-V2][Doris] Update code
TaoZex Dec 8, 2022
6b7ae70
Merge branch 'dev' into doris_streamload_sink
TaoZex Dec 8, 2022
4894727
Merge branch 'dev' into doris_streamload_sink
TaoZex Dec 11, 2022
1b3d598
Merge branch 'apache:dev' into doris_streamload_sink
TaoZex Dec 13, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions docs/en/connector-v2/Error-Quick-Reference-Manual.md
Original file line number Diff line number Diff line change
Expand Up @@ -159,6 +159,12 @@ problems encountered by users.
| HUDI-01 | Create ParquetMetadata failed | When the user encounters this error code, it indicates that ParquetMetadata creation failed. Please check |
| HUDI-02 | Kerberos Authorized failed | When the user encounters this error code, it indicates that Kerberos authorization failed. Please check |

## Doris Connector Error Codes

| code | description | solution |
|----------|-------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------|
| Doris-01 | Writing records to Doris failed. | When users encounter this error code, it means that writing records to Doris failed, please check data from files whether is correct |

## Clickhouse Connector Error Codes

| code | description | solution |
Expand Down
130 changes: 130 additions & 0 deletions docs/en/connector-v2/sink/Doris.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,130 @@
# Doris

> Doris sink connector

## Description
Used to send data to Doris. Both support streaming and batch mode.
The internal implementation of Doris sink connector is cached and imported by stream load in batches.
## Key features

- [ ] [exactly-once](../../concept/connector-v2-features.md)
- [ ] [schema projection](../../concept/connector-v2-features.md)

## Options

| name | type | required | default value |
|-----------------------------|------------------------------|----------|-----------------|
| node_urls | list | yes | - |
| username | string | yes | - |
| password | string | yes | - |
| database | string | yes | - |
| table | string | yes | - |
| labelPrefix | string | no | - |
| batch_max_rows | long | no | 1024 |
| batch_max_bytes | int | no | 5 * 1024 * 1024 |
| batch_interval_ms | int | no | - |
| max_retries | int | no | - |
| retry_backoff_multiplier_ms | int | no | - |
| max_retry_backoff_ms | int | no | - |
| sink.properties.* | doris stream load config | no | - |

### node_urls [list]

`Doris` cluster address, the format is `["fe_ip:fe_http_port", ...]`

### username [string]

`Doris` user username

### password [string]

`Doris` user password

### database [string]

The name of `Doris` database

### table [string]

The name of `Doris` table

### labelPrefix [string]

The prefix of `Doris` stream load label

### batch_max_rows [long]

For batch writing, when the number of buffers reaches the number of `batch_max_rows` or the byte size of `batch_max_bytes` or the time reaches `batch_interval_ms`, the data will be flushed into the Doris

### batch_max_bytes [int]

For batch writing, when the number of buffers reaches the number of `batch_max_rows` or the byte size of `batch_max_bytes` or the time reaches `batch_interval_ms`, the data will be flushed into the Doris

### batch_interval_ms [int]

For batch writing, when the number of buffers reaches the number of `batch_max_rows` or the byte size of `batch_max_bytes` or the time reaches `batch_interval_ms`, the data will be flushed into the Doris

### max_retries [int]

The number of retries to flush failed

### retry_backoff_multiplier_ms [int]

Using as a multiplier for generating the next delay for backoff

### max_retry_backoff_ms [int]

The amount of time to wait before attempting to retry a request to `Doris`

### sink.properties.* [doris stream load config]

The parameter of the stream load `data_desc`
The way to specify the parameter is to add the prefix `sink.properties.` to the original stream load parameter name
For example, the way to specify `strip_outer_array` is: `sink.properties.strip_outer_array`

#### Supported import data formats

The supported formats include CSV and JSON. Default value: CSV

## Example

Use JSON format to import data

```
sink {
Doris {
nodeUrls = ["e2e_dorisdb:8030"]
username = root
password = ""
database = "test"
table = "e2e_table_sink"
batch_max_rows = 100
sink.properties.format = "JSON"
sink.properties.strip_outer_array = true
}
}

```

Use CSV format to import data

```
sink {
Doris {
nodeUrls = ["e2e_dorisdb:8030"]
username = root
password = ""
database = "test"
table = "e2e_table_sink"
batch_max_rows = 100
sink.properties.format = "CSV"
sink.properties.column_separator = ","
}
}
```

## Changelog

### next version

- Add Doris Sink Connector
3 changes: 2 additions & 1 deletion plugin-mapping.properties
Original file line number Diff line number Diff line change
Expand Up @@ -156,4 +156,5 @@ seatunnel.source.Jira = connector-http-jira
seatunnel.source.Gitlab = connector-http-gitlab
seatunnel.sink.RabbitMQ = connector-rabbitmq
seatunnel.source.RabbitMQ = connector-rabbitmq
seatunnel.source.OpenMldb = connector-openmldb
seatunnel.source.OpenMldb = connector-openmldb
seatunnel.sink.Doris = connector-doris
68 changes: 68 additions & 0 deletions seatunnel-connectors-v2/connector-doris/pom.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--

Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<parent>
<artifactId>seatunnel-connectors-v2</artifactId>
<groupId>org.apache.seatunnel</groupId>
<version>${revision}</version>
</parent>
<modelVersion>4.0.0</modelVersion>

<artifactId>connector-doris</artifactId>

<properties>
<httpclient.version>4.5.13</httpclient.version>
<httpcore.version>4.4.4</httpcore.version>
</properties>

<dependencies>
<dependency>
<groupId>org.apache.seatunnel</groupId>
<artifactId>seatunnel-api</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.seatunnel</groupId>
<artifactId>connector-common</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>${httpclient.version}</version>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpcore</artifactId>
<version>${httpcore.version}</version>
</dependency>
<dependency>
<groupId>org.apache.seatunnel</groupId>
<artifactId>seatunnel-format-json</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.seatunnel</groupId>
<artifactId>seatunnel-format-text</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
</project>
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.seatunnel.connectors.doris.client;

import lombok.AllArgsConstructor;
import lombok.Getter;
import lombok.Setter;

import java.util.List;

@AllArgsConstructor
@Getter
@Setter
public class DorisFlushTuple {
private String label;
private Long bytes;
private List<byte[]> rows;
}
Loading