Merge branch 'dev' into fix/connector-tdengine_source_errors
alextinng authored Apr 24, 2024
2 parents a7eb663 + e310353 commit eeeab0a
Showing 828 changed files with 48,863 additions and 5,636 deletions.
34 changes: 32 additions & 2 deletions .github/workflows/backend.yml
@@ -100,6 +100,7 @@ jobs:
current_branch='${{ steps.git_init.outputs.branch }}'
pip install GitPython
workspace="${GITHUB_WORKSPACE}"
repository_owner="${GITHUB_REPOSITORY_OWNER}"
cv2_files=`python tools/update_modules_check/check_file_updates.py ua $workspace apache/dev origin/$current_branch "seatunnel-connectors-v2/**"`
true_or_false=${cv2_files%%$'\n'*}
file_list=${cv2_files#*$'\n'}
@@ -133,6 +134,9 @@ jobs:
api_files=`python tools/update_modules_check/check_file_updates.py ua $workspace apache/dev origin/$current_branch "seatunnel-api/**" "seatunnel-common/**" "seatunnel-config/**" "seatunnel-connectors/**" "seatunnel-core/**" "seatunnel-e2e/seatunnel-e2e-common/**" "seatunnel-formats/**" "seatunnel-plugin-discovery/**" "seatunnel-transforms-v2/**" "seatunnel-translation/**" "seatunnel-e2e/seatunnel-transforms-v2-e2e/**" "seatunnel-connectors/**" "pom.xml" "**/workflows/**" "tools/**" "seatunnel-dist/**"`
true_or_false=${api_files%%$'\n'*}
file_list=${api_files#*$'\n'}
if [[ $repository_owner == 'apache' ]];then
true_or_false='true'
fi
echo "api=$true_or_false" >> $GITHUB_OUTPUT
echo "api_files=$file_list" >> $GITHUB_OUTPUT
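The `true_or_false` / `file_list` split in the step above relies on bash parameter expansion. A minimal standalone sketch of the same trick — the script output below is made up, mirroring the shape `check_file_updates.py` appears to emit (a boolean first line, then the changed-file list):

```shell
#!/usr/bin/env bash
# Hypothetical multi-line output: first line is a true/false flag,
# the remaining lines are the changed-file list.
cv2_files=$'true\nseatunnel-connectors-v2/a\nseatunnel-connectors-v2/b'

# ${var%%$'\n'*} deletes the longest suffix starting at the first newline,
# leaving only the first line.
true_or_false=${cv2_files%%$'\n'*}

# ${var#*$'\n'} deletes the shortest prefix through the first newline,
# leaving everything after the first line.
file_list=${cv2_files#*$'\n'}

echo "$true_or_false"
```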
@@ -304,6 +308,8 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- name: free disk space
run: tools/github/free_disk_space.sh
- name: run updated modules integration test (part-1)
if: needs.changes.outputs.api == 'false' && needs.changes.outputs.it-modules != ''
run: |
@@ -333,6 +339,8 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- name: free disk space
run: tools/github/free_disk_space.sh
- name: run updated modules integration test (part-2)
if: needs.changes.outputs.api == 'false' && needs.changes.outputs.it-modules != ''
run: |
@@ -362,6 +370,8 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- name: free disk space
run: tools/github/free_disk_space.sh
- name: run updated modules integration test (part-3)
if: needs.changes.outputs.api == 'false' && needs.changes.outputs.it-modules != ''
run: |
@@ -391,6 +401,8 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- name: free disk space
run: tools/github/free_disk_space.sh
- name: run updated modules integration test (part-4)
if: needs.changes.outputs.api == 'false' && needs.changes.outputs.it-modules != ''
run: |
@@ -419,6 +431,8 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- name: free disk space
run: tools/github/free_disk_space.sh
- name: run updated modules integration test (part-5)
if: needs.changes.outputs.api == 'false' && needs.changes.outputs.it-modules != ''
run: |
@@ -447,6 +461,8 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- name: free disk space
run: tools/github/free_disk_space.sh
- name: run updated modules integration test (part-6)
if: needs.changes.outputs.api == 'false' && needs.changes.outputs.it-modules != ''
run: |
@@ -475,6 +491,8 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- name: free disk space
run: tools/github/free_disk_space.sh
- name: run updated modules integration test (part-7)
if: needs.changes.outputs.api == 'false' && needs.changes.outputs.it-modules != ''
run: |
@@ -504,6 +522,8 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- name: free disk space
run: tools/github/free_disk_space.sh
- name: run updated modules integration test (part-8)
if: needs.changes.outputs.api == 'false' && needs.changes.outputs.it-modules != ''
run: |
@@ -536,7 +556,7 @@ jobs:
- name: run seatunnel zeta integration test
if: needs.changes.outputs.api == 'true'
run: |
-./mvnw -T 1 -B verify -DskipUT=true -DskipIT=false -D"license.skipAddThirdParty"=true --no-snapshot-updates -pl :connector-seatunnel-e2e-base -am -Pci
+./mvnw -T 1 -B verify -DskipUT=true -DskipIT=false -D"license.skipAddThirdParty"=true --no-snapshot-updates -pl :connector-seatunnel-e2e-base,:connector-console-seatunnel-e2e -am -Pci
env:
MAVEN_OPTS: -Xmx4096m
engine-k8s-it:
@@ -558,6 +578,8 @@ jobs:
env:
KUBECONFIG: /etc/rancher/k3s/k3s.yaml
- uses: actions/checkout@v2
- name: free disk space
run: tools/github/free_disk_space.sh
- name: Set up JDK ${{ matrix.java }}
uses: actions/setup-java@v3
with:
@@ -825,6 +847,8 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- name: free disk space
run: tools/github/free_disk_space.sh
- name: run jdbc connectors integration test (part-1)
if: needs.changes.outputs.api == 'true'
run: |
@@ -849,6 +873,8 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- name: free disk space
run: tools/github/free_disk_space.sh
- name: run jdbc connectors integration test (part-2)
if: needs.changes.outputs.api == 'true'
run: |
@@ -945,6 +971,8 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- name: free disk space
run: tools/github/free_disk_space.sh
- name: run jdbc connectors integration test (part-6)
if: needs.changes.outputs.api == 'true'
run: |
@@ -969,7 +997,7 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- - name: run jdbc connectors integration test (part-6)
+ - name: run jdbc connectors integration test (part-7)
if: needs.changes.outputs.api == 'true'
run: |
./mvnw -B -T 1 verify -DskipUT=true -DskipIT=false -D"license.skipAddThirdParty"=true --no-snapshot-updates -pl :connector-jdbc-e2e-part-7 -am -Pci
@@ -1088,6 +1116,8 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- name: free disk space
run: tools/github/free_disk_space.sh
- name: run doris connector integration test
run: |
./mvnw -B -T 1 verify -DskipUT=true -DskipIT=false -D"license.skipAddThirdParty"=true --no-snapshot-updates -pl :connector-doris-e2e -am -Pci
Binary file added .idea/icon.png
4 changes: 2 additions & 2 deletions bin/install-plugin.cmd
@@ -22,8 +22,8 @@ REM Get seatunnel home
set "SEATUNNEL_HOME=%~dp0..\"
echo Set SEATUNNEL_HOME to [%SEATUNNEL_HOME%]

-REM Connector default version is 2.3.4, you can also choose a custom version. eg: 2.1.2: install-plugin.bat 2.1.2
-set "version=2.3.4"
+REM Connector default version is 2.3.5, you can also choose a custom version. eg: 2.1.2: install-plugin.bat 2.1.2
+set "version=2.3.5"
if not "%~1"=="" set "version=%~1"

REM Create the lib directory
4 changes: 2 additions & 2 deletions bin/install-plugin.sh
@@ -23,8 +23,8 @@
# get seatunnel home
SEATUNNEL_HOME=$(cd $(dirname $0);cd ../;pwd)

-# connector default version is 2.3.4, you can also choose a custom version. eg: 2.1.2: sh install-plugin.sh 2.1.2
-version=2.3.4
+# connector default version is 2.3.5, you can also choose a custom version. eg: 2.1.2: sh install-plugin.sh 2.1.2
+version=2.3.5

if [ -n "$1" ]; then
version="$1"
2 changes: 1 addition & 1 deletion config/jvm_options
@@ -27,4 +27,4 @@
-XX:MaxMetaspaceSize=2g

# G1GC
--XX:+UseG1GC
+-XX:+UseG1GC
1 change: 1 addition & 0 deletions config/plugin_config
@@ -76,4 +76,5 @@ connector-tablestore
connector-selectdb-cloud
connector-hbase
connector-amazonsqs
connector-easysearch
--end--
35 changes: 35 additions & 0 deletions docs/en/command/connector-check.md
@@ -0,0 +1,35 @@
# Connector check command usage

## Command Entrypoint

```shell
bin/seatunnel-connector.sh
```

## Options

```text
Usage: seatunnel-connector.sh [options]
Options:
-h, --help Show the usage message
-l, --list List all supported plugins(sources, sinks, transforms)
(default: false)
-o, --option-rule Get option rule of the plugin by the plugin
identifier(connector name or transform name)
-pt, --plugin-type SeaTunnel plugin type, support [source, sink,
transform]
```

## Example

```shell
# List all supported connectors (sources and sinks) and transforms
bin/seatunnel-connector.sh -l
# List all supported sinks
bin/seatunnel-connector.sh -l -pt sink
# Get option rule of the connector or transform by the name
bin/seatunnel-connector.sh -o Paimon
# Get option rule of paimon sink
bin/seatunnel-connector.sh -o Paimon -pt sink
```

6 changes: 5 additions & 1 deletion docs/en/concept/JobEnvConfig.md
@@ -1,4 +1,4 @@
-# JobEnvConfig
+# Job Env Config

This document describes env configuration options; the common parameters can be used in all engines. To better distinguish engine-specific parameters, the additional parameters of other engines need to carry a prefix.
In the Flink engine, we use `flink.` as the prefix. In the Spark engine, we do not use any prefix, because the official Spark parameters themselves start with `spark.`
@@ -29,6 +29,10 @@ In `STREAMING` mode, checkpoints is required, if you do not set it, it will be o

This parameter configures the parallelism of source and sink.

### job.retry.times

Controls the default number of retries when a job fails. The default value is 3, and it only takes effect in the Zeta engine.
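A sketch of how this option might appear in a job config's `env` block (the other values here are illustrative, not required):

```hocon
env {
  job.mode = "BATCH"
  # Retry a failed job up to 5 times (Zeta engine only; the default is 3).
  job.retry.times = 5
}
```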

### shade.identifier

Specify the method of encryption, if you didn't have the requirement for encrypting or decrypting config files, this option can be ignored.
13 changes: 13 additions & 0 deletions docs/en/concept/config.md
@@ -64,6 +64,19 @@ sink {
}
```

#### multi-line support

In `hocon`, multi-line strings are supported, which allows you to include extended passages of text without worrying about newline characters or special formatting. This is achieved by enclosing the text within triple quotes **`"""`**. For example:

```
var = """
Apache SeaTunnel is a
next-generation high-performance,
distributed, massive data integration tool.
"""
sql = """ select * from "table" """
```

### json

```json
116 changes: 116 additions & 0 deletions docs/en/concept/event-listener.md
@@ -0,0 +1,116 @@
# Event Listener

## Introduction

SeaTunnel provides a rich event-listening feature that allows you to monitor the status of data synchronization.
This functionality is crucial when you need to listen to the job's running status (`org.apache.seatunnel.api.event`).
This document will guide you through the usage of these APIs and how to leverage them effectively.

## Support Those Engines

> SeaTunnel Zeta<br/>
> Flink<br/>
> Spark<br/>

## API

The event API is defined in the `org.apache.seatunnel.api.event` package.

### Event Data API

- `org.apache.seatunnel.api.event.Event` - The interface for event data.
- `org.apache.seatunnel.api.event.EventType` - The enum for event type.

### Event Listener API

You can customize event handlers, for example to send events to external systems.

- `org.apache.seatunnel.api.event.EventHandler` - The interface for event handlers; subclasses are automatically loaded from the classpath via SPI.

### Event Collect API

- `org.apache.seatunnel.api.source.SourceSplitEnumerator` - Attaches the event listener API to report events from `SourceSplitEnumerator`.

```java
package org.apache.seatunnel.api.source;

public interface SourceSplitEnumerator {

interface Context {

/**
* Get the {@link org.apache.seatunnel.api.event.EventListener} of this enumerator.
*
* @return
*/
EventListener getEventListener();
}
}
```

- `org.apache.seatunnel.api.source.SourceReader` - Attaches the event listener API to report events from `SourceReader`.

```java
package org.apache.seatunnel.api.source;

public interface SourceReader {

interface Context {

/**
* Get the {@link org.apache.seatunnel.api.event.EventListener} of this reader.
*
* @return
*/
EventListener getEventListener();
}
}
```

- `org.apache.seatunnel.api.sink.SinkWriter` - Attaches the event listener API to report events from `SinkWriter`.

```java
package org.apache.seatunnel.api.sink;

public interface SinkWriter {

interface Context {

/**
* Get the {@link org.apache.seatunnel.api.event.EventListener} of this writer.
*
* @return
*/
EventListener getEventListener();
}
}
```

## Configuration Listener

To use the event listening feature, you need to configure it in the engine config.

### Zeta Engine

Example config in your config file (seatunnel.yaml):

```
seatunnel:
engine:
event-report-http:
url: "http://example.com:1024/event/report"
headers:
Content-Type: application/json
```

### Flink Engine

You can define an implementation class of the `org.apache.seatunnel.api.event.EventHandler` interface and add it to the classpath; it will be loaded automatically through SPI.

Supported Flink versions: 1.14.0+

Example: `org.apache.seatunnel.api.event.LoggingEventHandler`

### Spark Engine

You can define an implementation class of the `org.apache.seatunnel.api.event.EventHandler` interface and add it to the classpath; it will be loaded automatically through SPI.
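For both Flink and Spark, handlers are discovered via classpath SPI. A minimal, self-contained sketch of what such a handler can look like — note that the `Event` and `EventHandler` interfaces below are simplified stand-ins for illustration, not the real `org.apache.seatunnel.api.event` types:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the org.apache.seatunnel.api.event interfaces.
interface Event { String eventType(); }
interface EventHandler { void handle(Event event); }

// A LoggingEventHandler-style implementation. In a real deployment the class
// would be listed in a META-INF/services provider-configuration file so that
// SPI can discover it on the classpath.
class CollectingEventHandler implements EventHandler {
    final List<String> seen = new ArrayList<>();

    @Override
    public void handle(Event event) {
        // A real handler might log the event or POST it to an external system.
        seen.add(event.eventType());
    }
}

public class EventHandlerDemo {
    public static void main(String[] args) {
        CollectingEventHandler handler = new CollectingEventHandler();
        handler.handle(() -> "JOB_STATUS_CHANGED"); // Event is a SAM type here
        System.out.println(handler.seen);
    }
}
```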
1 change: 1 addition & 0 deletions docs/en/concept/schema-feature.md
@@ -64,6 +64,7 @@ columns = [
| type | Yes | - | The data type of the column |
| nullable | No | true | If the column can be nullable |
| columnLength | No | 0 | The length of the column which will be useful when you need to define the length |
| columnScale | No | - | The scale of the column which will be useful when you need to define the scale |
| defaultValue | No | null | The default value of the column |
| comment | No | null | The comment of the column |
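The new `columnScale` option above sits alongside `columnLength` when declaring columns. An illustrative (made-up) schema sketch for a decimal column:

```hocon
schema {
  columns = [
    {
      name = price
      type = "decimal(10, 2)"
      # Assumed usage: digits after the decimal point for this column.
      columnScale = 2
      nullable = false
      comment = "unit price"
    }
  ]
}
```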

4 changes: 2 additions & 2 deletions docs/en/connector-v2/formats/canal-json.md
Expand Up @@ -69,10 +69,10 @@ Canal provides a unified format for changelog, here is a simple example for an u
}
```

-Note: please refer to Canal documentation about the meaning of each fields.
+Note: please refer to [Canal documentation](https://github.com/alibaba/canal/wiki) about the meaning of each field.

The MySQL products table has 4 columns (id, name, description and weight).
-The above JSON message is an update change event on the products table where the weight value of the row with id = 111 is changed from 5.18 to 5.15.
+The above JSON message is an update change event on the products table where the weight value of the row with id = 111 is changed from 5.15 to 5.18.
Assuming the messages have been synchronized to the Kafka topic products_binlog, we can use the following SeaTunnel configuration to consume this topic and interpret the change events.

```bash