diff --git a/docs/en/other-engine/flink.md b/docs/en/other-engine/flink.md
index 567bfb7ca10..8a77fbfc241 100644
--- a/docs/en/other-engine/flink.md
+++ b/docs/en/other-engine/flink.md
@@ -1,8 +1,8 @@
-# Seatunnel runs on Flink
+# SeaTunnel Runs On Flink
-Flink is a powerful high-performance distributed stream processing engine,More information about it you can,You can search for `Apache Flink`
+Flink is a powerful high-performance distributed stream processing engine. For more information about it, you can search for `Apache Flink`.
-### Set Flink configuration information in the job
+### Set Flink Configuration Information In The Job
Begin with `flink.`
@@ -19,9 +19,9 @@ env {
Enumeration types are not currently supported; you need to specify them in the Flink conf file. Only these types of settings are supported for the time being:
Integer/Boolean/String/Duration
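For example, settings prefixed with `flink.` can be passed through to Flink from the job's `env` block like this (a sketch; the exact Flink option names are assumptions and should be checked against your Flink version):

```
env {
  parallelism = 1
  # options starting with `flink.` are forwarded to the Flink configuration
  flink.execution.checkpointing.interval = 10000
}
```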
-### How to set up a simple Flink job
+### How To Set Up A Simple Flink Job
-This is a simple job that runs on Flink Randomly generated data is printed to the console
+This is a simple job that runs on Flink. Randomly generated data is printed to the console.
```
env {
@@ -79,6 +79,6 @@ sink{
}
```
-### How to run a job in a project
+### How To Run A Job In A Project
-After you pull the code to the local, go to the `seatunnel-examples/seatunnel-flink-connector-v2-example` module find `org.apache.seatunnel.example.flink.v2.SeaTunnelApiExample` To complete the operation of the job
+After you pull the code to your local machine, go to the `seatunnel-examples/seatunnel-flink-connector-v2-example` module and find `org.apache.seatunnel.example.flink.v2.SeaTunnelApiExample` to run the job.
diff --git a/docs/en/seatunnel-engine/about.md b/docs/en/seatunnel-engine/about.md
index 409befb5f55..da78035c8b4 100644
--- a/docs/en/seatunnel-engine/about.md
+++ b/docs/en/seatunnel-engine/about.md
@@ -18,21 +18,21 @@ In the future, SeaTunnel Engine will further optimize its functions to support f
### Cluster Management
-- Support stand-alone operation;
+- Support standalone operation;
- Support cluster operation;
- Support autonomous cluster (decentralized), which saves the users from specifying a master node for the SeaTunnel Engine cluster, because it can select a master node by itself during operation, and a new master node will be chosen automatically when the master node fails.
- Autonomous cluster node discovery: nodes with the same cluster_name will automatically form a cluster.
### Core functions
-- Supports running jobs in local mode, and the cluster is automatically destroyed after the job once completed;
-- Supports running jobs in Cluster mode (single machine or cluster), submitting jobs to the SeaTunnel Engine service through the SeaTunnel Client, and the service continues to run after the job is completed and waits for the next job submission;
+- Support running jobs in local mode, and the cluster is automatically destroyed after the job is completed;
+- Support running jobs in cluster mode (single machine or cluster), submitting jobs to the SeaTunnel Engine service through the SeaTunnel client, and the service continues to run after the job is completed and waits for the next job submission;
- Support offline batch synchronization;
- Support real-time synchronization;
- Batch-stream integration, all SeaTunnel V2 connectors can run in SeaTunnel Engine;
-- Supports distributed snapshot algorithm, and supports two-stage submission with SeaTunnel V2 connector, ensuring that data is executed only once.
-- Support job invocation at the Pipeline level to ensure that it can be started even when resources are limited;
-- Supports fault tolerance for jobs at the Pipeline level. Task failure only affects the Pipeline where it is located, and only the task under the Pipeline needs to be rolled back;
+- Support distributed snapshot algorithm, and support two-stage submission with the SeaTunnel V2 connector, ensuring that data is processed exactly once.
+- Support job invocation at the pipeline level to ensure that it can be started even when resources are limited;
+- Support fault tolerance for jobs at the pipeline level. A task failure only affects the pipeline where it is located, and only the tasks under that pipeline need to be rolled back;
- Support dynamic thread sharing to synchronize a large number of small data sets in real-time.
### Quick Start
diff --git a/docs/en/seatunnel-engine/checkpoint-storage.md b/docs/en/seatunnel-engine/checkpoint-storage.md
index 13e1721371c..52af8c4af27 100644
--- a/docs/en/seatunnel-engine/checkpoint-storage.md
+++ b/docs/en/seatunnel-engine/checkpoint-storage.md
@@ -18,11 +18,11 @@ SeaTunnel Engine supports the following checkpoint storage types:
- HDFS (OSS,S3,HDFS,LocalFile)
- LocalFile (native) (deprecated: use HDFS (LocalFile) instead)
-We used the microkernel design pattern to separate the checkpoint storage module from the engine. This allows users to implement their own checkpoint storage modules.
+We use the microkernel design pattern to separate the checkpoint storage module from the engine. This allows users to implement their own checkpoint storage modules.
`checkpoint-storage-api` is the checkpoint storage module API, which defines the interface of the checkpoint storage module.
-if you want to implement your own checkpoint storage module, you need to implement the `CheckpointStorage` and provide the corresponding `CheckpointStorageFactory` implementation.
+If you want to implement your own checkpoint storage module, you need to implement the `CheckpointStorage` and provide the corresponding `CheckpointStorageFactory` implementation.
### Checkpoint Storage Configuration
@@ -46,12 +46,12 @@ Notice: namespace must end with "/".
#### OSS
-Aliyun oss base on hdfs-file, so you can refer [hadoop oss docs](https://hadoop.apache.org/docs/stable/hadoop-aliyun/tools/hadoop-aliyun/index.html) to config oss.
+Aliyun OSS is based on hdfs-file, so you can refer to the [Hadoop OSS Docs](https://hadoop.apache.org/docs/stable/hadoop-aliyun/tools/hadoop-aliyun/index.html) to configure OSS.
Except when interacting with public OSS buckets, the OSS client needs the credentials required to interact with the buckets.
The client supports multiple authentication mechanisms and can be configured as to which mechanisms to use, and their order of use. Custom implementations of org.apache.hadoop.fs.aliyun.oss.AliyunCredentialsProvider may also be used.
-if you used AliyunCredentialsProvider (can be obtained from the Aliyun Access Key Management), these consist of an access key, a secret key.
-you can config like this:
+If you use AliyunCredentialsProvider, the credentials (which can be obtained from Aliyun Access Key Management) consist of an access key and a secret key.
+You can configure it like this:
```yaml
seatunnel:
@@ -71,18 +71,18 @@ seatunnel:
fs.oss.credentials.provider: org.apache.hadoop.fs.aliyun.oss.AliyunCredentialsProvider
```
-For additional reading on the Hadoop Credential Provider API see: [Credential Provider API](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html).
+For additional reading on the Hadoop Credential Provider API, see: [Credential Provider API](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html).
-Aliyun oss Credential Provider implements see: [Auth Credential Providers](https://github.com/aliyun/aliyun-oss-java-sdk/tree/master/src/main/java/com/aliyun/oss/common/auth)
+For Aliyun OSS Credential Provider implementations, see: [Auth Credential Providers](https://github.com/aliyun/aliyun-oss-java-sdk/tree/master/src/main/java/com/aliyun/oss/common/auth)
#### S3
-S3 base on hdfs-file, so you can refer [hadoop s3 docs](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html) to config s3.
+S3 is based on hdfs-file, so you can refer to the [Hadoop S3 Docs](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html) to configure S3.
Except when interacting with public S3 buckets, the S3A client needs the credentials needed to interact with buckets.
The client supports multiple authentication mechanisms and can be configured as to which mechanisms to use, and their order of use. Custom implementations of com.amazonaws.auth.AWSCredentialsProvider may also be used.
-if you used SimpleAWSCredentialsProvider (can be obtained from the Amazon Security Token Service), these consist of an access key, a secret key.
-you can config like this:
+If you use SimpleAWSCredentialsProvider, the credentials (which can be obtained from the Amazon Security Token Service) consist of an access key and a secret key.
+You can configure it like this:
```yaml
@@ -104,8 +104,8 @@ seatunnel:
```
-if you used `InstanceProfileCredentialsProvider`, this supports use of instance profile credentials if running in an EC2 VM, you could check [iam-roles-for-amazon-ec2](https://docs.aws.amazon.com/zh_cn/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html).
-you can config like this:
+If you use `InstanceProfileCredentialsProvider`, which supports instance profile credentials when running in an EC2 VM, you can check [iam-roles-for-amazon-ec2](https://docs.aws.amazon.com/zh_cn/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html).
+You can configure it like this:
```yaml
@@ -146,11 +146,11 @@ seatunnel:
# important: The user of this key needs to have write permission for the bucket, otherwise an exception of 403 will be returned
```
-For additional reading on the Hadoop Credential Provider API see: [Credential Provider API](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html).
+For additional reading on the Hadoop Credential Provider API, see: [Credential Provider API](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html).
#### HDFS
-if you used HDFS, you can config like this:
+If you use HDFS, you can configure it like this:
```yaml
seatunnel:
diff --git a/docs/en/seatunnel-engine/deployment.md b/docs/en/seatunnel-engine/deployment.md
index 7b7650df1f2..a708091e32e 100644
--- a/docs/en/seatunnel-engine/deployment.md
+++ b/docs/en/seatunnel-engine/deployment.md
@@ -7,7 +7,7 @@ sidebar_position: 3
SeaTunnel Engine(Zeta) supports three different deployment modes: local mode, hybrid cluster mode, and separated cluster mode.
-Each deployment mode has different usage scenarios, advantages, and disadvantages. When choosing a deployment mode, you should choose according to your needs and environment.
+Each deployment mode has different usage scenarios, advantages, and disadvantages. You should choose a deployment mode according to your needs and environment.
**Local mode:** Only used for testing, each task will start an independent process, and the process will exit after the task is completed.
@@ -15,10 +15,10 @@ Each deployment mode has different usage scenarios, advantages, and disadvantage
**Separated cluster mode(experimental feature):** The Master service and Worker service of SeaTunnel Engine are separated, and each service is a single process. The Master node is only responsible for job scheduling, rest api, task submission, etc., and Imap data is only stored in the Master node. The Worker node is only responsible for the execution of the task, does not participate in the election to become the master, and does not store Imap data.
-**Usage suggestion:** Although [separated cluster mode](separated-cluster-deployment.md) is an experimental feature, the first recommended usage will be made in the future. In the hybrid cluster mode, the Master node needs to run tasks synchronously. When the task scale is large, it will affect the stability of the Master node. Once the Master node crashes or the heartbeat times out, it will lead to the switch of the Master node, and the switch of the Master node will cause fault tolerance of all running tasks, which will further increase the load of the cluster. Therefore, we recommend using the separated mode more.
+**Usage suggestion:** Although [Separated Cluster Mode](separated-cluster-deployment.md) is an experimental feature, it will become the recommended usage in the future. In the hybrid cluster mode, the Master node needs to run tasks synchronously. When the task scale is large, it will affect the stability of the Master node. Once the Master node crashes or the heartbeat times out, it will lead to a Master node switch, and the switch will cause fault tolerance of all running tasks, which further increases the load on the cluster. Therefore, we recommend using the separated cluster mode.
-[Local mode deployment](local-mode-deployment.md)
+[Local Mode Deployment](local-mode-deployment.md)
-[Hybrid cluster mode deployment](hybrid-cluster-deployment.md)
+[Hybrid Cluster Mode Deployment](hybrid-cluster-deployment.md)
-[Separated cluster mode deployment](separated-cluster-deployment.md)
+[Separated Cluster Mode Deployment](separated-cluster-deployment.md)
diff --git a/docs/en/seatunnel-engine/download-seatunnel.md b/docs/en/seatunnel-engine/download-seatunnel.md
index 138d685fe47..ffbf833820a 100644
--- a/docs/en/seatunnel-engine/download-seatunnel.md
+++ b/docs/en/seatunnel-engine/download-seatunnel.md
@@ -6,7 +6,7 @@ sidebar_position: 2
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
-# Download and Make Installation Packages
+# Download And Make Installation Packages
## Step 1: Preparation
@@ -16,7 +16,7 @@ Before starting to download SeaTunnel, you need to ensure that you have installe
## Step 2: Download SeaTunnel
-Go to the [seatunnel download page](https://seatunnel.apache.org/download) to download the latest version of the release version installation package `seatunnel--bin.tar.gz`.
+Go to the [SeaTunnel Download Page](https://seatunnel.apache.org/download) to download the latest version of the release installation package `seatunnel--bin.tar.gz`.
Or you can also download it through the terminal.
@@ -26,12 +26,12 @@ wget "https://archive.apache.org/dist/seatunnel/${version}/apache-seatunnel-${ve
tar -xzvf "apache-seatunnel-${version}-bin.tar.gz"
```
-## Step 3: Download the connector plug-in
+## Step 3: Download The Connector Plugin
Starting from the 2.2.0-beta version, the binary package no longer provides the connector dependency by default. Therefore, when using it for the first time, you need to execute the following command to install the connector: (Of course, you can also manually download the connector from the [Apache Maven Repository](https://repo.maven.apache.org/maven2/org/apache/seatunnel/), and then move it to the `connectors/seatunnel` directory).
```bash
-sh bin/install-plugin.sh 2.3.6
+sh bin/install-plugin.sh
```
If you need a specific connector version, taking 2.3.6 as an example, you need to execute the following command.
@@ -65,6 +65,6 @@ If you want to install connector plugins by manually downloading connectors, you
:::
-Now you have completed the download of the SeaTunnel installation package and the download of the connector plug-in. Next, you can choose different running modes according to your needs to run or deploy SeaTunnel.
+Now you have downloaded the SeaTunnel installation package and the connector plugins. Next, you can choose different running modes according to your needs to run or deploy SeaTunnel.
-If you use the SeaTunnel Engine (Zeta) that comes with SeaTunnel to run tasks, you need to deploy the SeaTunnel Engine service first. Refer to [Deployment of SeaTunnel Engine (Zeta) Service](deployment.md).
+If you use the SeaTunnel Engine (Zeta) that comes with SeaTunnel to run tasks, you need to deploy the SeaTunnel Engine service first. Refer to [Deployment Of SeaTunnel Engine (Zeta) Service](deployment.md).
diff --git a/docs/en/seatunnel-engine/engine-jar-storage-mode.md b/docs/en/seatunnel-engine/engine-jar-storage-mode.md
index a9d14483b0d..2dd68164816 100644
--- a/docs/en/seatunnel-engine/engine-jar-storage-mode.md
+++ b/docs/en/seatunnel-engine/engine-jar-storage-mode.md
@@ -13,42 +13,42 @@ We are committed to ongoing efforts to enhance and stabilize this functionality,
:::
We can enable the optimized job submission process, which is configured in `seatunnel.yaml`. After enabling the optimized SeaTunnel job submission process configuration item,
-users can use the Seatunnel Zeta engine as the execution engine without placing the connector Jar packages required for task execution or the third-party Jar packages that the connector relies on in each engine `connector` directory.
-Users only need to place all the Jar packages for task execution on the client that submits the job, and the client will automatically upload the Jars required for task execution to the Zeta engine. It is necessary to enable this configuration item when submitting jobs in Docker or k8s mode,
+users can use the SeaTunnel Engine (Zeta) as the execution engine without placing the connector jar packages required for task execution, or the third-party jar packages that the connector relies on, in each engine's `connector` directory.
+Users only need to place all the jar packages for task execution on the client that submits the job, and the client will automatically upload the jars required for task execution to the Zeta engine. It is necessary to enable this configuration item when submitting jobs in Docker or k8s mode,
which can fundamentally solve the problem of large container images caused by the heavy weight of the SeaTunnel Zeta engine. In the image, only the core framework package of the Zeta engine needs to be provided,
and then the jar package of the connector and the third-party jar package that the connector relies on can be separately uploaded to the pod for distribution.
-After enabling the optimization job submission process configuration item, you do not need to place the following two types of Jar packages in the Zeta engine:
+After enabling the optimization job submission process configuration item, you do not need to place the following two types of jar packages in the Zeta engine:
- COMMON_PLUGIN_JARS
- CONNECTOR_PLUGIN_JARS
-COMMON_ PLUGIN_ JARS refers to the third-party Jar package that the connector relies on, CONNECTOR_ PLUGIN_ JARS refers to the connector Jar package.
+`COMMON_PLUGIN_JARS` refers to the third-party jar packages that the connector relies on; `CONNECTOR_PLUGIN_JARS` refers to the connector jar packages.
When common jars do not exist in Zeta's `lib`, it can upload the local common jars of the client to the `lib` directory of all engine nodes.
This way, even if the user does not place a jar on all nodes in Zeta's `lib`, the task can still be executed normally.
-However, we do not recommend relying on the configuration item of opening the optimization job submission process to upload the third-party Jar package that the connector relies on.
+However, we do not recommend relying on the configuration item of opening the optimization job submission process to upload the third-party jar package that the connector relies on.
If you use Zeta Engine, please add the third-party jar package files that the connector relies on to `$SEATUNNEL_HOME/lib/` directory on each node, such as jdbc drivers.
-# ConnectorJar storage strategy
+# ConnectorJar Storage Strategy
-You can configure the storage strategy of the current connector Jar package and the third-party Jar package that the connector depends on through the configuration file.
-There are two storage strategies that can be configured, namely shared Jar package storage strategy and isolated Jar package storage strategy.
-Two different storage strategies provide a more flexible storage mode for Jar files. You can configure the storage strategy to share the same Jar package file with multiple execution jobs in the engine.
+You can configure the storage strategy of the current connector jar package and the third-party jar package that the connector depends on through the configuration file.
+There are two storage strategies that can be configured, namely shared jar package storage strategy and isolated jar package storage strategy.
+Two different storage strategies provide a more flexible storage mode for jar files. You can configure the storage strategy to share the same jar package file with multiple execution jobs in the engine.
-## Related configuration
+## Related Configuration
-| parameter | default value | describe |
+| Parameter | Default Value | Description |
|-------------------------------------|---------------|----------------------------------------------------------------------------------------------------------------------------------------------------|
-| connector-jar-storage-enable | false | Whether to enable uploading the connector Jar package to the engine. The default enabled state is false. |
-| connector-jar-storage-mode | SHARED | Engine-side Jar package storage mode selection. There are two optional modes, SHARED and ISOLATED. The default Jar package storage mode is SHARED. |
-| connector-jar-storage-path | " " | User-defined Jar package storage path. |
-| connector-jar-cleanup-task-interval | 3600s | Engine-side Jar package cleaning scheduled task execution interval. |
-| connector-jar-expiry-time | 600s | Engine-side Jar package storage expiration time. |
+| connector-jar-storage-enable | false | Whether to enable uploading the connector jar package to the engine. The default enabled state is false. |
+| connector-jar-storage-mode          | SHARED        | Engine-side jar package storage mode selection. There are two optional modes, SHARED and ISOLATED. The default jar package storage mode is SHARED. |
+| connector-jar-storage-path | " " | User-defined jar package storage path. |
+| connector-jar-cleanup-task-interval | 3600s | Engine-side jar package cleaning scheduled task execution interval. |
+| connector-jar-expiry-time | 600s | Engine-side jar package storage expiration time. |
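Putting the parameters above together, a `seatunnel.yaml` fragment that enables jar uploading with the shared storage strategy might look like this (a sketch; the values shown are illustrative, not recommendations):

```yaml
jar-storage:
  connector-jar-storage-enable: true
  connector-jar-storage-mode: SHARED
  connector-jar-storage-path: ""
  connector-jar-cleanup-task-interval: 3600
  connector-jar-expiry-time: 600
```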
## IsolatedConnectorJarStorageStrategy
-Before the job is submitted, the connector Jar package will be uploaded to an independent file storage path on the Master node.
-The connector Jar packages of different jobs are in different storage paths, so the connector Jar packages of different jobs are isolated from each other.
-The Jar package files required for the execution of a job have no influence on other jobs. When the current job execution ends, the Jar package file in the storage path generated based on the JobId will be deleted.
+Before the job is submitted, the connector jar package will be uploaded to an independent file storage path on the Master node.
+The connector jar packages of different jobs are in different storage paths, so the connector jar packages of different jobs are isolated from each other.
+The jar package files required for the execution of a job have no influence on other jobs. When the current job execution ends, the jar package file in the storage path generated based on the JobId will be deleted.
Example:
@@ -62,18 +62,18 @@ jar-storage:
```
Detailed explanation of configuration parameters:
-- connector-jar-storage-enable: Enable uploading the connector Jar package before executing the job.
-- connector-jar-storage-mode: Connector Jar package storage mode, two storage modes are available: shared mode (SHARED) and isolation mode (ISOLATED).
-- connector-jar-storage-path: The local storage path of the user-defined connector Jar package on the Zeta engine.
-- connector-jar-cleanup-task-interval: Zeta engine connector Jar package scheduled cleanup task interval, the default is 3600 seconds.
-- connector-jar-expiry-time: The expiration time of the connector Jar package. The default is 600 seconds.
+- connector-jar-storage-enable: Enable uploading the connector jar package before executing the job.
+- connector-jar-storage-mode: Connector jar package storage mode, two storage modes are available: shared mode (SHARED) and isolation mode (ISOLATED).
+- connector-jar-storage-path: The local storage path of the user-defined connector jar package on the Zeta engine.
+- connector-jar-cleanup-task-interval: Zeta engine connector jar package scheduled cleanup task interval, the default is 3600 seconds.
+- connector-jar-expiry-time: The expiration time of the connector jar package. The default is 600 seconds.
## SharedConnectorJarStorageStrategy
-Before the job is submitted, the connector Jar package will be uploaded to the Master node. Different jobs can share connector jars on the Master node if they use the same Jar package file.
-All Jar package files are persisted to a shared file storage path, and Jar packages that reference the Master node can be shared between different jobs. After the task execution is completed,
-the SharedConnectorJarStorageStrategy will not immediately delete all Jar packages related to the current task execution,but instead has an independent thread responsible for cleaning up the work.
-The configuration in the following configuration file sets the running time of the cleaning work and the survival time of the Jar package.
+Before the job is submitted, the connector jar package will be uploaded to the Master node. Different jobs can share connector jars on the Master node if they use the same jar package file.
+All jar package files are persisted to a shared file storage path, and jar packages that reference the Master node can be shared between different jobs. After the task execution is completed,
+the SharedConnectorJarStorageStrategy will not immediately delete all jar packages related to the current task execution, but instead has an independent thread responsible for the cleanup work.
+The following configuration sets the execution interval of the cleanup task and the survival time of the jar packages.
Example:
@@ -87,9 +87,9 @@ jar-storage:
```
Detailed explanation of configuration parameters:
-- connector-jar-storage-enable: Enable uploading the connector Jar package before executing the job.
-- connector-jar-storage-mode: Connector Jar package storage mode, two storage modes are available: shared mode (SHARED) and isolation mode (ISOLATED).
-- connector-jar-storage-path: The local storage path of the user-defined connector Jar package on the Zeta engine.
-- connector-jar-cleanup-task-interval: Zeta engine connector Jar package scheduled cleanup task interval, the default is 3600 seconds.
-- connector-jar-expiry-time: The expiration time of the connector Jar package. The default is 600 seconds.
+- connector-jar-storage-enable: Enable uploading the connector jar package before executing the job.
+- connector-jar-storage-mode: Connector jar package storage mode, two storage modes are available: shared mode (SHARED) and isolation mode (ISOLATED).
+- connector-jar-storage-path: The local storage path of the user-defined connector jar package on the Zeta engine.
+- connector-jar-cleanup-task-interval: Zeta engine connector jar package scheduled cleanup task interval, the default is 3600 seconds.
+- connector-jar-expiry-time: The expiration time of the connector jar package. The default is 600 seconds.
diff --git a/docs/en/seatunnel-engine/hybrid-cluster-deployment.md b/docs/en/seatunnel-engine/hybrid-cluster-deployment.md
index 746eb25419b..98f3eba2450 100644
--- a/docs/en/seatunnel-engine/hybrid-cluster-deployment.md
+++ b/docs/en/seatunnel-engine/hybrid-cluster-deployment.md
@@ -5,13 +5,13 @@ sidebar_position: 5
# Deploy SeaTunnel Engine Hybrid Mode Cluster
-The Master service and Worker service of SeaTunnel Engine are mixed in the same process, and all nodes can run jobs and participate in the election to become master, that is, the master node is also running synchronous tasks simultaneously. In this mode, the Imap (which saves the status information of the task to provide support for the task's fault tolerance) data will be distributed across all nodes.
+The Master service and Worker service of SeaTunnel Engine are mixed in the same process, and all nodes can run jobs and participate in the election to become master. That is, the master node also runs synchronization tasks at the same time. In this mode, the Imap (which saves the status information of the task to provide support for the task's fault tolerance) data will be distributed across all nodes.
-Usage Recommendation: It is recommended to use the [separated cluster mode](separated-cluster-deployment.md). In the hybrid cluster mode, the Master node needs to run tasks synchronously. When the task scale is large, it will affect the stability of the Master node. Once the Master node crashes or the heartbeat times out, it will cause the Master node to switch, and the Master node switch will cause all running tasks to perform fault tolerance, further increasing the load on the cluster. Therefore, we recommend using the [separated cluster mode](separated-cluster-deployment.md).
+Usage Recommendation: It is recommended to use the [Separated Cluster Mode](separated-cluster-deployment.md). In the hybrid cluster mode, the Master node needs to run tasks synchronously. When the task scale is large, it will affect the stability of the Master node. Once the Master node crashes or the heartbeat times out, it will cause the Master node to switch, and the Master node switch will cause all running tasks to perform fault tolerance, further increasing the load on the cluster. Therefore, we recommend using the [Separated Cluster Mode](separated-cluster-deployment.md).
## 1. Download
-[Download and Create the SeaTunnel Installation Package](download-seatunnel.md)
+[Download And Create The SeaTunnel Installation Package](download-seatunnel.md)
## 2. Configure SEATUNNEL_HOME
@@ -22,7 +22,7 @@ export SEATUNNEL_HOME=${seatunnel install path}
export PATH=$PATH:$SEATUNNEL_HOME/bin
```
-## 3. Configure the JVM Options for the SeaTunnel Engine
+## 3. Configure The JVM Options For The SeaTunnel Engine
The SeaTunnel Engine supports two methods for setting JVM options:
@@ -32,11 +32,11 @@ The SeaTunnel Engine supports two methods for setting JVM options:
2. Add JVM options when starting the SeaTunnel Engine. For example, `seatunnel-cluster.sh -DJvmOption="-Xms2G -Xmx2G"`
-## 4. Configure the SeaTunnel Engine
+## 4. Configure The SeaTunnel Engine
The SeaTunnel Engine provides many functions that need to be configured in the `seatunnel.yaml` file.
-### 4.1 Backup count setting for data in Imap
+### 4.1 Backup Count Setting For Data In Imap
The SeaTunnel Engine implements cluster management based on [Hazelcast IMDG](https://docs.hazelcast.com/imdg/4.1/). The cluster's status data (job running status, resource status) is stored in the [Hazelcast IMap](https://docs.hazelcast.com/imdg/4.1/data-structures/map).
The data stored in the Hazelcast IMap is distributed and stored on all nodes in the cluster. Hazelcast partitions the data stored in the Imap. Each partition can specify the number of backups.
@@ -53,7 +53,7 @@ seatunnel:
# Other configurations
```
-### 4.2 Slot configuration
+### 4.2 Slot Configuration
The number of slots determines the number of task groups that the cluster node can run in parallel. The formula for the number of slots required for a task is N = 2 + P (the parallelism configured by the task). By default, the number of slots in the SeaTunnel Engine is dynamic, that is, there is no limit on the number. We recommend that the number of slots be set to twice the number of CPU cores on the node.
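As a worked example of the formula above, a job with parallelism P = 4 needs N = 2 + 4 = 6 slots. If you prefer a fixed upper bound instead of dynamic slots, the slot service could be configured like this (a sketch; values are illustrative):

```yaml
seatunnel:
  engine:
    slot-service:
      dynamic-slot: false
      slot-num: 6
```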
@@ -77,7 +77,7 @@ seatunnel:
slot-num: 20
```
-### 4.3_checkpoint manager
+### 4.3 Checkpoint Manager
Like Flink, the SeaTunnel Engine supports the Chandy–Lamport algorithm. Therefore, it is possible to achieve data synchronization without data loss and duplication.
@@ -111,7 +111,7 @@ If the cluster has more than one node, the checkpoint storage must be a distribu
For information about checkpoint storage, you can refer to [Checkpoint Storage](checkpoint-storage.md)
-# 4.4 Expiration configuration for historical jobs
+### 4.4 Expiration Configuration For Historical Jobs
The information of each completed job, such as status, counters, and error logs, is stored in the IMap object. As the number of running jobs increases, the memory usage will increase, and eventually, the memory will overflow. Therefore, you can adjust the `history-job-expire-minutes` parameter to address this issue. The time unit for this parameter is minutes. The default value is 1440 minutes, which is one day.
@@ -123,7 +123,7 @@ seatunnel:
history-job-expire-minutes: 1440
```
-# 4.5 Class Loader Cache Mode
+### 4.5 Class Loader Cache Mode
This configuration primarily addresses the issue of resource leakage caused by constantly creating and attempting to destroy the class loader.
If you encounter exceptions related to metaspace overflow, you can try enabling this configuration.
@@ -137,15 +137,15 @@ seatunnel:
classloader-cache-mode: true
```
-# 5. Configure the SeaTunnel Engine network service
+## 5. Configure The SeaTunnel Engine Network Service
All SeaTunnel Engine network-related configurations are in the `hazelcast.yaml` file.
-# 5.1 Cluster name
+### 5.1 Cluster Name
The SeaTunnel Engine node uses the `cluster-name` to determine if another node is in the same cluster as itself. If the cluster names of the two nodes are different, the SeaTunnel Engine will reject the service request.
-# 5.2 Network
+### 5.2 Network
Based on [Hazelcast](https://docs.hazelcast.com/imdg/4.1/clusters/discovery-mechanisms), a SeaTunnel Engine cluster is a network composed of cluster members running the SeaTunnel Engine server. Cluster members automatically join together to form a cluster. This automatic joining occurs through various discovery mechanisms used by cluster members to detect each other.
@@ -177,13 +177,13 @@ hazelcast:
TCP is the recommended method for use in a standalone SeaTunnel Engine cluster.
-Alternatively, Hazelcast provides several other service discovery methods. For more details, please refer to [hazelcast network](https://docs.hazelcast.com/imdg/4.1/clusters/setting-up-clusters)
+Alternatively, Hazelcast provides several other service discovery methods. For more details, please refer to [Hazelcast Network](https://docs.hazelcast.com/imdg/4.1/clusters/setting-up-clusters)
---
sidebar_position: 5
---
-# 5.3 IMap Persistence Configuration
+### 5.3 IMap Persistence Configuration
In SeaTunnel, we use IMap (a distributed Map that enables the writing and reading of data across nodes and processes. For more information, please refer to [hazelcast map](https://docs.hazelcast.com/imdg/4.2/data-structures/map)) to store the status of each task and task, allowing us to recover tasks and achieve task fault tolerance in the event of a node failure.
@@ -265,15 +265,15 @@ map:
fs.oss.credentials.provider: org.apache.hadoop.fs.aliyun.oss.AliyunCredentialsProvider
```
-# 6. Configure the SeaTunnel Engine client
+## 6. Configure The SeaTunnel Engine Client
All SeaTunnel Engine client configurations are in the `hazelcast-client.yaml`.
-# 6.1 cluster-name
+### 6.1 Cluster Name
The client must have the same `cluster-name` as the SeaTunnel Engine. Otherwise, the SeaTunnel Engine will reject the client's request.
-# 6.2 Network
+### 6.2 Network
**cluster-members**
@@ -289,7 +289,7 @@ hazelcast-client:
- hostname1:5801
```
-# 7. Start the SeaTunnel Engine server node
+## 7. Start The SeaTunnel Engine Server Node
It can be started with the `-d` parameter through the daemon.
@@ -300,10 +300,10 @@ mkdir -p $SEATUNNEL_HOME/logs
The logs will be written to `$SEATUNNEL_HOME/logs/seatunnel-engine-server.log`
-# 8. Install the SeaTunnel Engine client
+## 8. Install The SeaTunnel Engine Client
You only need to copy the `$SEATUNNEL_HOME` directory on the SeaTunnel Engine node to the client node and configure `SEATUNNEL_HOME` in the same way as the SeaTunnel Engine server node.
-# 9. Submit and manage jobs
+## 9. Submit And Manage Jobs
-Now that the cluster is deployed, you can complete the submission and management of jobs through the following tutorials: [Submit and manage jobs](user-command.md)
+Now that the cluster is deployed, you can complete the submission and management of jobs through the following tutorials: [Submit And Manage Jobs](user-command.md)
diff --git a/docs/en/seatunnel-engine/local-mode-deployment.md b/docs/en/seatunnel-engine/local-mode-deployment.md
index 08b700dd445..f4cd0bcb2c5 100644
--- a/docs/en/seatunnel-engine/local-mode-deployment.md
+++ b/docs/en/seatunnel-engine/local-mode-deployment.md
@@ -3,7 +3,7 @@
sidebar_position: 4
---
-# Run Jobs in Local Mode
+# Run Jobs In Local Mode
Only for testing.
@@ -14,9 +14,9 @@ In local mode, each task will start a separate process, and the process will exi
3. Jobs cannot be cancelled via commands, only by killing the process.
4. REST API is not supported.
-The [separated cluster mode](separated-cluster-deployment.md) of SeaTunnel Engine is recommended for use in production environments.
+The [Separated Cluster Mode](separated-cluster-deployment.md) of SeaTunnel Engine is recommended for use in production environments.
-## Deploying SeaTunnel Engine in Local Mode
+## Deploying SeaTunnel Engine In Local Mode
In local mode, there is no need to deploy a SeaTunnel Engine cluster. You only need to use the following command to submit jobs. The system will start the SeaTunnel Engine (Zeta) service in the process that submitted the job to run the submitted job, and the process will exit after the job is completed.
@@ -25,7 +25,7 @@ In this mode, you only need to copy the downloaded and created installation pack
## Submitting Jobs
```shell
-$SEATUNNEL_HOME/bin/seatunnel.sh --config $SEATUNNEL_HOME/config/v2.batch.config.template -e local
+$SEATUNNEL_HOME/bin/seatunnel.sh --config $SEATUNNEL_HOME/config/v2.batch.config.template -m local
```
## Job Operations
diff --git a/docs/en/seatunnel-engine/resource-isolation.md b/docs/en/seatunnel-engine/resource-isolation.md
index f123e809821..cd336aac940 100644
--- a/docs/en/seatunnel-engine/resource-isolation.md
+++ b/docs/en/seatunnel-engine/resource-isolation.md
@@ -5,7 +5,7 @@ sidebar_position: 9
After version 2.3.6. SeaTunnel can add `tag` to each worker node, when you submit job you can use `tag_filter` to filter the node you want run this job.
-# How to archive this:
+# How To Achieve This
1. update the config in `hazelcast.yaml`,
diff --git a/docs/en/seatunnel-engine/rest-api.md b/docs/en/seatunnel-engine/rest-api.md
index ef71814cfbf..99bba92dae0 100644
--- a/docs/en/seatunnel-engine/rest-api.md
+++ b/docs/en/seatunnel-engine/rest-api.md
@@ -3,14 +3,14 @@
sidebar_position: 11
---
-# REST API
+# RESTful API
SeaTunnel has a monitoring API that can be used to query status and statistics of running jobs, as well as recent
-completed jobs. The monitoring API is a REST-ful API that accepts HTTP requests and responds with JSON data.
+completed jobs. The monitoring API is a RESTful API that accepts HTTP requests and responds with JSON data.
## Overview
-The monitoring API is backed by a web server that runs as part of the node, each node member can provide rest api capability.
+The monitoring API is backed by a web server that runs as part of the node, and each node member can provide RESTful API capability.
By default, this server listens at port 5801, which can be configured in hazelcast.yaml like :
```yaml
@@ -70,7 +70,7 @@ network:
------------------------------------------------------------------------------------------
-### Returns an overview over all jobs and their current state.
+### Returns An Overview Of All Jobs And Their Current State
GET
/hazelcast/rest/maps/running-jobs
(Returns an overview over all jobs and their current state.)
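As a sketch of how a client might address these endpoints (the base host, the default port from the Overview section, and the helper names are assumptions, not part of SeaTunnel):

```python
# Assumed default host/port; port 5801 is configurable in hazelcast.yaml.
BASE_URL = "http://localhost:5801"


def running_jobs_url(base: str = BASE_URL) -> str:
    """URL for the running-jobs overview endpoint documented above."""
    return f"{base}/hazelcast/rest/maps/running-jobs"


def job_info_url(job_id: int, base: str = BASE_URL) -> str:
    """URL for the per-job details endpoint (/job-info/:jobId)."""
    return f"{base}/hazelcast/rest/maps/job-info/{job_id}"
```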
@@ -109,7 +109,7 @@ network:
------------------------------------------------------------------------------------------
-### Return details of a job.
+### Return Details Of A Job
GET
/hazelcast/rest/maps/job-info/:jobId
(Return details of a job. )
@@ -164,7 +164,7 @@ When we can't get the job info, the response will be:
------------------------------------------------------------------------------------------
-### Return details of a job.
+### Return Details Of A Job
This API has been deprecated, please use /hazelcast/rest/maps/job-info/:jobId instead
@@ -221,7 +221,7 @@ When we can't get the job info, the response will be:
------------------------------------------------------------------------------------------
-### Return all finished Jobs Info.
+### Return All Finished Jobs Info
GET
/hazelcast/rest/maps/finished-jobs/:state
(Return all finished Jobs Info.)
@@ -253,7 +253,7 @@ When we can't get the job info, the response will be:
------------------------------------------------------------------------------------------
-### Returns system monitoring information.
+### Returns System Monitoring Information
GET
/hazelcast/rest/maps/system-monitoring-information
(Returns system monitoring information.)
@@ -318,7 +318,7 @@ When we can't get the job info, the response will be:
------------------------------------------------------------------------------------------
-### Submit Job.
+### Submit A Job
POST
/hazelcast/rest/maps/submit-job
(Returns jobId and jobName if job submitted successfully.)
@@ -376,7 +376,7 @@ When we can't get the job info, the response will be:
------------------------------------------------------------------------------------------
-### Stop Job.
+### Stop A Job
POST
/hazelcast/rest/maps/stop-job
(Returns jobId if job stoped successfully.)
@@ -402,7 +402,7 @@ When we can't get the job info, the response will be:
------------------------------------------------------------------------------------------
-### Encrypt Config.
+### Encrypt Config
POST
/hazelcast/rest/maps/encrypt-config
(Returns the encrypted config if config is encrypted successfully.)
diff --git a/docs/en/seatunnel-engine/savepoint.md b/docs/en/seatunnel-engine/savepoint.md
index 4996c12bb52..06d4e6b6b34 100644
--- a/docs/en/seatunnel-engine/savepoint.md
+++ b/docs/en/seatunnel-engine/savepoint.md
@@ -3,11 +3,11 @@
sidebar_position: 8
---
-# savepoint and restore with savepoint
+# Savepoint And Restore With Savepoint
-savepoint is created using the checkpoint. a global mirror of job execution status, which can be used for job or seatunnel stop and recovery, upgrade, etc.
+Savepoint is created using the checkpoint. It is a global mirror of the job execution status, which can be used for job or SeaTunnel stop and recovery, upgrade, etc.
-## use savepoint
+## Use Savepoint
To use savepoint, you need to ensure that the connector used by the job supports checkpoint, otherwise data may be lost or duplicated.
@@ -18,7 +18,7 @@ To use savepoint, you need to ensure that the connector used by the job supports
After successful execution, the checkpoint data will be saved and the task will end.
-## use restore with savepoint
+## Use Restore With Savepoint
Resume from savepoint using jobId
```./bin/seatunnel.sh -c {jobConfig} -r {jobId}```
diff --git a/docs/en/seatunnel-engine/separated-cluster-deployment.md b/docs/en/seatunnel-engine/separated-cluster-deployment.md
index 5f48fd11348..714c8920a44 100644
--- a/docs/en/seatunnel-engine/separated-cluster-deployment.md
+++ b/docs/en/seatunnel-engine/separated-cluster-deployment.md
@@ -3,17 +3,17 @@
sidebar_position: 6
---
-# Deploy SeaTunnel Engine in Separated Cluster Mode
+# Deploy SeaTunnel Engine In Separated Cluster Mode
-The Master service and Worker service of SeaTunnel Engine are separated, and each service is a separate process. The Master node is only responsible for job scheduling, REST API, task submission, etc., and the Imap data is only stored on the Master node. The Worker node is only responsible for the execution of tasks and does not participate in the election to become the master nor stores Imap data.
+The Master service and Worker service of SeaTunnel Engine are separated, and each service is a separate process. The Master node is only responsible for job scheduling, RESTful API, task submission, etc., and the Imap data is only stored on the Master node. The Worker node is only responsible for the execution of tasks and does not participate in the election to become the master nor stores Imap data.
Among all the Master nodes, only one Master node works at the same time, and the other Master nodes are in the standby state. When the current Master node fails or the heartbeat times out, a new Master Active node will be elected from the other Master nodes.
-This is the most recommended usage method. In this mode, the load on the Master will be very small, and the Master has more resources for job scheduling, task fault tolerance index monitoring, and providing REST API services, etc., and will have higher stability. At the same time, the Worker node does not store Imap data. All Imap data is stored on the Master node. Even if the Worker node has a high load or crashes, it will not cause the Imap data to be redistributed.
+This is the most recommended usage method. In this mode, the load on the Master will be very low, and the Master has more resources for job scheduling, task fault tolerance index monitoring, and providing RESTful API services, etc., and will have higher stability. At the same time, the Worker node does not store Imap data. All Imap data is stored on the Master node. Even if the Worker node has a high load or crashes, it will not cause the Imap data to be redistributed.
## 1. Download
-[Download and Make SeaTunnel Installation Package](download-seatunnel.md)
+[Download And Make SeaTunnel Installation Package](download-seatunnel.md)
## 2. Configure SEATUNNEL_HOME
@@ -24,7 +24,7 @@ export SEATUNNEL_HOME=${seatunnel install path}
export PATH=$PATH:$SEATUNNEL_HOME/bin
```
-## 3. Configure JVM Options for Master Nodes
+## 3. Configure JVM Options For Master Nodes
The JVM parameters of the Master node are configured in the `$SEATUNNEL_HOME/config/jvm_master_options` file.
@@ -275,11 +275,11 @@ map:
All network-related configurations of the SeaTunnel Engine are in the `hazelcast-master.yaml` and `hazelcast-worker.yaml` files.
-### 5.1 Cluster Name
+### 5.1 cluster-name
SeaTunnel Engine nodes use the `cluster-name` to determine whether another node is in the same cluster as themselves. If the cluster names between two nodes are different, the SeaTunnel Engine will reject service requests.
-### 5.2 Network
+### 5.2 network
Based on [Hazelcast](https://docs.hazelcast.com/imdg/4.1/clusters/discovery-mechanisms), a SeaTunnel Engine cluster is a network composed of cluster members running the SeaTunnel Engine server. Cluster members automatically join together to form a cluster. This automatic joining is through the various discovery mechanisms used by cluster members to discover each other.
@@ -287,7 +287,7 @@ Please note that after the cluster is formed, the communication between cluster
The SeaTunnel Engine uses the following discovery mechanisms.
-#### TCP
+#### tcp-ip
You can configure the SeaTunnel Engine as a complete TCP/IP cluster. For configuration details, please refer to the [Discovering Members by TCP section](tcp.md).
@@ -367,7 +367,7 @@ mkdir -p $SEATUNNEL_HOME/logs
The logs will be written to `$SEATUNNEL_HOME/logs/seatunnel-engine-master.log`.
-## 7. Starting the SeaTunnel Engine Worker Node
+## 7. Starting The SeaTunnel Engine Worker Node
It can be started using the `-d` parameter through the daemon.
@@ -378,7 +378,7 @@ mkdir -p $SEATUNNEL_HOME/logs
The logs will be written to `$SEATUNNEL_HOME/logs/seatunnel-engine-worker.log`.
-## 8. Installing the SeaTunnel Engine Client
+## 8. Installing The SeaTunnel Engine Client
### 8.1 Setting the `SEATUNNEL_HOME` the same as the server
@@ -389,7 +389,7 @@ export SEATUNNEL_HOME=${seatunnel install path}
export PATH=$PATH:$SEATUNNEL_HOME/bin
```
-### 8.2 Configuring the SeaTunnel Engine Client
+### 8.2 Configuring The SeaTunnel Engine Client
All configurations of the SeaTunnel Engine client are in the `hazelcast-client.yaml`.
@@ -412,6 +412,6 @@ hazelcast-client:
- master-node-2:5801
```
-# 9 Submitting and Managing Jobs
+## 9. Submitting And Managing Jobs
-Now that the cluster has been deployed, you can complete the job submission and management through the following tutorial: [Submitting and Managing Jobs](user-command.md).
+Now that the cluster has been deployed, you can complete the job submission and management through the following tutorial: [Submitting And Managing Jobs](user-command.md).
diff --git a/docs/en/seatunnel-engine/tcp.md b/docs/en/seatunnel-engine/tcp.md
index bd9f2d1ba5d..b28907ac8f1 100644
--- a/docs/en/seatunnel-engine/tcp.md
+++ b/docs/en/seatunnel-engine/tcp.md
@@ -3,7 +3,7 @@
sidebar_position: 10
---
-# TCP NetWork
+# TCP Network
If multicast is not the preferred way of discovery for your environment, then you can configure SeaTunnel Engine to be a full TCP/IP cluster. When you configure SeaTunnel Engine to discover members by TCP/IP, you must list all or a subset of the members' host names and/or IP addresses as cluster members. You do not have to list all of these cluster members, but at least one of the listed members has to be active in the cluster when a new member joins.
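A minimal sketch of such a member list in `hazelcast.yaml`, assuming two members on the default port 5801 (the hostnames are placeholders):

```yaml
hazelcast:
  network:
    join:
      tcp-ip:
        enabled: true
        member-list:
          - hostname1:5801
          - hostname2:5801
```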
diff --git a/docs/en/seatunnel-engine/user-command.md b/docs/en/seatunnel-engine/user-command.md
index bd5c41be717..a18ec931e09 100644
--- a/docs/en/seatunnel-engine/user-command.md
+++ b/docs/en/seatunnel-engine/user-command.md
@@ -28,7 +28,7 @@ Usage: seatunnel.sh [options]
--decrypt Decrypt the config file. When both --decrypt and --encrypt are specified, only --encrypt will take effect (default: false).
-m, --master, -e, --deploy-mode SeaTunnel job submit master, support [local, cluster] (default: cluster).
--encrypt Encrypt the config file. When both --decrypt and --encrypt are specified, only --encrypt will take effect (default: false).
- --get_running_job_metrics Gets metrics for running jobs (default: false).
+ --get_running_job_metrics Get metrics for running jobs (default: false).
-h, --help Show the usage message.
-j, --job-id Get the job status by JobId.
-l, --list List the job status (default: false).
@@ -58,7 +58,7 @@ The **-n** or **--name** parameter can specify the name of the job.
sh bin/seatunnel.sh --config $SEATUNNEL_HOME/config/v2.batch.config.template --async -n myjob
```
-## Viewing the Job List
+## Viewing The Job List
```shell
sh bin/seatunnel.sh -l
@@ -66,7 +66,7 @@ sh bin/seatunnel.sh -l
This command will output the list of all jobs in the current cluster (including completed historical jobs and running jobs).
-## Viewing the Job Status
+## Viewing The Job Status
```shell
sh bin/seatunnel.sh -j <jobId>
@@ -74,7 +74,7 @@ sh bin/seatunnel.sh -j <jobId>
This command will output the status information of the specified job.
-## Getting the Monitoring Information of Running Jobs
+## Getting The Monitoring Information Of Running Jobs
```shell
sh bin/seatunnel.sh --get_running_job_metrics
diff --git a/docs/zh/other-engine/flink.md b/docs/zh/other-engine/flink.md
index a9aa7055a2e..856aeb78101 100644
--- a/docs/zh/other-engine/flink.md
+++ b/docs/zh/other-engine/flink.md
@@ -1,10 +1,10 @@
-# Seatunnel runs on Flink
+# Flink引擎方式运行SeaTunnel
-Flink是一个强大的高性能分布式流处理引擎,更多关于它的信息,你可以搜索 `Apache Flink`。
+Flink是一个强大的高性能分布式流处理引擎。你可以搜索 `Apache Flink`获取更多关于它的信息。
### 在Job中设置Flink的配置信息
-从 `flink` 开始:
+以 `flink.` 开始:
例子: 我对这个项目设置一个精确的检查点
@@ -15,10 +15,10 @@ env {
}
```
-枚举类型当前还不支持,你需要在Flink的配置文件中指定它们,暂时只有这些类型的设置受支持:
+枚举类型当前还不支持,你需要在Flink的配置文件中指定它们。暂时只有这些类型的设置受支持:
Integer/Boolean/String/Duration
-### 如何设置一个简单的Flink job
+### 如何设置一个简单的Flink Job
这是一个运行在Flink中随机生成数据打印到控制台的简单job
@@ -78,6 +78,6 @@ sink{
}
```
-### 如何在项目中运行job
+### 如何在项目中运行Job
-当你将代码拉到本地后,转到 `seatunnel-examples/seatunnel-flink-connector-v2-example` 模块,查找 `org.apache.seatunnel.example.flink.v2.SeaTunnelApiExample` 即可完成job的操作
+当你将代码拉到本地后,转到 `seatunnel-examples/seatunnel-flink-connector-v2-example` 模块,查找 `org.apache.seatunnel.example.flink.v2.SeaTunnelApiExample` 即可完成job的操作。
diff --git a/docs/zh/seatunnel-engine/about.md b/docs/zh/seatunnel-engine/about.md
index ca65cac142a..9deeec82f98 100644
--- a/docs/zh/seatunnel-engine/about.md
+++ b/docs/zh/seatunnel-engine/about.md
@@ -5,7 +5,7 @@ sidebar_position: 1
# SeaTunnel Engine 简介
-SeaTunnel Engine 是一个由社区开发的用于数据同步场景的引擎,作为 SeaTunnel 的默认引擎,它支持高吞吐量、低延迟和强一致性的数据同步作业操作,更快、更稳定、更节省资源且易于使用
+SeaTunnel Engine 是一个由社区开发的用于数据同步场景的引擎,作为 SeaTunnel 的默认引擎,它支持高吞吐量、低延迟和强一致性的数据同步作业操作,更快、更稳定、更节省资源且易于使用。
SeaTunnel Engine 的整体设计遵循以下路径:
@@ -20,7 +20,7 @@ SeaTunnel Engine 的整体设计遵循以下路径:
- 支持独立运行;
- 支持集群运行;
-- 支持自治集群(去中心化),使用户无需为 SeaTunnel Engine 集群指定主节点,因为它可以在运行过程中自行选择主节点,并且在主节点失败时自动选择新的主节点。
+- 支持自治集群(去中心化),使用户无需为 SeaTunnel Engine 集群指定主节点,因为它可以在运行过程中自行选择主节点,并且在主节点失败时自动选择新的主节点;
- 自治集群节点发现和具有相同 cluster_name 的节点将自动形成集群。
### 核心功能
diff --git a/docs/zh/seatunnel-engine/checkpoint-storage.md b/docs/zh/seatunnel-engine/checkpoint-storage.md
index ac4ac268eb3..f0c506fdbf8 100644
--- a/docs/zh/seatunnel-engine/checkpoint-storage.md
+++ b/docs/zh/seatunnel-engine/checkpoint-storage.md
@@ -14,11 +14,11 @@ sidebar_position: 7
SeaTunnel Engine支持以下检查点存储类型:
- HDFS (OSS,S3,HDFS,LocalFile)
-- LocalFile (本地),(已弃用: 使用Hdfs(LocalFile)替代).
+- LocalFile (本地),(已弃用: 使用HDFS(LocalFile)替代).
我们使用微内核设计模式将检查点存储模块从引擎中分离出来。这允许用户实现他们自己的检查点存储模块。
-`checkpoint-storage-api`是检查点存储模块API,它定义了检查点存储模块的接口。
+`checkpoint-storage-api`是检查点存储模块的API,它定义了检查点存储模块的接口。
如果你想实现你自己的检查点存储模块,你需要实现`CheckpointStorage`并提供相应的`CheckpointStorageFactory`实现。
@@ -44,9 +44,9 @@ seatunnel:
#### OSS
-阿里云oss是基于hdfs-file,所以你可以参考[hadoop oss文档](https://hadoop.apache.org/docs/stable/hadoop-aliyun/tools/hadoop-aliyun/index.html)来配置oss.
+阿里云OSS是基于hdfs-file,所以你可以参考[Hadoop OSS文档](https://hadoop.apache.org/docs/stable/hadoop-aliyun/tools/hadoop-aliyun/index.html)来配置OSS。
-除了与oss buckets交互外,oss客户端需要与buckets交互所需的凭据。
+除了与OSS buckets交互外,OSS客户端需要与buckets交互所需的凭据。
客户端支持多种身份验证机制,并且可以配置使用哪种机制及其使用顺序。也可以使用of org.apache.hadoop.fs.aliyun.oss.AliyunCredentialsProvider的自定义实现。
如果您使用AliyunCredentialsProvider(可以从阿里云访问密钥管理中获得),它们包括一个access key和一个secret key。
你可以这样配置:
@@ -71,11 +71,11 @@ seatunnel:
有关Hadoop Credential Provider API的更多信息,请参见: [Credential Provider API](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html).
-阿里云oss凭证提供程序实现见: [验证凭证提供](https://github.com/aliyun/aliyun-oss-java-sdk/tree/master/src/main/java/com/aliyun/oss/common/auth)
+阿里云OSS凭证提供程序实现见: [验证凭证提供](https://github.com/aliyun/aliyun-oss-java-sdk/tree/master/src/main/java/com/aliyun/oss/common/auth)
#### S3
-S3基于hdfs-file,所以你可以参考[hadoop s3文档](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html)来配置s3。
+S3基于hdfs-file,所以你可以参考[Hadoop S3文档](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html)来配置S3。
除了与公共S3 buckets交互之外,S3A客户端需要与buckets交互所需的凭据。
客户端支持多种身份验证机制,并且可以配置使用哪种机制及其使用顺序。也可以使用com.amazonaws.auth.AWSCredentialsProvider的自定义实现。
diff --git a/docs/zh/seatunnel-engine/download-seatunnel.md b/docs/zh/seatunnel-engine/download-seatunnel.md
index 8c228b0d71a..c108f4812a3 100644
--- a/docs/zh/seatunnel-engine/download-seatunnel.md
+++ b/docs/zh/seatunnel-engine/download-seatunnel.md
@@ -16,7 +16,7 @@ import TabItem from '@theme/TabItem';
## 步骤 2: 下载 SeaTunnel
-进入[seatunnel下载页面](https://seatunnel.apache.org/download)下载最新版本的发布版安装包`seatunnel--bin.tar.gz`
+进入[SeaTunnel下载页面](https://seatunnel.apache.org/download)下载最新版本的发布版安装包`seatunnel--bin.tar.gz`
或者您也可以通过终端下载
diff --git a/docs/zh/seatunnel-engine/hybrid-cluster-deployment.md b/docs/zh/seatunnel-engine/hybrid-cluster-deployment.md
index 4fa3ed31121..efa96da0305 100644
--- a/docs/zh/seatunnel-engine/hybrid-cluster-deployment.md
+++ b/docs/zh/seatunnel-engine/hybrid-cluster-deployment.md
@@ -109,7 +109,7 @@ seatunnel:
如果集群的节点大于1,检查点存储必须是一个分布式存储,或者共享存储,这样才能保证任意节点挂掉后依然可以在另一个节点加载到存储中的任务状态信息。
-有关检查点存储的信息,您可以查看 [checkpoint storage](checkpoint-storage.md)
+有关检查点存储的信息,您可以查看 [Checkpoint Storage](checkpoint-storage.md)
### 4.4 历史作业过期配置
@@ -155,7 +155,7 @@ SeaTunnel Engine 使用以下发现机制。
#### TCP
-您可以将 SeaTunnel Engine 配置为完整的 TCP/IP 集群。有关配置详细信息,请参阅 [Discovering Members by TCP section](tcp.md)。
+您可以将 SeaTunnel Engine 配置为完整的 TCP/IP 集群。有关配置详细信息,请参阅 [Discovering Members By TCP Section](tcp.md)。
一个示例如下 `hazelcast.yaml`
@@ -177,7 +177,7 @@ hazelcast:
TCP 是我们建议在独立 SeaTunnel Engine 集群中使用的方式。
-另一方面,Hazelcast 提供了一些其他的服务发现方法。有关详细信息,请参阅 [hazelcast network](https://docs.hazelcast.com/imdg/4.1/clusters/setting-up-clusters)
+另一方面,Hazelcast 提供了一些其他的服务发现方法。有关详细信息,请参阅 [Hazelcast Network](https://docs.hazelcast.com/imdg/4.1/clusters/setting-up-clusters)
### 5.3 IMap持久化配置
@@ -187,7 +187,7 @@ TCP 是我们建议在独立 SeaTunnel Engine 集群中使用的方式。
为了解决这个问题,我们可以将Imap中的数据持久化到外部存储中,如HDFS、OSS等。这样即使所有节点都被停止,Imap中的数据也不会丢失,当集群节点再次启动后,所有之前正在运行的任务都会被自动恢复。
-下面介绍如何使用 MapStore 持久化配置。有关详细信息,请参阅 [hazelcast map](https://docs.hazelcast.com/imdg/4.2/data-structures/map)
+下面介绍如何使用 MapStore 持久化配置。有关详细信息,请参阅 [Hazelcast Map](https://docs.hazelcast.com/imdg/4.2/data-structures/map)
**type**
@@ -300,6 +300,6 @@ mkdir -p $SEATUNNEL_HOME/logs
您只需将 SeaTunnel Engine 节点上的 `$SEATUNNEL_HOME` 目录复制到客户端节点,并像 SeaTunnel Engine 服务器节点一样配置 `SEATUNNEL_HOME`。
-# 9 提交作业和管理作业
+## 9. 提交作业和管理作业
现在集群部署完成了,您可以通过以下教程完成作业的提交和管理:[提交和管理作业](user-command.md)
diff --git a/docs/zh/seatunnel-engine/local-mode-deployment.md b/docs/zh/seatunnel-engine/local-mode-deployment.md
index a1e2cf5ec12..0230cfcca1a 100644
--- a/docs/zh/seatunnel-engine/local-mode-deployment.md
+++ b/docs/zh/seatunnel-engine/local-mode-deployment.md
@@ -12,7 +12,7 @@ Local模式下每个任务都会启动一个独立的进程,任务运行完成
1. 不支持任务的暂停、恢复。
2. 不支持获取任务列表查看。
3. 不支持通过命令取消作业,只能通过Kill进程的方式终止任务。
-4. 不支持rest api。
+4. 不支持RESTful API。
最推荐在生产环境中使用SeaTunnel Engine的[分离集群模式](separated-cluster-deployment.md)
@@ -20,7 +20,7 @@ Local模式下每个任务都会启动一个独立的进程,任务运行完成
本地模式下,不需要部署SeaTunnel Engine集群,只需要使用如下命令即可提交作业即可。系统会在提交提交作业的进程中启动SeaTunnel Engine(Zeta)服务来运行提交的作业,作业完成后进程退出。
-该模式下只需要将下载和制作好的安装包拷贝到需要运行的服务器上即可,如果需要调整作业运行的jvm参数,可以修改$SEATUNNEL_HOME/config/jvm_client_options文件。
+该模式下只需要将下载和制作好的安装包拷贝到需要运行的服务器上即可,如果需要调整作业运行的JVM参数,可以修改$SEATUNNEL_HOME/config/jvm_client_options文件。
## 提交作业
diff --git a/docs/zh/seatunnel-engine/rest-api.md b/docs/zh/seatunnel-engine/rest-api.md
index baa38f4cd98..1b0166425ba 100644
--- a/docs/zh/seatunnel-engine/rest-api.md
+++ b/docs/zh/seatunnel-engine/rest-api.md
@@ -3,9 +3,9 @@
sidebar_position: 11
---
-# REST API
+# RESTful API
-SeaTunnel有一个用于监控的API,可用于查询运行作业的状态和统计信息,以及最近完成的作业。监控API是REST-ful风格的,它接受HTTP请求并使用JSON数据格式进行响应。
+SeaTunnel有一个用于监控的API,可用于查询运行作业的状态和统计信息,以及最近完成的作业。监控API是RESTful风格的,它接受HTTP请求并使用JSON数据格式进行响应。
## 概述
diff --git a/docs/zh/seatunnel-engine/separated-cluster-deployment.md b/docs/zh/seatunnel-engine/separated-cluster-deployment.md
index f6c014c8579..76476777374 100644
--- a/docs/zh/seatunnel-engine/separated-cluster-deployment.md
+++ b/docs/zh/seatunnel-engine/separated-cluster-deployment.md
@@ -5,7 +5,7 @@ sidebar_position: 6
# 部署 SeaTunnel Engine 分离模式集群
-SeaTunnel Engine 的Master服务和Worker服务分离,每个服务单独一个进程。Master节点只负责作业调度,rest api,任务提交等,Imap数据只存储在Master节点中。Worker节点只负责任务的执行,不参与选举成为master,也不存储Imap数据。
+SeaTunnel Engine 的Master服务和Worker服务分离,每个服务单独一个进程。Master节点只负责作业调度,RESTful API,任务提交等,Imap数据只存储在Master节点中。Worker节点只负责任务的执行,不参与选举成为master,也不存储Imap数据。
在所有Master节点中,同一时间只有一个Master节点工作,其他Master节点处于standby状态。当当前Master节点宕机或心跳超时,会从其它Master节点中选举出一个新的Master Active节点。
@@ -159,7 +159,7 @@ seatunnel:
:::
-有关检查点存储的信息,您可以查看 [checkpoint storage](checkpoint-storage.md)
+有关检查点存储的信息,您可以查看 [Checkpoint Storage](checkpoint-storage.md)
### 4.4 历史作业过期配置
@@ -195,13 +195,13 @@ seatunnel:
:::
-在SeaTunnel中,我们使用IMap(一种分布式的Map,可以实现数据跨节点跨进程的写入的读取 有关详细信息,请参阅 [hazelcast map](https://docs.hazelcast.com/imdg/4.2/data-structures/map)) 来存储每个任务及其task的状态,以便在任务所在节点宕机后,可以在其他节点上获取到任务之前的状态信息,从而恢复任务实现任务的容错。
+在SeaTunnel中,我们使用IMap(一种分布式的Map,可以实现数据跨节点跨进程的写入的读取 有关详细信息,请参阅 [Hazelcast Map](https://docs.hazelcast.com/imdg/4.2/data-structures/map)) 来存储每个任务及其task的状态,以便在任务所在节点宕机后,可以在其他节点上获取到任务之前的状态信息,从而恢复任务实现任务的容错。
默认情况下Imap的信息只是存储在内存中,我们可以设置Imap数据的复本数,具体可参考(4.1 Imap中数据的备份数设置),如果复本数是2,代表每个数据会同时存储在2个不同的节点中。一旦节点宕机,Imap中的数据会重新在其它节点上自动补充到设置的复本数。但是当所有节点都被停止后,Imap中的数据会丢失。当集群节点再次启动后,所有之前正在运行的任务都会被标记为失败,需要用户手工通过seatunnel.sh -r 指令恢复运行。
为了解决这个问题,我们可以将Imap中的数据持久化到外部存储中,如HDFS、OSS等。这样即使所有节点都被停止,Imap中的数据也不会丢失,当集群节点再次启动后,所有之前正在运行的任务都会被自动恢复。
-下面介绍如何使用 MapStore 持久化配置。有关详细信息,请参阅 [hazelcast map](https://docs.hazelcast.com/imdg/4.2/data-structures/map)
+下面介绍如何使用 MapStore 持久化配置。有关详细信息,请参阅 [Hazelcast Map](https://docs.hazelcast.com/imdg/4.2/data-structures/map)
**type**
@@ -360,7 +360,7 @@ hazelcast:
TCP 是我们建议在独立 SeaTunnel Engine 集群中使用的方式。
-另一方面,Hazelcast 提供了一些其他的服务发现方法。有关详细信息,请参阅 [hazelcast network](https://docs.hazelcast.com/imdg/4.1/clusters/setting-up-clusters)
+另一方面,Hazelcast 提供了一些其他的服务发现方法。有关详细信息,请参阅 [Hazelcast Network](https://docs.hazelcast.com/imdg/4.1/clusters/setting-up-clusters)
## 6. 启动 SeaTunnel Engine Master 节点
@@ -418,6 +418,6 @@ hazelcast-client:
- master-node-2:5801
```
-# 9 提交作业和管理作业
+## 9. 提交作业和管理作业
现在集群部署完成了,您可以通过以下教程完成作业的提交和管理:[提交和管理作业](user-command.md)