From 574bc767c892aee086bd4be107535715d7d215af Mon Sep 17 00:00:00 2001
From: Frank945946 <108602632+Frank945946@users.noreply.github.com>
Date: Wed, 27 Dec 2023 09:57:28 +0800
Subject: [PATCH 1/2] This is an automated cherry-pick of #15859

Signed-off-by: ti-chi-bot
---
 external-storage-uri.md                        |  8 +-------
 sql-statements/sql-statement-import-into.md    | 18 ++++++++----------
 tidb-lightning/tidb-lightning-overview.md      |  1 -
 .../tidb-lightning-physical-import-mode.md     |  2 +-
 4 files changed, 10 insertions(+), 19 deletions(-)

diff --git a/external-storage-uri.md b/external-storage-uri.md
index b9ddf28d1c007..80c723ded2d05 100644
--- a/external-storage-uri.md
+++ b/external-storage-uri.md
@@ -79,14 +79,8 @@ gcs://external/test.csv?credentials-file=${credentials-file-path}
 - `encryption-scope`: Specifies the [encryption scope](https://learn.microsoft.com/en-us/azure/storage/blobs/encryption-scope-manage?tabs=powershell#upload-a-blob-with-an-encryption-scope) for server-side encryption.
 - `encryption-key`: Specifies the [encryption key](https://learn.microsoft.com/en-us/azure/storage/blobs/encryption-customer-provided-keys) for server-side encryption, which uses the AES256 encryption algorithm.
 
-The following is an example of an Azure Blob Storage URI for TiDB Lightning and BR. In this example, you need to specify a specific file path `testfolder`.
+The following is an example of an Azure Blob Storage URI for BR. In this example, you need to specify a specific file path `testfolder`.
 
 ```shell
 azure://external/testfolder?account-name=${account-name}&account-key=${account-key}
 ```
-
-The following is an example of an Azure Blob Storage URI for [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md). In this example, you need to specify a specific filename `test.csv`.
-
-```shell
-azure://external/test.csv?account-name=${account-name}&account-key=${account-key}
-```
\ No newline at end of file
diff --git a/sql-statements/sql-statement-import-into.md b/sql-statements/sql-statement-import-into.md
index 225a8a9fc3f10..28d60238c04a1 100644
--- a/sql-statements/sql-statement-import-into.md
+++ b/sql-statements/sql-statement-import-into.md
@@ -19,7 +19,11 @@ This TiDB statement is not applicable to TiDB Cloud.
 
 `IMPORT INTO` supports importing data from files stored in Amazon S3, GCS, and the TiDB local storage.
 
+<<<<<<< HEAD
 - For data files stored in Amazon S3, GCS, or Azure Blob Storage, `IMPORT INTO` supports running in the [TiDB backend task distributed execution framework](/tidb-distributed-execution-framework.md).
+=======
+- For data files stored in Amazon S3 or GCS, `IMPORT INTO` supports running in the [TiDB Distributed eXecution Framework (DXF)](/tidb-distributed-execution-framework.md).
+>>>>>>> 7752b8ad60 (Removed unsupported Azure Blob storage via import into (#15859))
     - When this framework is enabled ([tidb_enable_dist_task](/system-variables.md#tidb_enable_dist_task-new-in-v710) is `ON`), `IMPORT INTO` splits a data import job into multiple sub-jobs and distributes these sub-jobs to different TiDB nodes for execution to improve the import efficiency.
     - When this framework is disabled, `IMPORT INTO` only supports running on the TiDB node where the current user is connected.
 
@@ -94,9 +98,9 @@ In the left side of the `SET` expression, you can only reference a column name t
 
 ### fileLocation
 
-It specifies the storage location of the data file, which can be an Amazon S3, GCS, or Azure Blob Storage URI path, or a TiDB local file path.
+It specifies the storage location of the data file, which can be an Amazon S3 or GCS URI path, or a TiDB local file path.
 
-- Amazon S3, GCS, or Azure Blob Storage URI path: for URI configuration details, see [URI Formats of External Storage Services](/external-storage-uri.md).
+- Amazon S3 or GCS URI path: for URI configuration details, see [URI Formats of External Storage Services](/external-storage-uri.md).
 - TiDB local file path: it must be an absolute path, and the file extension must be `.csv`, `.sql`, or `.parquet`. Make sure that the files corresponding to this path are stored on the TiDB node connected by the current user, and the user has the `FILE` privilege.
 
 > **Note:**
@@ -240,7 +244,7 @@ Assume that there are three files named `file-01.csv`, `file-02.csv`, and `file-
 IMPORT INTO t FROM '/path/to/file-*.csv'
 ```
 
-### Import data files from Amazon S3, GCS, or Azure Blob Storage
+### Import data files from Amazon S3 or GCS
 
 - Import data files from Amazon S3:
 
@@ -254,13 +258,7 @@ IMPORT INTO t FROM '/path/to/file-*.csv'
     IMPORT INTO t FROM 'gs://import/test.csv?credentials-file=${credentials-file-path}';
     ```
 
-- Import data files from Azure Blob Storage:
-
-    ```sql
-    IMPORT INTO t FROM 'azure://import/test.csv?credentials-file=${credentials-file-path}';
-    ```
-
-For details about the URI path configuration for Amazon S3, GCS, or Azure Blob Storage, see [URI Formats of External Storage Services](/external-storage-uri.md).
+For details about the URI path configuration for Amazon S3 or GCS, see [URI Formats of External Storage Services](/external-storage-uri.md).
 
 ### Calculate column values using SetClause
 
diff --git a/tidb-lightning/tidb-lightning-overview.md b/tidb-lightning/tidb-lightning-overview.md
index 3bb468287ef08..6e5e20f7f6356 100644
--- a/tidb-lightning/tidb-lightning-overview.md
+++ b/tidb-lightning/tidb-lightning-overview.md
@@ -18,7 +18,6 @@ TiDB Lightning can read data from the following sources:
 - Local
 - [Amazon S3](/external-storage-uri.md#amazon-s3-uri-format)
 - [Google Cloud Storage](/external-storage-uri.md#gcs-uri-format)
-- [Azure Blob Storage](/external-storage-uri.md#azure-blob-storage-uri-format)
 
 ## TiDB Lightning architecture
 
diff --git a/tidb-lightning/tidb-lightning-physical-import-mode.md b/tidb-lightning/tidb-lightning-physical-import-mode.md
index 1cc3ad215d66c..dd41af8479247 100644
--- a/tidb-lightning/tidb-lightning-physical-import-mode.md
+++ b/tidb-lightning/tidb-lightning-physical-import-mode.md
@@ -78,7 +78,7 @@ It is recommended that you allocate CPU more than 32 cores and memory greater th
 - Do not use multiple TiDB Lightning instances to import data to the same TiDB cluster by default. Use [Parallel Import](/tidb-lightning/tidb-lightning-distributed-import.md) instead.
 - When you use multiple TiDB Lightning to import data to the same target cluster, do not mix the import modes. That is, do not use the physical import mode and the logical import mode at the same time.
 - During the process of importing data, do not perform DDL and DML operations in the target table. Otherwise the import will fail or the data will be inconsistent. At the same time, it is not recommended to perform read operations, because the data you read might be inconsistent. You can perform read and write operations after the import operation is completed.
-- A single Lightning process can import a single table of 10 TB at most. Parallel import can use 10 Lightning instances at most.
+- A single Lightning process can import a single table of 10 TiB at most. Parallel import can use 10 Lightning instances at most.
 
 ### Tips for using with other components
 
From 0465c730ccf4b9ccf66cda6fffd4fca17ca7fae9 Mon Sep 17 00:00:00 2001
From: xixirangrang
Date: Wed, 27 Dec 2023 10:19:07 +0800
Subject: [PATCH 2/2] Update sql-statement-import-into.md

---
 sql-statements/sql-statement-import-into.md | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/sql-statements/sql-statement-import-into.md b/sql-statements/sql-statement-import-into.md
index 28d60238c04a1..5469dc9704b8e 100644
--- a/sql-statements/sql-statement-import-into.md
+++ b/sql-statements/sql-statement-import-into.md
@@ -19,11 +19,7 @@ This TiDB statement is not applicable to TiDB Cloud.
 
 `IMPORT INTO` supports importing data from files stored in Amazon S3, GCS, and the TiDB local storage.
 
-<<<<<<< HEAD
-- For data files stored in Amazon S3, GCS, or Azure Blob Storage, `IMPORT INTO` supports running in the [TiDB backend task distributed execution framework](/tidb-distributed-execution-framework.md).
-=======
 - For data files stored in Amazon S3 or GCS, `IMPORT INTO` supports running in the [TiDB Distributed eXecution Framework (DXF)](/tidb-distributed-execution-framework.md).
->>>>>>> 7752b8ad60 (Removed unsupported Azure Blob storage via import into (#15859))
     - When this framework is enabled ([tidb_enable_dist_task](/system-variables.md#tidb_enable_dist_task-new-in-v710) is `ON`), `IMPORT INTO` splits a data import job into multiple sub-jobs and distributes these sub-jobs to different TiDB nodes for execution to improve the import efficiency.
     - When this framework is disabled, `IMPORT INTO` only supports running on the TiDB node where the current user is connected.
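
Reviewer note: after both patches are applied, the documented `fileLocation` rule for `IMPORT INTO` comes down to "Amazon S3 or GCS URI, or an absolute TiDB local path ending in `.csv`, `.sql`, or `.parquet`". The sketch below restates that rule as a small check. It is an illustration only, not TiDB's validation code. The function name `is_documented_import_source` is hypothetical, and the scheme names `s3` and `gcs` are assumptions; only `gs` appears in the examples touched by this series.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumption: URI schemes for the two external stores the patched docs still list.
SUPPORTED_SCHEMES = {"s3", "gs", "gcs"}
# Local file extensions named in the fileLocation section of the patched doc.
LOCAL_EXTENSIONS = {".csv", ".sql", ".parquet"}


def is_documented_import_source(file_location: str) -> bool:
    """Return True if file_location matches what the patched docs describe.

    Hypothetical helper for illustration; TiDB performs its own checks.
    """
    parts = urlsplit(file_location)
    if parts.scheme:
        # External storage: only the scheme is inspected in this sketch.
        return parts.scheme.lower() in SUPPORTED_SCHEMES
    # No scheme: treat as a TiDB local path, which must be absolute
    # and carry one of the documented extensions.
    path = PurePosixPath(file_location)
    return path.is_absolute() and path.suffix.lower() in LOCAL_EXTENSIONS
```

With this check, the `gs://import/test.csv?...` and `/path/to/file-*.csv` examples kept by the patch pass, while the removed `azure://import/test.csv?...` example does not.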