[SPARK-33877][SQL] SQL reference documents for INSERT w/ a column list #30888

yaooqinn · 2020-12-22T11:35:27Z

Spark v3.1.0 supports a column list in INSERT (see SPARK-32976, #29893). This PR documents that feature in the SQL reference documents.

What changes were proposed in this pull request?

improve doc

Why are the changes needed?

Does this PR introduce any user-facing change?

doc

How was this patch tested?

passing GA doc gen.

yaooqinn · 2020-12-22T11:39:50Z

cc @maropu @HyukjinKwon @gatorsmile @cloud-fan. Thanks for reviewing, and thanks @maropu for the kind reminder via the JIRA ticket.

SparkQA · 2020-12-22T11:54:21Z

Test build #133215 has finished for PR 30888 at commit bb7f271.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-12-22T12:31:19Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37813/

maropu

Thanks for updating it, @yaooqinn !

maropu · 2020-12-22T12:17:51Z

docs/sql-ref-syntax-dml-insert-into.md

@@ -26,7 +26,7 @@ The `INSERT INTO` statement inserts new rows into a table. The inserted rows can
 ### Syntax

 ```sql
-INSERT INTO [ TABLE ] table_identifier [ partition_spec ]
+INSERT INTO [ TABLE ] table_identifier [ partition_spec ] [ column_list ]


[ column_list ] -> [ ( column_list ) ]

maropu · 2020-12-22T12:20:45Z

docs/sql-ref-syntax-dml-insert-into.md

@@ -45,6 +45,13 @@ INSERT INTO [ TABLE ] table_identifier [ partition_spec ]

    **Syntax:** `PARTITION ( partition_col_name  = partition_col_val [ , ... ] )`

+* **column_list**
+
+  An optional parameter that specifies a comma separated list of columns belong to the `table_identifier`.


nit:

comma separated -> comma-separated

belong -> belonging

the table_identifier -> the table_identifier table

maropu · 2020-12-22T12:40:03Z

docs/sql-ref-syntax-dml-insert-into.md

+  An optional parameter that specifies a comma separated list of columns belong to the `table_identifier`.
+  All specified columns should exist in the `table_identifier` and not be duplicated from each other. It includes all columns except the static partition columns.
+  The size of the column list should be exactly the size of the data from `VALUES` clause or query.
+  The order of the column list is alterable and determines how the data from `VALUES` clause or query to be inserted by position.


How about organizing it like this (it seems some statements above describe the current limitations):

* **column_list**

    <param description>

    **Note** The current behaviour has some limitations:
    1. The column list should contain all the column names in the `table_identifier` table.
    2. ...

(ref: https://github.com/apache/spark/blame/master/docs/sql-ref-syntax-qry-select-having.md#L43-L48)
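To make the rules in the quoted diff concrete, here is a minimal plain-Python sketch of the checks they describe. It is not Spark's implementation: the function name `validate_column_list` and its parameters are hypothetical, and it compares column names case-sensitively for simplicity, which is an assumption.

```python
def validate_column_list(table_columns, static_partition_columns, column_list, row):
    """Check a user-specified INSERT column list against the rules quoted above.

    Hypothetical helper for illustration only; not part of Spark.
    """
    # Rule: every specified column must exist in the table.
    unknown = [c for c in column_list if c not in table_columns]
    if unknown:
        raise ValueError(f"columns not found in table: {unknown}")

    # Rule: columns must not be duplicated.
    if len(set(column_list)) != len(column_list):
        raise ValueError("column list contains duplicates")

    # Rule: the list covers all columns except the static partition columns.
    expected = [c for c in table_columns if c not in static_partition_columns]
    if set(column_list) != set(expected):
        raise ValueError(f"column list must cover exactly: {expected}")

    # Rule: the data must have as many values as the column list has columns.
    if len(row) != len(column_list):
        raise ValueError("size of data does not match size of column list")


# Mirrors the students example from this PR: student_id is a static partition column.
validate_column_list(
    table_columns=["name", "address", "student_id"],
    static_partition_columns={"student_id"},
    column_list=["address", "name"],
    row=("Hangzhou, China", "Kent Yao Jr."),
)
```

The call at the end passes silently because the list names each non-partition column once and the row has two values.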

maropu · 2020-12-22T12:40:41Z

docs/sql-ref-syntax-dml-insert-into.md

+------------+----------------------+----------+
+|Kent Yao Jr.|Hangzhou, China       |  11215017|
+------------+----------------------+----------+
+


nit: remove this blank.

SparkQA · 2020-12-22T13:01:41Z

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37813/

yaooqinn · 2020-12-22T13:09:59Z

Comments addressed, thank you @maropu.

SparkQA · 2020-12-22T13:44:59Z

Test build #133218 has finished for PR 30888 at commit 5d22049.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-12-22T14:37:43Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37816/

SparkQA · 2020-12-22T15:11:24Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37816/

dongjoon-hyun · 2020-12-22T23:39:31Z

docs/sql-ref-syntax-dml-insert-into.md

+
+```sql
+INSERT INTO students PARTITION (student_id = 11215017) (address, name) VALUES
+    ('Hangzhou, China', 'Kent Yao Jr.');


😄 Kent Yao Jr.

maropu

cc: @cloud-fan @HyukjinKwon

docs/sql-ref-syntax-dml-insert-overwrite-table.md

maropu · 2020-12-22T23:48:34Z

For reviews, could you put the screenshot of the updated doc in the PR description?

We support a column list of INSERT for Spark v3.1.0 (See: SPARK-32976 (#29893)). So, this PR targets at documenting it in the SQL documents.

### What changes were proposed in this pull request?

improve doc

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

doc

### How was this patch tested?

passing GA doc gen.

![image](https://user-images.githubusercontent.com/8326978/102954876-8994fa00-450f-11eb-81f9-931af6d1f69b.png)
![image](https://user-images.githubusercontent.com/8326978/102954900-99acd980-450f-11eb-9733-115ad37d2319.png)
![image](https://user-images.githubusercontent.com/8326978/102954935-af220380-450f-11eb-9aaa-fdae0725d41e.png)
![image](https://user-images.githubusercontent.com/8326978/102954949-bc3ef280-450f-11eb-8a0d-d7b688efa7bb.png)

Closes #30888 from yaooqinn/SPARK-33877.

Authored-by: Kent Yao <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit a3dd8da)
Signed-off-by: Dongjoon Hyun <[email protected]>

dongjoon-hyun · 2020-12-23T03:47:05Z

Merged to master/3.1. Thank you, @yaooqinn and @maropu !

SparkQA · 2020-12-23T03:51:44Z

Test build #133258 has finished for PR 30888 at commit 8249fc5.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-12-23T04:22:42Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37856/

SparkQA · 2020-12-23T04:48:22Z

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37856/

cloud-fan · 2020-12-23T05:19:38Z

docs/sql-ref-syntax-dml-insert-into.md

+    **Note:**The current behaviour has some limitations:
+    - All specified columns should exist in the table and not be duplicated from each other. It includes all columns except the static partition columns.
+    - The size of the column list should be exactly the size of the data from `VALUES` clause or query.
+    - The order of the column list is alterable and determines how the data from `VALUES` clause or query to be inserted by position.


Is it a limitation?

Yeah, it sounds more like a description of its behaviour than a limitation.

How about removing "The current behavior has some limitations:"?

Can we move the last point to the description?

An optional parameter that specifies .... Spark will reorder the columns of the input query to match the table schema according to the specified column list.

I made #30909
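The reordering mentioned in the suggested wording above can be illustrated with a small plain-Python sketch. It only models the idea of mapping input values onto the table schema by column name; the function `reorder_to_schema` and its parameters are hypothetical and do not correspond to any Spark API.

```python
def reorder_to_schema(table_columns, column_list, row, static_partitions=None):
    """Arrange one input row in table-schema order.

    Hypothetical helper for illustration only; not part of Spark.
    Values in `row` are matched to `column_list` by position, then
    emitted in the order of `table_columns`.
    """
    # Pair each input value with the column named at the same position.
    by_name = dict(zip(column_list, row))
    # Static partition values come from the PARTITION clause, not the row.
    by_name.update(static_partitions or {})
    # Emit values in the table's own column order.
    return tuple(by_name[c] for c in table_columns)


# Mirrors the INSERT example from this PR:
#   INSERT INTO students PARTITION (student_id = 11215017) (address, name) VALUES
#       ('Hangzhou, China', 'Kent Yao Jr.');
stored = reorder_to_schema(
    table_columns=["name", "address", "student_id"],
    column_list=["address", "name"],
    row=("Hangzhou, China", "Kent Yao Jr."),
    static_partitions={"student_id": 11215017},
)
print(stored)  # ('Kent Yao Jr.', 'Hangzhou, China', 11215017)
```

The result matches the row shown in the example output table earlier in this review, where the name column comes first even though the statement listed `address` first.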

…column list

### What changes were proposed in this pull request?

followup of a3dd8da via suggestion #30888 (comment)

### Why are the changes needed?

doc improvement

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

passing GA doc

Closes #30909 from yaooqinn/SPARK-33877-F.

Authored-by: Kent Yao <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>

…column list

### What changes were proposed in this pull request?

followup of a3dd8da via suggestion #30888 (comment)

### Why are the changes needed?

doc improvement

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

passing GA doc

Closes #30909 from yaooqinn/SPARK-33877-F.

Authored-by: Kent Yao <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 368a2c3)
Signed-off-by: Dongjoon Hyun <[email protected]>


Commit bb7f271: [SPARK-33877][SQL] SQL reference documents for INSERT w/ a column list

github-actions bot added the DOCS label Dec 22, 2020

maropu reviewed Dec 22, 2020

Commit 5d22049: [SPARK-33877][SQL] SQL reference documents for INSERT w/ a column list

dongjoon-hyun reviewed Dec 22, 2020

maropu approved these changes Dec 22, 2020

dongjoon-hyun reviewed Dec 22, 2020 (docs/sql-ref-syntax-dml-insert-overwrite-table.md, outdated, resolved)

dongjoon-hyun reviewed Dec 22, 2020 (docs/sql-ref-syntax-dml-insert-overwrite-table.md, outdated, resolved)

Commit 8249fc5: [SPARK-33877][SQL] SQL reference documents for INSERT w/ a column list

dongjoon-hyun approved these changes Dec 23, 2020

dongjoon-hyun closed this in a3dd8da Dec 23, 2020

cloud-fan reviewed Dec 23, 2020

yaooqinn mentioned this pull request Dec 23, 2020: [SPARK-33877][SQL][FOLLOWUP] SQL reference documents for INSERT w/ a column list #30909 (Closed)
