[SPARK-33877][SQL] SQL reference documents for INSERT w/ a column list #30888
cc @maropu @HyukjinKwon @gatorsmile @cloud-fan. Thanks for reviewing, and thanks @maropu for the kind reminder via the JIRA ticket.
Test build #133215 has finished for PR 30888 at commit
Kubernetes integration test starting
Thanks for updating it, @yaooqinn !
@@ -26,7 +26,7 @@ The `INSERT INTO` statement inserts new rows into a table. The inserted rows can
### Syntax

```sql
-INSERT INTO [ TABLE ] table_identifier [ partition_spec ]
+INSERT INTO [ TABLE ] table_identifier [ partition_spec ] [ column_list ]
```
`[ column_list ]` -> `[ ( column_list ) ]`
@@ -45,6 +45,13 @@ INSERT INTO [ TABLE ] table_identifier [ partition_spec ]

**Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )`

* **column_list**

An optional parameter that specifies a comma separated list of columns belong to the `table_identifier`.
nit:
- `comma separated` -> `comma-separated`
- `belong` -> `belonging`
- `the table_identifier` -> `the table_identifier table`
An optional parameter that specifies a comma separated list of columns belong to the `table_identifier`.
All specified columns should exist in the `table_identifier` and not be duplicated from each other. It includes all columns except the static partition columns.
The size of the column list should be exactly the size of the data from `VALUES` clause or query.
The order of the column list is alterable and determines how the data from `VALUES` clause or query to be inserted by position.
How about organizing it like this (it seems some statements above describe the current limitations):
* **column_list**
<param description>
**Note**
The current behaviour has some limitations:
1. The column list should contain all the column names in the `table_identifier` table.
2. ...
+------------+----------------------+----------+
|Kent Yao Jr.|Hangzhou, China       |  11215017|
+------------+----------------------+----------+
|
nit: remove this blank.
Kubernetes integration test status success
Comments addressed, thank you @maropu.
Test build #133218 has finished for PR 30888 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
```sql
INSERT INTO students PARTITION (student_id = 11215017) (address, name) VALUES
    ('Hangzhou, China', 'Kent Yao Jr.');
```
😄 Kent Yao Jr.
For reviews, could you put the screenshot of the updated doc in the PR description? |
We support a column list of INSERT for Spark v3.1.0 (See: SPARK-32976 (#29893)). So, this PR targets at documenting it in the SQL documents. ### What changes were proposed in this pull request? improve doc ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? doc ### How was this patch tested? passing GA doc gen.     Closes #30888 from yaooqinn/SPARK-33877. Authored-by: Kent Yao <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit a3dd8da) Signed-off-by: Dongjoon Hyun <[email protected]>
Test build #133258 has finished for PR 30888 at commit
Kubernetes integration test starting
Kubernetes integration test status success
**Note:** The current behaviour has some limitations:
- All specified columns should exist in the table and not be duplicated from each other. It includes all columns except the static partition columns.
- The size of the column list should be exactly the size of the data from `VALUES` clause or query.
- The order of the column list is alterable and determines how the data from `VALUES` clause or query to be inserted by position.
is it a limitation?
Yeah, it sounds more like a description of the behaviour than a limitation.
How about removing "The current behavior has some limitations:"?
Can we move the last point to the description?
An optional parameter that specifies .... Spark will reorder the columns of the input query to match the table schema according to the specified column list.
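To illustrate the reordering described in this suggestion, here is a minimal Spark SQL sketch. The `CREATE TABLE` definition is an assumption (this thread does not show it); it is chosen so that it matches the `students` example from the PR. The second `student_id` value and name are made up for illustration.

```sql
-- Assumed table definition, not taken from this thread.
CREATE TABLE students (name VARCHAR(64), address VARCHAR(64))
    USING PARQUET PARTITIONED BY (student_id INT);

-- Column list in the same order as the table schema.
INSERT INTO students PARTITION (student_id = 11215017) (name, address) VALUES
    ('Kent Yao Jr.', 'Hangzhou, China');

-- Column list in a different order: each value is matched to the column
-- list by position, and the row is then arranged to fit the table schema.
-- The static partition column student_id is not part of the column list.
INSERT INTO students PARTITION (student_id = 11215018) (address, name) VALUES
    ('Hangzhou, China', 'Kent Yao');

SELECT * FROM students;
```

Both rows should end up with the name in the `name` column and the address in the `address` column, regardless of the order used in the column list.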
I made #30909
…column list ### What changes were proposed in this pull request? followup of a3dd8da via suggestion #30888 (comment) ### Why are the changes needed? doc improvement ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? passing GA doc Closes #30909 from yaooqinn/SPARK-33877-F. Authored-by: Kent Yao <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>
…column list ### What changes were proposed in this pull request? followup of a3dd8da via suggestion #30888 (comment) ### Why are the changes needed? doc improvement ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? passing GA doc Closes #30909 from yaooqinn/SPARK-33877-F. Authored-by: Kent Yao <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]> (cherry picked from commit 368a2c3) Signed-off-by: Dongjoon Hyun <[email protected]>
Spark v3.1.0 supports a column list in INSERT (see SPARK-32976 (#29893)), so this PR documents it in the SQL reference.
### What changes were proposed in this pull request?
improve doc
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
doc
### How was this patch tested?
passing GA doc gen.
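For readers who want to try the documented syntax with a query source rather than a `VALUES` clause, the following is a small sketch. The tables `applicants` and `enrolled` are hypothetical and exist only for this example. It assumes the behaviour discussed in the review: the column list names every column of the target table, and its size equals the number of columns the query produces.

```sql
-- Hypothetical source and target tables, for illustration only.
CREATE TABLE applicants (address STRING, name STRING, student_id INT) USING PARQUET;
CREATE TABLE enrolled (name STRING, address STRING, student_id INT) USING PARQUET;

INSERT INTO applicants VALUES ('Hangzhou, China', 'Kent Yao Jr.', 11215017);

-- The column list has three entries and the query returns three columns;
-- the i-th query column is written to the i-th column named in the list.
INSERT INTO enrolled (address, name, student_id)
    SELECT address, name, student_id FROM applicants;
```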