Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hide the config global-kill #5770

Merged
merged 5 commits into from
Mar 19, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion explain-joins.md
Original file line number Diff line number Diff line change
Expand Up @@ -182,7 +182,7 @@ Query OK, 0 rows affected (3.65 sec)
### Index Join 相关算法

如果使用 Hint [`INL_JOIN`](/optimizer-hints.md#inl_joint1_name--tl_name-) 进行 Index Join 操作,TiDB 会在连接外表之前创建一个中间结果的 Hash Table。TiDB 同样也支持使用 Hint [`INL_HASH_JOIN`](/optimizer-hints.md#inl_hash_join) 在外表上建 Hash Table。而如果内表列集合与外表的相匹配,则可以应用 Hint [`INL_MERGE_JOIN`](/optimizer-hints.md#inl_merge_join)以上所述的 Index Join 相关算法都由 SQL 优化器自动选择。
如果使用 Hint [`INL_JOIN`](/optimizer-hints.md#inl_joint1_name--tl_name-) 进行 Index Join 操作,TiDB 会在连接外表之前创建一个中间结果的 Hash Table。TiDB 同样也支持使用 Hint [`INL_HASH_JOIN`](/optimizer-hints.md#inl_hash_join) 在外表上建 Hash Table。以上所述的 Index Join 相关算法都由 SQL 优化器自动选择。

### 配置

Expand Down
20 changes: 0 additions & 20 deletions explain-overview.md
Original file line number Diff line number Diff line change
Expand Up @@ -240,7 +240,6 @@ TiDB 的 Join 算法包括如下几类:
- Merge Join
- Index Join (Index Nested Loop Join)
- Index Hash Join (Index Nested Loop Hash Join)
- Index Merge Join (Index Nested Loop Merge Join)

下面分别通过一些例子来解释这些 Join 算法的执行过程。

Expand Down Expand Up @@ -322,25 +321,6 @@ mysql> EXPLAIN SELECT /*+ INL_HASH_JOIN(t1, t2) */ * FROM t1, t2 WHERE t1.id = t
6 rows in set (0.00 sec)
```

#### Index Merge Join 示例
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also hide it in explain-subqueries.md

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also in optimizer-hints.md


该算法的使用条件包含 Index Join 的所有使用条件,但还需要添加一条:join keys 中的内表列集合是内表使用的 index 的前缀,或内表使用的 index 是 join keys 中的内表列集合的前缀,该算法相比于 INL_JOIN 会更节省内存。

```
mysql> EXPLAIN SELECT /*+ INL_MERGE_JOIN(t1, t2) */ * FROM t1, t2 WHERE t1.id = t2.id;
+-----------------------------+----------+-----------+------------------------+-------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+-----------------------------+----------+-----------+------------------------+-------------------------------------------------------------------------------+
| IndexMergeJoin_16 | 12487.50 | root | | inner join, inner:IndexReader_14, outer key:test.t1.id, inner key:test.t2.id |
| ├─IndexReader_31(Build) | 9990.00 | root | | index:IndexFullScan_30 |
| │ └─IndexFullScan_30 | 9990.00 | cop[tikv] | table:t1, index:id(id) | keep order:false, stats:pseudo |
| └─IndexReader_14(Probe) | 1.00 | root | | index:Selection_13 |
| └─Selection_13 | 1.00 | cop[tikv] | | not(isnull(test.t2.id)) |
| └─IndexRangeScan_12 | 1.00 | cop[tikv] | table:t2, index:id(id) | range: decided by [eq(test.t2.id, test.t1.id)], keep order:true, stats:pseudo |
+-----------------------------+----------+-----------+------------------------+-------------------------------------------------------------------------------+
6 rows in set (0.00 sec)
```

## 优化实例

使用 [bikeshare example database](https://github.com/pingcap/docs/blob/master/import-example-data.md):
Expand Down
73 changes: 38 additions & 35 deletions explain-subqueries.md
Original file line number Diff line number Diff line change
Expand Up @@ -54,21 +54,22 @@ EXPLAIN SELECT * FROM t1 WHERE id IN (SELECT t1_id FROM t2);
```

```sql
+--------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+--------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------+
| IndexMergeJoin_19 | 45.00 | root | | inner join, inner:TableReader_14, outer key:test.t2.t1_id, inner key:test.t1.id |
| ├─HashAgg_38(Build) | 45.00 | root | | group by:test.t2.t1_id, funcs:firstrow(test.t2.t1_id)->test.t2.t1_id |
| │ └─IndexReader_39 | 45.00 | root | | index:HashAgg_31 |
| │ └─HashAgg_31 | 45.00 | cop[tikv] | | group by:test.t2.t1_id, |
| │ └─IndexFullScan_37 | 90000.00 | cop[tikv] | table:t2, index:t1_id(t1_id) | keep order:false |
| └─TableReader_14(Probe) | 1.00 | root | | data:TableRangeScan_13 |
| └─TableRangeScan_13 | 1.00 | cop[tikv] | table:t1 | range: decided by [test.t2.t1_id], keep order:true |
+--------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------+
7 rows in set (0.00 sec)
+----------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+----------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------------------------------------------------+
| IndexJoin_14 | 5.00 | root | | inner join, inner:IndexLookUp_13, outer key:test.t2.t1_id, inner key:test.t1.id, equal cond:eq(test.t2.t1_id, test.t1.id) |
| ├─StreamAgg_49(Build) | 5.00 | root | | group by:test.t2.t1_id, funcs:firstrow(test.t2.t1_id)->test.t2.t1_id |
| │ └─IndexReader_50 | 5.00 | root | | index:StreamAgg_39 |
| │ └─StreamAgg_39 | 5.00 | cop[tikv] | | group by:test.t2.t1_id, |
| │ └─IndexFullScan_31 | 50000.00 | cop[tikv] | table:t2, index:t1_id(t1_id) | keep order:true |
| └─IndexLookUp_13(Probe) | 1.00 | root | | |
| ├─IndexRangeScan_11(Build) | 1.00 | cop[tikv] | table:t1, index:PRIMARY(id) | range: decided by [eq(test.t1.id, test.t2.t1_id)], keep order:false |
| └─TableRowIDScan_12(Probe) | 1.00 | cop[tikv] | table:t1 | keep order:false |
+----------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.00 sec)
```

由上述查询结果可知,TiDB 首先执行 `Index Join` 索引连接(即 `Merge Join` 合并连接的变体)操作,开始读取 `t2.t1_id` 列的索引。先是 `└─HashAgg_31` 算子的部分任务在 TiKV 中对 `t1_id` 值进行去重,然后`├─HashAgg_38(Build)` 算子的部分任务在 TiDB 中对 `t1_id` 值再次进行去重。去重操作由聚合函数 `firstrow(test.t2.t1_id)` 执行,之后会将操作结果与 `t1` 表的主键相连接。
由上述查询结果可知,TiDB 首先执行 `Index Join` 索引连接操作,开始读取 `t2.t1_id` 列的索引。先是 `└─HashAgg_31` 算子的部分任务在 TiKV 中对 `t1_id` 值进行去重,然后`├─HashAgg_38(Build)` 算子的部分任务在 TiDB 中对 `t1_id` 值再次进行去重。去重操作由聚合函数 `firstrow(test.t2.t1_id)` 执行,之后会将操作结果与 `t1` 表的主键相连接。

## Inner join(有 `UNIQUE` 约束的子查询)

Expand All @@ -79,23 +80,24 @@ EXPLAIN SELECT * FROM t1 WHERE id IN (SELECT t1_id FROM t3);
```

```sql
+-----------------------------+---------+-----------+------------------------------+---------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+-----------------------------+---------+-----------+------------------------------+---------------------------------------------------------------------------------+
| IndexMergeJoin_20 | 999.00 | root | | inner join, inner:TableReader_15, outer key:test.t3.t1_id, inner key:test.t1.id |
| ├─IndexReader_39(Build) | 999.00 | root | | index:IndexFullScan_38 |
| │ └─IndexFullScan_38 | 999.00 | cop[tikv] | table:t3, index:t1_id(t1_id) | keep order:false |
| └─TableReader_15(Probe) | 1.00 | root | | data:TableRangeScan_14 |
| └─TableRangeScan_14 | 1.00 | cop[tikv] | table:t1 | range: decided by [test.t3.t1_id], keep order:true |
+-----------------------------+---------+-----------+------------------------------+---------------------------------------------------------------------------------+
5 rows in set (0.00 sec)
+----------------------------------+---------+-----------+-----------------------------+---------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+----------------------------------+---------+-----------+-----------------------------+---------------------------------------------------------------------------------------------------------------------------+
| IndexJoin_17 | 1978.13 | root | | inner join, inner:IndexLookUp_16, outer key:test.t3.t1_id, inner key:test.t1.id, equal cond:eq(test.t3.t1_id, test.t1.id) |
| ├─TableReader_44(Build) | 1978.00 | root | | data:TableFullScan_43 |
| │ └─TableFullScan_43 | 1978.00 | cop[tikv] | table:t3 | keep order:false |
| └─IndexLookUp_16(Probe) | 1.00 | root | | |
| ├─IndexRangeScan_14(Build) | 1.00 | cop[tikv] | table:t1, index:PRIMARY(id) | range: decided by [eq(test.t1.id, test.t3.t1_id)], keep order:false |
| └─TableRowIDScan_15(Probe) | 1.00 | cop[tikv] | table:t1 | keep order:false |
+----------------------------------+---------+-----------+-----------------------------+---------------------------------------------------------------------------------------------------------------------------+
6 rows in set (0.01 sec)
```

从语义上看,因为约束保证了 `t3.t1_id` 列值的唯一性,TiDB 可以直接执行 `INNER JOIN` 查询。

## Semi Join(关联查询)

在前两个示例中,通过 `HashAgg` 聚合操作或通过 `UNIQUE` 约束保证子查询数据的唯一性之后,TiDB 才能够执行 `Inner Join` 操作。这两种连接均使用了 `Index Join`(`Merge Join` 的变体)
在前两个示例中,通过 `HashAgg` 聚合操作或通过 `UNIQUE` 约束保证子查询数据的唯一性之后,TiDB 才能够执行 `Inner Join` 操作。这两种连接均使用了 `Index Join`。

下面的例子中,TiDB 优化器则选择了一种不同的执行计划:

Expand Down Expand Up @@ -130,17 +132,18 @@ EXPLAIN SELECT * FROM t3 WHERE t1_id NOT IN (SELECT id FROM t1 WHERE int_col < 1
```

```sql
+-----------------------------+---------+-----------+---------------+-------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+-----------------------------+---------+-----------+---------------+-------------------------------------------------------------------------------------+
| IndexMergeJoin_20 | 1598.40 | root | | anti semi join, inner:TableReader_15, outer key:test.t3.t1_id, inner key:test.t1.id |
| ├─TableReader_28(Build) | 1998.00 | root | | data:TableFullScan_27 |
| │ └─TableFullScan_27 | 1998.00 | cop[tikv] | table:t3 | keep order:false |
| └─TableReader_15(Probe) | 1.00 | root | | data:Selection_14 |
| └─Selection_14 | 1.00 | cop[tikv] | | lt(test.t1.int_col, 100) |
| └─TableRangeScan_13 | 1.00 | cop[tikv] | table:t1 | range: decided by [test.t3.t1_id], keep order:true |
+-----------------------------+---------+-----------+---------------+-------------------------------------------------------------------------------------+
6 rows in set (0.00 sec)
+----------------------------------+---------+-----------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+----------------------------------+---------+-----------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+
| IndexJoin_14 | 1582.40 | root | | anti semi join, inner:IndexLookUp_13, outer key:test.t3.t1_id, inner key:test.t1.id, equal cond:eq(test.t3.t1_id, test.t1.id) |
| ├─TableReader_35(Build) | 1978.00 | root | | data:TableFullScan_34 |
| │ └─TableFullScan_34 | 1978.00 | cop[tikv] | table:t3 | keep order:false |
| └─IndexLookUp_13(Probe) | 1.00 | root | | |
| ├─IndexRangeScan_10(Build) | 1.00 | cop[tikv] | table:t1, index:PRIMARY(id) | range: decided by [eq(test.t1.id, test.t3.t1_id)], keep order:false |
| └─Selection_12(Probe) | 1.00 | cop[tikv] | | lt(test.t1.int_col, 100) |
| └─TableRowIDScan_11 | 1.00 | cop[tikv] | table:t1 | keep order:false |
+----------------------------------+---------+-----------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+
7 rows in set (0.00 sec)
```

上述查询首先读取了表 `t3`,然后根据主键开始探测 (probe) 表 `t1`。连接类型是 _anti semi join_,即反半连接:之所以使用 _anti_,是因为上述示例有不存在匹配值(即 `NOT IN`)的情况;使用 `Semi Join` 则是因为仅需要匹配第一行后就可以停止查询。
4 changes: 0 additions & 4 deletions optimizer-hints.md
Original file line number Diff line number Diff line change
Expand Up @@ -111,10 +111,6 @@ SELECT /*+ INL_JOIN(t1, t2) */ * FROM t1,t2 WHERE t1.id = t2.id;

`INL_HASH_JOIN(t1_name [, tl_name])` 提示优化器使用 Index Nested Loop Hash Join 算法。该算法与 Index Nested Loop Join 使用条件完全一样,两者的区别是 `INL_JOIN` 会在连接的内表上建哈希表,而 `INL_HASH_JOIN` 会在连接的外表上建哈希表,后者对于内存的使用是有固定上限的,而前者使用的内存使用取决于内表匹配到的行数。

### INL_MERGE_JOIN

`INL_MERGE_JOIN(t1_name [, tl_name])` 提示优化器使用 Index Nested Loop Merge Join 算法。这个 Hint 的适用场景和 `INL_JOIN` 一致,相比于 `INL_JOIN``INL_HASH_JOIN` 会更节省内存,但使用条件会更苛刻:join keys 中的内表列集合是内表使用的索引的前缀,或内表使用的索引是 join keys 中的内表列集合的前缀。

### HASH_JOIN(t1_name [, tl_name ...])

`HASH_JOIN(t1_name [, tl_name ...])` 提示优化器对指定表使用 Hash Join 算法。这个算法多线程并发执行,执行速度较快,但会消耗较多内存。例如:
Expand Down
Loading