Commit

hide the config global-kill (#5075) (#5126)
ti-srebot authored Mar 26, 2021
1 parent 2b414a8 commit b0ae70d
Showing 5 changed files with 40 additions and 76 deletions.
2 changes: 1 addition & 1 deletion explain-joins.md
@@ -180,7 +180,7 @@ Query OK, 0 rows affected (3.65 sec)
### Variations of Index Join

-An index join operation using the hint [`INL_JOIN`](/optimizer-hints.md#inl_joint1_name--tl_name-) creates a hash table of the intermediate results before joining on the outer table. TiDB also supports creating a hash table on the outer table using the hint [`INL_HASH_JOIN`](/optimizer-hints.md#inl_hash_join). If the column sets on the inner table match the columns of the outer table, the [`INL_MERGE_JOIN`](/optimizer-hints.md#inl_merge_join) index join can apply. Each of these variations of index join is automatically selected by the SQL Optimizer.
+An index join operation using the hint [`INL_JOIN`](/optimizer-hints.md#inl_joint1_name--tl_name-) creates a hash table of the intermediate results before joining on the outer table. TiDB also supports creating a hash table on the outer table using the hint [`INL_HASH_JOIN`](/optimizer-hints.md#inl_hash_join). Each of these variations of index join is automatically selected by the SQL Optimizer.

### Configuration

73 changes: 38 additions & 35 deletions explain-subqueries.md
@@ -54,21 +54,22 @@ EXPLAIN SELECT * FROM t1 WHERE id IN (SELECT t1_id FROM t2);
```

```sql
-+--------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------+
-| id | estRows | task | access object | operator info |
-+--------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------+
-| IndexMergeJoin_19 | 45.00 | root | | inner join, inner:TableReader_14, outer key:test.t2.t1_id, inner key:test.t1.id |
-| ├─HashAgg_38(Build) | 45.00 | root | | group by:test.t2.t1_id, funcs:firstrow(test.t2.t1_id)->test.t2.t1_id |
-| │ └─IndexReader_39 | 45.00 | root | | index:HashAgg_31 |
-| │ └─HashAgg_31 | 45.00 | cop[tikv] | | group by:test.t2.t1_id, |
-| │ └─IndexFullScan_37 | 90000.00 | cop[tikv] | table:t2, index:t1_id(t1_id) | keep order:false |
-| └─TableReader_14(Probe) | 1.00 | root | | data:TableRangeScan_13 |
-| └─TableRangeScan_13 | 1.00 | cop[tikv] | table:t1 | range: decided by [test.t2.t1_id], keep order:true |
-+--------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------+
-7 rows in set (0.00 sec)
++----------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+| id | estRows | task | access object | operator info |
++----------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+| IndexJoin_14 | 5.00 | root | | inner join, inner:IndexLookUp_13, outer key:test.t2.t1_id, inner key:test.t1.id, equal cond:eq(test.t2.t1_id, test.t1.id) |
+| ├─StreamAgg_49(Build) | 5.00 | root | | group by:test.t2.t1_id, funcs:firstrow(test.t2.t1_id)->test.t2.t1_id |
+| │ └─IndexReader_50 | 5.00 | root | | index:StreamAgg_39 |
+| │ └─StreamAgg_39 | 5.00 | cop[tikv] | | group by:test.t2.t1_id, |
+| │ └─IndexFullScan_31 | 50000.00 | cop[tikv] | table:t2, index:t1_id(t1_id) | keep order:true |
+| └─IndexLookUp_13(Probe) | 1.00 | root | | |
+| ├─IndexRangeScan_11(Build) | 1.00 | cop[tikv] | table:t1, index:PRIMARY(id) | range: decided by [eq(test.t1.id, test.t2.t1_id)], keep order:false |
+| └─TableRowIDScan_12(Probe) | 1.00 | cop[tikv] | table:t1 | keep order:false |
++----------------------------------+----------+-----------+------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+8 rows in set (0.00 sec)
```

-The result above shows that TiDB performs an index join operation (merge variant) that starts by reading the index on `t2.t1_id`. The values of `t1_id` are deduplicated inside TiKV first as a part of the `└─HashAgg_31` operator task, and then deduplicated again in TiDB as a part of the `├─HashAgg_38(Build)` operator task. The deduplication is performed by the aggregation function `firstrow(test.t2.t1_id)`. The result is then joined against the `t1` table's `PRIMARY KEY`.
+The result above shows that TiDB performs an index join operation that starts by reading the index on `t2.t1_id`. The values of `t1_id` are deduplicated inside TiKV first as a part of the `└─HashAgg_31` operator task, and then deduplicated again in TiDB as a part of the `├─HashAgg_38(Build)` operator task. The deduplication is performed by the aggregation function `firstrow(test.t2.t1_id)`. The result is then joined against the `t1` table's `PRIMARY KEY`.

## Inner join (unique subquery)

@@ -79,23 +80,24 @@ EXPLAIN SELECT * FROM t1 WHERE id IN (SELECT t1_id FROM t3);
```

```sql
-+-----------------------------+---------+-----------+------------------------------+---------------------------------------------------------------------------------+
-| id | estRows | task | access object | operator info |
-+-----------------------------+---------+-----------+------------------------------+---------------------------------------------------------------------------------+
-| IndexMergeJoin_20 | 999.00 | root | | inner join, inner:TableReader_15, outer key:test.t3.t1_id, inner key:test.t1.id |
-| ├─IndexReader_39(Build) | 999.00 | root | | index:IndexFullScan_38 |
-| │ └─IndexFullScan_38 | 999.00 | cop[tikv] | table:t3, index:t1_id(t1_id) | keep order:false |
-| └─TableReader_15(Probe) | 1.00 | root | | data:TableRangeScan_14 |
-| └─TableRangeScan_14 | 1.00 | cop[tikv] | table:t1 | range: decided by [test.t3.t1_id], keep order:true |
-+-----------------------------+---------+-----------+------------------------------+---------------------------------------------------------------------------------+
-5 rows in set (0.00 sec)
++----------------------------------+---------+-----------+-----------------------------+---------------------------------------------------------------------------------------------------------------------------+
+| id | estRows | task | access object | operator info |
++----------------------------------+---------+-----------+-----------------------------+---------------------------------------------------------------------------------------------------------------------------+
+| IndexJoin_17 | 1978.13 | root | | inner join, inner:IndexLookUp_16, outer key:test.t3.t1_id, inner key:test.t1.id, equal cond:eq(test.t3.t1_id, test.t1.id) |
+| ├─TableReader_44(Build) | 1978.00 | root | | data:TableFullScan_43 |
+| │ └─TableFullScan_43 | 1978.00 | cop[tikv] | table:t3 | keep order:false |
+| └─IndexLookUp_16(Probe) | 1.00 | root | | |
+| ├─IndexRangeScan_14(Build) | 1.00 | cop[tikv] | table:t1, index:PRIMARY(id) | range: decided by [eq(test.t1.id, test.t3.t1_id)], keep order:false |
+| └─TableRowIDScan_15(Probe) | 1.00 | cop[tikv] | table:t1 | keep order:false |
++----------------------------------+---------+-----------+-----------------------------+---------------------------------------------------------------------------------------------------------------------------+
+6 rows in set (0.01 sec)
```

Semantically because `t3.t1_id` is guaranteed unique, it can be executed directly as an `INNER JOIN`.

## Semi join (correlated subquery)

-In the previous two examples, TiDB is able to perform an `INNER JOIN` operation after the data inside the subquery is made unique (via `HashAgg`) or guaranteed unique. Both joins are performed using an Index Join (merge variant).
+In the previous two examples, TiDB is able to perform an `INNER JOIN` operation after the data inside the subquery is made unique (via `HashAgg`) or guaranteed unique. Both joins are performed using an Index Join.

In this example, TiDB chooses a different execution plan:

@@ -104,17 +106,18 @@ EXPLAIN SELECT * FROM t1 WHERE id IN (SELECT t1_id FROM t2 WHERE t1_id != t1.int
```

```sql
-+-----------------------------+-----------+-----------+------------------------------+--------------------------------------------------------------------------------------------------------+
-| id | estRows | task | access object | operator info |
-+-----------------------------+-----------+-----------+------------------------------+--------------------------------------------------------------------------------------------------------+
-| MergeJoin_9 | 45446.40 | root | | semi join, left key:test.t1.id, right key:test.t2.t1_id, other cond:ne(test.t2.t1_id, test.t1.int_col) |
-| ├─IndexReader_24(Build) | 180000.00 | root | | index:IndexFullScan_23 |
-| │ └─IndexFullScan_23 | 180000.00 | cop[tikv] | table:t2, index:t1_id(t1_id) | keep order:true |
-| └─TableReader_22(Probe) | 56808.00 | root | | data:Selection_21 |
-| └─Selection_21 | 56808.00 | cop[tikv] | | ne(test.t1.id, test.t1.int_col) |
-| └─TableFullScan_20 | 71010.00 | cop[tikv] | table:t1 | keep order:true |
-+-----------------------------+-----------+-----------+------------------------------+--------------------------------------------------------------------------------------------------------+
-6 rows in set (0.00 sec)
++----------------------------------+---------+-----------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+
+| id | estRows | task | access object | operator info |
++----------------------------------+---------+-----------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+
+| IndexJoin_14 | 1582.40 | root | | anti semi join, inner:IndexLookUp_13, outer key:test.t3.t1_id, inner key:test.t1.id, equal cond:eq(test.t3.t1_id, test.t1.id) |
+| ├─TableReader_35(Build) | 1978.00 | root | | data:TableFullScan_34 |
+| │ └─TableFullScan_34 | 1978.00 | cop[tikv] | table:t3 | keep order:false |
+| └─IndexLookUp_13(Probe) | 1.00 | root | | |
+| ├─IndexRangeScan_10(Build) | 1.00 | cop[tikv] | table:t1, index:PRIMARY(id) | range: decided by [eq(test.t1.id, test.t3.t1_id)], keep order:false |
+| └─Selection_12(Probe) | 1.00 | cop[tikv] | | lt(test.t1.int_col, 100) |
+| └─TableRowIDScan_11 | 1.00 | cop[tikv] | table:t1 | keep order:false |
++----------------------------------+---------+-----------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+
+7 rows in set (0.00 sec)
```

From the result above, you can see that TiDB uses a `Semi Join` algorithm. Semi-join differs from inner join: semi-join only permits the first value on the right key (`t2.t1_id`), which means that the duplicates are eliminated as a part of the join operator task. The join algorithm is also Merge Join, which is like an efficient zipper-merge as the operator reads data from both the left and the right side in sorted order.
4 changes: 0 additions & 4 deletions optimizer-hints.md
@@ -118,10 +118,6 @@ The parameter(s) given in `INL_JOIN()` is the candidate table for the inner tabl

The `INL_HASH_JOIN(t1_name [, tl_name])` hint tells the optimizer to use the index nested loop hash join algorithm. The conditions for using this algorithm are the same with the conditions for using the index nested loop join algorithm. The difference between the two algorithms is that `INL_JOIN` creates a hash table on the joined inner table, but `INL_HASH_JOIN` creates a hash table on the joined outer table. `INL_HASH_JOIN` has a fixed limit on memory usage, while the memory used by `INL_JOIN` depends on the number of rows matched in the inner table.

-### INL_MERGE_JOIN
-
-The `INL_MERGE_JOIN(t1_name [, tl_name])` hint tells the optimizer to use the index nested loop merge join algorithm. This hint is used in the same scenario as in that of `INL_JOIN`. Compared with `INL_JOIN` and `INL_HASH_JOIN`, it saves more memory but requires more strict usage conditions: the column sets of the inner table in join keys is the prefix of the inner table index, or the index of the inner table is the prefix of the column sets of the inner table in join keys.
-
### HASH_JOIN(t1_name [, tl_name ...])

The `HASH_JOIN(t1_name [, tl_name ...])` hint tells the optimizer to use the hash join algorithm for the given table(s). This algorithm allows the query to be executed concurrently with multiple threads, which achieves a higher processing speed but consumes more memory. For example:
