
infoschema: add plan field to the statement summary tables #14182

Merged
merged 12 commits into pingcap:master from djshow832:stmt_summary_add_plan on Dec 26, 2019

Conversation

djshow832
Contributor

@djshow832 djshow832 commented Dec 23, 2019

What problem does this PR solve?

Add the plan and plan digest to the statement summary tables.
The same SQL with different plans is summarized into different records in the tables.
Record the SQL and plan that appear for the first time in each summary record, rather than the last time, to improve performance.

Since the same SQL may be split into several records, max-stmt-count should be increased.

What is changed and how it works?

Add the plan_digest and plan fields.
schema_name + digest + prev_sql_digest + plan_digest are combined as the key of the summary map (see the sketch below).
The default value of max-stmt-count is increased to 200.
Record the SQL and plan that appear for the first time in each summary record, not the last time.
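
A rough sketch of how such a composite key could be assembled (the type and field names are illustrative, following the hash-building snippet quoted later in the review, not necessarily the exact merged code):

package main

import "fmt"

// stmtSummaryKey illustrates the grouping described above: statements are
// keyed by schema name, SQL digest, previous-SQL digest, and plan digest, so
// the same SQL text executed with different plans lands in different records.
type stmtSummaryKey struct {
	schemaName string
	digest     string
	prevDigest string
	planDigest string
	hash       []byte
}

// Hash lazily concatenates all key parts; the summary map can then use
// string(key.Hash()) as its map key.
func (k *stmtSummaryKey) Hash() []byte {
	if len(k.hash) == 0 {
		k.hash = append(k.hash, k.digest...)
		k.hash = append(k.hash, k.schemaName...)
		k.hash = append(k.hash, k.prevDigest...)
		k.hash = append(k.hash, k.planDigest...)
	}
	return k.hash
}

func main() {
	k := &stmtSummaryKey{schemaName: "test", digest: "sql-digest", planDigest: "plan-digest"}
	fmt.Printf("%s\n", k.Hash())
}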

Check List

Tests

  • Unit test
  • Manual test (add detailed scripts or steps below)
mysql> select * from performance_schema.events_statements_summary_by_digest where plan != ''\G
*************************** 1. row ***************************
       SUMMARY_BEGIN_TIME: 2019-12-22 16:00:00
         SUMMARY_END_TIME: 2019-12-22 16:30:00
                STMT_TYPE: select
              SCHEMA_NAME:
                   DIGEST: ff1ba2b3cf4f7452291642f6399447979825a310ef25a78f01288580f398a3c9
              DIGEST_TEXT: select table_id , is_index , hist_id , distinct_count , version , null_count , tot_col_size , stats_ver , flag , correlation , last_analyze_pos from mysql . stats_histograms where table_id = ?
              TABLE_NAMES: mysql.stats_histograms
              INDEX_NAMES: stats_histograms:tbl
              SAMPLE_USER: NULL
               EXEC_COUNT: 17
              SUM_LATENCY: 17842186
              MAX_LATENCY: 1417064
              MIN_LATENCY: 697887
              AVG_LATENCY: 1049540
        AVG_PARSE_LATENCY: 91016
        MAX_PARSE_LATENCY: 150406
      AVG_COMPILE_LATENCY: 309173
      MAX_COMPILE_LATENCY: 389786
             COP_TASK_NUM: 34
     AVG_COP_PROCESS_TIME: 0
     MAX_COP_PROCESS_TIME: 0
  MAX_COP_PROCESS_ADDRESS: NULL
        AVG_COP_WAIT_TIME: 0
        MAX_COP_WAIT_TIME: 0
     MAX_COP_WAIT_ADDRESS: NULL
         AVG_PROCESS_TIME: 0
         MAX_PROCESS_TIME: 0
            AVG_WAIT_TIME: 0
            MAX_WAIT_TIME: 0
         AVG_BACKOFF_TIME: 0
         MAX_BACKOFF_TIME: 0
           AVG_TOTAL_KEYS: 0
           MAX_TOTAL_KEYS: 0
       AVG_PROCESSED_KEYS: 0
       MAX_PROCESSED_KEYS: 0
        AVG_PREWRITE_TIME: 0
        MAX_PREWRITE_TIME: 0
          AVG_COMMIT_TIME: 0
          MAX_COMMIT_TIME: 0
   AVG_GET_COMMIT_TS_TIME: 0
   MAX_GET_COMMIT_TS_TIME: 0
  AVG_COMMIT_BACKOFF_TIME: 0
  MAX_COMMIT_BACKOFF_TIME: 0
    AVG_RESOLVE_LOCK_TIME: 0
    MAX_RESOLVE_LOCK_TIME: 0
AVG_LOCAL_LATCH_WAIT_TIME: 0
MAX_LOCAL_LATCH_WAIT_TIME: 0
           AVG_WRITE_KEYS: 0
           MAX_WRITE_KEYS: 0
           AVG_WRITE_SIZE: 0
           MAX_WRITE_SIZE: 0
     AVG_PREWRITE_REGIONS: 0
     MAX_PREWRITE_REGIONS: 0
            AVG_TXN_RETRY: 0
            MAX_TXN_RETRY: 0
        SUM_BACKOFF_TIMES: 0
            BACKOFF_TYPES: NULL
                  AVG_MEM: 16672
                  MAX_MEM: 16672
        AVG_AFFECTED_ROWS: 0
               FIRST_SEEN: 2019-12-22 16:18:25
                LAST_SEEN: 2019-12-22 16:19:13
        QUERY_SAMPLE_TEXT: select table_id, is_index, hist_id, distinct_count, version, null_count, tot_col_size, stats_ver, flag, correlation, last_analyze_pos from mysql.stats_histograms where table_id = 43
         PREV_SAMPLE_TEXT:
              PLAN_DIGEST: db05f9e134185031e52539c303d1930de3cf6e5d4881aee7ef97ba87a9a0a6d9
                     PLAN: 	Projection_4    	root	10	mysql.stats_histograms.table_id, mysql.stats_histograms.is_index, mysql.stats_histograms.hist_id, mysql.stats_histograms.distinct_count, mysql.stats_histograms.version, mysql.stats_histograms.null_count, mysql.stats_histograms.tot_col_size, mysql.stats_histograms.stats_ver, mysql.stats_histograms.flag, mysql.stats_histograms.correlation, mysql.stats_histograms.last_analyze_pos
	└─IndexLookUp_10	root	10
	  ├─IndexScan_8 	cop 	10	table:stats_histograms, index:table_id, is_index, hist_id, range:[43,43], keep order:false, stats:pseudo
	  └─TableScan_9 	cop 	10	table:stats_histograms, keep order:false, stats:pseudo

Code changes

  • N/A

Side effects

  • Possible performance regression
  • Increased code complexity
  • Breaking backward compatibility

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

Release note

  • Add plan and plan_digest to the statement summary tables.
  • Change the default value of the configuration max-stmt-count in [stmt-summary] to 200.

@djshow832
Contributor Author

/bench

@sre-bot
Contributor

sre-bot commented Dec 23, 2019

Benchmark Report

Run Sysbench Performance Test on VMs

@@                               Benchmark Diff                               @@
================================================================================
--- tidb: 8cbacf0d7c4612e6b5fa89836ca170af11eb81b2
+++ tidb: 012a939e38062d51f79191ce401fee29e8c5e5f9
tikv: 6ccfe673d60a19626ba19331500d661390ea305c
pd: 4d65bbefbc6db6e92fee33caa97415274512757a
================================================================================
oltp_update_non_index:
    * QPS: 4720.55 ± 0.15% (std=4.92) delta: 0.16% (p=0.272)
    * Latency p50: 27.11 ± 0.00% (std=0.00) delta: -0.17%
    * Latency p99: 41.10 ± 0.00% (std=0.00) delta: -2.41%
            
oltp_insert:
    * QPS: 4773.11 ± 0.29% (std=10.25) delta: 0.11% (p=0.503)
    * Latency p50: 26.81 ± 0.27% (std=0.05) delta: -0.10%
    * Latency p99: 45.85 ± 7.07% (std=2.24) delta: -0.81%
            
oltp_read_write:
    * QPS: 15155.76 ± 0.07% (std=7.68) delta: 0.13% (p=0.755)
    * Latency p50: 169.23 ± 0.08% (std=0.10) delta: -0.13%
    * Latency p99: 307.67 ± 5.95% (std=11.48) delta: -4.42%
            
oltp_update_index:
    * QPS: 4250.36 ± 0.10% (std=2.85) delta: 0.46% (p=0.275)
    * Latency p50: 30.11 ± 0.09% (std=0.02) delta: -0.47%
    * Latency p99: 53.87 ± 3.62% (std=1.53) delta: 0.04%
            
oltp_point_select:
    * QPS: 38721.92 ± 0.32% (std=85.51) delta: 0.65% (p=0.035)
    * Latency p50: 3.30 ± 0.30% (std=0.01) delta: -0.68%
    * Latency p99: 10.00 ± 0.90% (std=0.09) delta: -0.45%
            

@djshow832
Contributor Author

/run-all-tests

@djshow832 djshow832 force-pushed the stmt_summary_add_plan branch from 8a0324b to 3e342f8 Compare December 24, 2019 08:09
@djshow832
Contributor Author

/run-all-tests

key.hash = append(key.hash, hack.Slice(key.digest)...)
key.hash = append(key.hash, hack.Slice(key.schemaName)...)
key.hash = append(key.hash, hack.Slice(key.prevDigest)...)
key.hash = append(key.hash, hack.Slice(key.planDigest)...)
Contributor


The same plan but a different row count may cause the hash to differ; is that expected?

Contributor Author


No, the row count is not included in the normalized plan, so the plan digests are the same.
Refer to https://github.com/pingcap/tidb/blob/master/util/plancodec/codec.go#L274
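
Roughly speaking, normalization strips volatile columns such as the estimated row count before hashing, so two plans that differ only in row counts map to the same digest. A toy illustration of that idea (this is not TiDB's actual plancodec implementation):

package main

import (
	"crypto/sha256"
	"fmt"
)

// planRow is a simplified plan-tree row: operator id, estimated row count,
// and access information.
type planRow struct {
	id       string
	estRows  string
	accesses string
}

// toyPlanDigest hashes only the operator id and access info of each row and
// ignores estRows, mimicking the idea that row counts do not take part in
// the plan digest.
func toyPlanDigest(rows []planRow) string {
	h := sha256.New()
	for _, r := range rows {
		h.Write([]byte(r.id + "\t" + r.accesses + "\n"))
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	planA := []planRow{{"IndexScan_8", "10", "table:stats_histograms, index:table_id"}}
	planB := []planRow{{"IndexScan_8", "9999", "table:stats_histograms, index:table_id"}}
	fmt.Println(toyPlanDigest(planA) == toyPlanDigest(planB)) // prints true
}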

Contributor


Add some tests to cover this case (different plan but the same plan digest).

Contributor Author


Addressed. Refer to 00cbf9e

Contributor

@lonng lonng left a comment


Rest LGTM

@djshow832 djshow832 force-pushed the stmt_summary_add_plan branch from 3e342f8 to 2e0836c Compare December 25, 2019 12:26
@djshow832 djshow832 requested a review from a team as a code owner December 25, 2019 12:26
@ghost ghost requested review from eurekaka and francis0407 and removed request for a team December 25, 2019 12:26
Contributor

@lonng lonng left a comment


LGTM
(Prefer to use a variable to store the failpoint name)

@djshow832
Contributor Author

/run-unit-test

Contributor

@crazycs520 crazycs520 left a comment


LGTM.
BTW, the plan field doesn't need to store the real plan; storing the normalized plan is enough. Storing the real plan requires encoding it and compressing it with snappy, so won't it have some performance impact?

crazycs520
crazycs520 previously approved these changes Dec 25, 2019
@@ -876,6 +876,9 @@ func (a *ExecStmt) SummaryStmt() {
}
sessVars.SetPrevStmtDigest(digest)

plan := plannercore.EncodePlan(a.Plan)
Contributor

@crazycs520 crazycs520 Dec 25, 2019


No performance impact? Can we use the normalized plan directly?

Contributor


But it looks OK from the results of the benchmark report...

Contributor Author


I changed it to record the plan only the first time it is seen.
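
A minimal sketch of that record-only-once idea, assuming a summary element that caches its first observed sample plan (the type, fields, and encodePlan callback are hypothetical, not the exact merged code):

package main

import "fmt"

// stmtSummaryRecord caches the sample SQL and plan only when the record is
// first populated; later executions of the same (digest, plan digest) pair
// just update the counters and skip the relatively expensive plan encoding.
type stmtSummaryRecord struct {
	execCount  int64
	sampleSQL  string
	samplePlan string
}

func (r *stmtSummaryRecord) add(sql string, encodePlan func() string) {
	r.execCount++
	if r.samplePlan == "" {
		// First execution recorded for this key: remember the SQL and the plan.
		r.sampleSQL = sql
		r.samplePlan = encodePlan()
	}
}

func main() {
	r := &stmtSummaryRecord{}
	encode := func() string { return "IndexLookUp_10 -> IndexScan_8 + TableScan_9" }
	r.add("select * from t where a = 1", encode)
	r.add("select * from t where a = 2", encode) // the plan is not re-encoded here
	fmt.Println(r.execCount, r.samplePlan)
}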

@djshow832
Contributor Author

djshow832 commented Dec 26, 2019

LGTM.
BTW, the plan field doesn't need to store the real plan; storing the normalized plan is enough. Storing the real plan requires encoding it and compressing it with snappy, so won't it have some performance impact?

I thought about it, but the normalized plan is hard to read. For example:

0	37_1	0	table:tidb, index:VARIABLE_NAME

I don't know what 0 and 37_1 mean.

I considered recording the plan only the first time, but I also want the query_sample_text and plan fields to stay consistent.
So I ran a benchmark, which shows it brings only a little performance degradation.

sampleSQL := sei.OriginalSQL
if len(sampleSQL) > int(maxSQLLength) {
	// Make sure the memory of the original `sampleSQL` will be released.
	sampleSQL = string([]byte(sampleSQL[:maxSQLLength]))
}
Contributor


Do we need to attach the total length of the sample query, e.g. select * from tt (total: 2304 bytes)?

Contributor Author


Addressed.
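
A minimal sketch of how a truncation-with-total-length suffix could look (the helper name and suffix format are illustrative; the exact format used in the PR may differ):

package main

import "fmt"

// truncateSampleSQL keeps only the first maxSQLLength bytes of the sample SQL
// and appends the original total length, so readers of the summary table can
// see how much of the statement was cut off.
func truncateSampleSQL(sql string, maxSQLLength int) string {
	if len(sql) <= maxSQLLength {
		return sql
	}
	// Copy the prefix so the memory backing the original string can be released.
	prefix := string([]byte(sql[:maxSQLLength]))
	return fmt.Sprintf("%s(len:%d)", prefix, len(sql))
}

func main() {
	fmt.Println(truncateSampleSQL("select * from tt where id in (1,2,3,4,5)", 16))
	// prints: select * from tt(len:40)
}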

Contributor

@crazycs520 crazycs520 left a comment


LGTM

Contributor

@lonng lonng left a comment


LGTM

@lonng
Contributor

lonng commented Dec 26, 2019

/merge

@sre-bot sre-bot added the status/can-merge Indicates a PR has been approved by a committer. label Dec 26, 2019
@sre-bot
Contributor

sre-bot commented Dec 26, 2019

/run-all-tests

@sre-bot sre-bot merged commit fd3ada6 into pingcap:master Dec 26, 2019
@djshow832
Contributor Author

/run-cherry-picker

@sre-bot
Contributor

sre-bot commented Dec 30, 2019

cherry pick to release-3.0 in PR #14285

Labels
status/can-merge (Indicates a PR has been approved by a committer.), type/usability