
infoschema: add plan field to the statement summary tables #14182

Merged
merged 12 commits into pingcap:master from djshow832:stmt_summary_add_plan on Dec 26, 2019

Conversation

djshow832
Contributor

@djshow832 djshow832 commented Dec 23, 2019

What problem does this PR solve?

Add the plan and plan digest to the statement summary tables.
The same SQL with different plans is summarized into different records in the tables.
Record the SQL and plan that appear for the first time in each summary record, rather than the last time, to improve performance.

Since the same SQL may be split into several records, max-stmt-count should be increased.

What is changed and how it works?

Add the plan_digest and plan fields.
schema_name + digest + prev_sql_digest + plan_digest are combined as the key of the summary map (see the sketch below).
The default value of max-stmt-count is increased to 200.
Record the SQL and plan that appear for the first time in each summary record, not the last time.
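
A rough sketch of how such a composite key could be assembled (the type and field names are illustrative, following the hash-building snippet quoted later in the review, not necessarily the exact merged code):

package main

import "fmt"

// stmtSummaryKey illustrates the grouping described above: statements are
// keyed by schema name, SQL digest, previous-SQL digest, and plan digest, so
// the same SQL text executed with different plans lands in different records.
type stmtSummaryKey struct {
	schemaName string
	digest     string
	prevDigest string
	planDigest string
	hash       []byte
}

// Hash lazily concatenates all key parts; the summary map can then use
// string(key.Hash()) as its map key.
func (k *stmtSummaryKey) Hash() []byte {
	if len(k.hash) == 0 {
		k.hash = append(k.hash, k.digest...)
		k.hash = append(k.hash, k.schemaName...)
		k.hash = append(k.hash, k.prevDigest...)
		k.hash = append(k.hash, k.planDigest...)
	}
	return k.hash
}

func main() {
	k := &stmtSummaryKey{schemaName: "test", digest: "sql-digest", planDigest: "plan-digest"}
	fmt.Printf("%s\n", k.Hash())
}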

Check List

Tests

  • Unit test
  • Manual test (add detailed scripts or steps below)
mysql> select * from performance_schema.events_statements_summary_by_digest where plan != ''\G
*************************** 1. row ***************************
       SUMMARY_BEGIN_TIME: 2019-12-22 16:00:00
         SUMMARY_END_TIME: 2019-12-22 16:30:00
                STMT_TYPE: select
              SCHEMA_NAME:
                   DIGEST: ff1ba2b3cf4f7452291642f6399447979825a310ef25a78f01288580f398a3c9
              DIGEST_TEXT: select table_id , is_index , hist_id , distinct_count , version , null_count , tot_col_size , stats_ver , flag , correlation , last_analyze_pos from mysql . stats_histograms where table_id = ?
              TABLE_NAMES: mysql.stats_histograms
              INDEX_NAMES: stats_histograms:tbl
              SAMPLE_USER: NULL
               EXEC_COUNT: 17
              SUM_LATENCY: 17842186
              MAX_LATENCY: 1417064
              MIN_LATENCY: 697887
              AVG_LATENCY: 1049540
        AVG_PARSE_LATENCY: 91016
        MAX_PARSE_LATENCY: 150406
      AVG_COMPILE_LATENCY: 309173
      MAX_COMPILE_LATENCY: 389786
             COP_TASK_NUM: 34
     AVG_COP_PROCESS_TIME: 0
     MAX_COP_PROCESS_TIME: 0
  MAX_COP_PROCESS_ADDRESS: NULL
        AVG_COP_WAIT_TIME: 0
        MAX_COP_WAIT_TIME: 0
     MAX_COP_WAIT_ADDRESS: NULL
         AVG_PROCESS_TIME: 0
         MAX_PROCESS_TIME: 0
            AVG_WAIT_TIME: 0
            MAX_WAIT_TIME: 0
         AVG_BACKOFF_TIME: 0
         MAX_BACKOFF_TIME: 0
           AVG_TOTAL_KEYS: 0
           MAX_TOTAL_KEYS: 0
       AVG_PROCESSED_KEYS: 0
       MAX_PROCESSED_KEYS: 0
        AVG_PREWRITE_TIME: 0
        MAX_PREWRITE_TIME: 0
          AVG_COMMIT_TIME: 0
          MAX_COMMIT_TIME: 0
   AVG_GET_COMMIT_TS_TIME: 0
   MAX_GET_COMMIT_TS_TIME: 0
  AVG_COMMIT_BACKOFF_TIME: 0
  MAX_COMMIT_BACKOFF_TIME: 0
    AVG_RESOLVE_LOCK_TIME: 0
    MAX_RESOLVE_LOCK_TIME: 0
AVG_LOCAL_LATCH_WAIT_TIME: 0
MAX_LOCAL_LATCH_WAIT_TIME: 0
           AVG_WRITE_KEYS: 0
           MAX_WRITE_KEYS: 0
           AVG_WRITE_SIZE: 0
           MAX_WRITE_SIZE: 0
     AVG_PREWRITE_REGIONS: 0
     MAX_PREWRITE_REGIONS: 0
            AVG_TXN_RETRY: 0
            MAX_TXN_RETRY: 0
        SUM_BACKOFF_TIMES: 0
            BACKOFF_TYPES: NULL
                  AVG_MEM: 16672
                  MAX_MEM: 16672
        AVG_AFFECTED_ROWS: 0
               FIRST_SEEN: 2019-12-22 16:18:25
                LAST_SEEN: 2019-12-22 16:19:13
        QUERY_SAMPLE_TEXT: select table_id, is_index, hist_id, distinct_count, version, null_count, tot_col_size, stats_ver, flag, correlation, last_analyze_pos from mysql.stats_histograms where table_id = 43
         PREV_SAMPLE_TEXT:
              PLAN_DIGEST: db05f9e134185031e52539c303d1930de3cf6e5d4881aee7ef97ba87a9a0a6d9
                     PLAN: 	Projection_4    	root	10	mysql.stats_histograms.table_id, mysql.stats_histograms.is_index, mysql.stats_histograms.hist_id, mysql.stats_histograms.distinct_count, mysql.stats_histograms.version, mysql.stats_histograms.null_count, mysql.stats_histograms.tot_col_size, mysql.stats_histograms.stats_ver, mysql.stats_histograms.flag, mysql.stats_histograms.correlation, mysql.stats_histograms.last_analyze_pos
	└─IndexLookUp_10	root	10
	  ├─IndexScan_8 	cop 	10	table:stats_histograms, index:table_id, is_index, hist_id, range:[43,43], keep order:false, stats:pseudo
	  └─TableScan_9 	cop 	10	table:stats_histograms, keep order:false, stats:pseudo

Code changes

  • N/A

Side effects

  • Possible performance regression
  • Increased code complexity
  • Breaking backward compatibility

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

Release note

  • Add plan and plan_digest to the statement summary tables.
  • Change the default value of the configuration max-stmt-count in [stmt-summary] to 200.

@djshow832
Contributor Author

/bench

@sre-bot
Contributor

sre-bot commented Dec 23, 2019

Benchmark Report

Run Sysbench Performance Test on VMs

@@                               Benchmark Diff                               @@
================================================================================
--- tidb: 8cbacf0d7c4612e6b5fa89836ca170af11eb81b2
+++ tidb: 012a939e38062d51f79191ce401fee29e8c5e5f9
tikv: 6ccfe673d60a19626ba19331500d661390ea305c
pd: 4d65bbefbc6db6e92fee33caa97415274512757a
================================================================================
oltp_update_non_index:
    * QPS: 4720.55 ± 0.15% (std=4.92) delta: 0.16% (p=0.272)
    * Latency p50: 27.11 ± 0.00% (std=0.00) delta: -0.17%
    * Latency p99: 41.10 ± 0.00% (std=0.00) delta: -2.41%
            
oltp_insert:
    * QPS: 4773.11 ± 0.29% (std=10.25) delta: 0.11% (p=0.503)
    * Latency p50: 26.81 ± 0.27% (std=0.05) delta: -0.10%
    * Latency p99: 45.85 ± 7.07% (std=2.24) delta: -0.81%
            
oltp_read_write:
    * QPS: 15155.76 ± 0.07% (std=7.68) delta: 0.13% (p=0.755)
    * Latency p50: 169.23 ± 0.08% (std=0.10) delta: -0.13%
    * Latency p99: 307.67 ± 5.95% (std=11.48) delta: -4.42%
            
oltp_update_index:
    * QPS: 4250.36 ± 0.10% (std=2.85) delta: 0.46% (p=0.275)
    * Latency p50: 30.11 ± 0.09% (std=0.02) delta: -0.47%
    * Latency p99: 53.87 ± 3.62% (std=1.53) delta: 0.04%
            
oltp_point_select:
    * QPS: 38721.92 ± 0.32% (std=85.51) delta: 0.65% (p=0.035)
    * Latency p50: 3.30 ± 0.30% (std=0.01) delta: -0.68%
    * Latency p99: 10.00 ± 0.90% (std=0.09) delta: -0.45%
            

@djshow832
Contributor Author

/run-all-tests

@djshow832 djshow832 force-pushed the stmt_summary_add_plan branch from 8a0324b to 3e342f8 Compare December 24, 2019 08:09
@djshow832
Contributor Author

/run-all-tests

key.hash = append(key.hash, hack.Slice(key.digest)...)
key.hash = append(key.hash, hack.Slice(key.schemaName)...)
key.hash = append(key.hash, hack.Slice(key.prevDigest)...)
key.hash = append(key.hash, hack.Slice(key.planDigest)...)
Contributor


The same plan but a different row count may cause the hash to differ; is that expected?

Contributor Author


No, the row count is not included in the normalized plan, so the plan digests are the same.
Refer to https://github.com/pingcap/tidb/blob/master/util/plancodec/codec.go#L274
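
Roughly speaking, normalization strips volatile columns such as the estimated row count before hashing, so two plans that differ only in row counts map to the same digest. A toy illustration of that idea (this is not TiDB's actual plancodec implementation):

package main

import (
	"crypto/sha256"
	"fmt"
)

// planRow is a simplified plan-tree row: operator id, estimated row count,
// and access information.
type planRow struct {
	id       string
	estRows  string
	accesses string
}

// toyPlanDigest hashes only the operator id and access info of each row and
// ignores estRows, mimicking the idea that row counts do not take part in
// the plan digest.
func toyPlanDigest(rows []planRow) string {
	h := sha256.New()
	for _, r := range rows {
		h.Write([]byte(r.id + "\t" + r.accesses + "\n"))
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	planA := []planRow{{"IndexScan_8", "10", "table:stats_histograms, index:table_id"}}
	planB := []planRow{{"IndexScan_8", "9999", "table:stats_histograms, index:table_id"}}
	fmt.Println(toyPlanDigest(planA) == toyPlanDigest(planB)) // prints true
}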

Contributor


Add some tests to cover this case (different plan but the same plan digest).

Contributor Author


Addressed. Refer to 00cbf9e

Contributor

@lonng lonng left a comment


Rest LGTM

@djshow832 djshow832 force-pushed the stmt_summary_add_plan branch from 3e342f8 to 2e0836c Compare December 25, 2019 12:26
@djshow832 djshow832 requested a review from a team as a code owner December 25, 2019 12:26
@ghost ghost requested review from eurekaka and francis0407 and removed request for a team December 25, 2019 12:26
Contributor

@lonng lonng left a comment


LGTM
(Prefer to use a variable to store the failpoint name)

@djshow832
Contributor Author

/run-unit-test

Contributor

@crazycs520 crazycs520 left a comment


LGTM.
BTW, the plan field doesn't need to store the real plan; storing the normalized plan is enough. Storing the real plan requires encoding it and compressing it with snappy, so won't it have some performance impact?

crazycs520
crazycs520 previously approved these changes Dec 25, 2019
@@ -876,6 +876,9 @@ func (a *ExecStmt) SummaryStmt() {
}
sessVars.SetPrevStmtDigest(digest)

plan := plannercore.EncodePlan(a.Plan)
Contributor

@crazycs520 crazycs520 Dec 25, 2019


No performance impact? Can we use the normalized plan directly?

Contributor


But it looks OK from the results of the benchmark report...

Contributor Author


I changed it to record the plan only the first time it is seen.
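
A minimal sketch of that record-only-once idea, assuming a summary element that caches its first observed sample plan (the type, fields, and encodePlan callback are hypothetical, not the exact merged code):

package main

import "fmt"

// stmtSummaryRecord caches the sample SQL and plan only when the record is
// first populated; later executions of the same (digest, plan digest) pair
// just update the counters and skip the relatively expensive plan encoding.
type stmtSummaryRecord struct {
	execCount  int64
	sampleSQL  string
	samplePlan string
}

func (r *stmtSummaryRecord) add(sql string, encodePlan func() string) {
	r.execCount++
	if r.samplePlan == "" {
		// First execution recorded for this key: remember the SQL and the plan.
		r.sampleSQL = sql
		r.samplePlan = encodePlan()
	}
}

func main() {
	r := &stmtSummaryRecord{}
	encode := func() string { return "IndexLookUp_10 -> IndexScan_8 + TableScan_9" }
	r.add("select * from t where a = 1", encode)
	r.add("select * from t where a = 2", encode) // the plan is not re-encoded here
	fmt.Println(r.execCount, r.samplePlan)
}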

@djshow832
Contributor Author

djshow832 commented Dec 26, 2019

LGTM.
BTW, the plan field doesn't need to store the real plan; storing the normalized plan is enough. Storing the real plan requires encoding it and compressing it with snappy, so won't it have some performance impact?

I thought about it, but the normalized plan is hard to read. For example:

0	37_1	0	table:tidb, index:VARIABLE_NAME

I don't know what 0 and 37_1 mean.

I considered recording the plan only the first time, but I also want the query_sample_text and plan fields to stay consistent.
So I ran a benchmark, which shows it brings only a little performance degradation.

sampleSQL := sei.OriginalSQL
if len(sampleSQL) > int(maxSQLLength) {
	// Make sure the memory of the original `sampleSQL` will be released.
	sampleSQL = string([]byte(sampleSQL[:maxSQLLength]))
}
Contributor


Do we need to attach the total length of the sample query, e.g. select * from tt (total: 2304 bytes)?

Contributor Author


Addressed.
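
A minimal sketch of how a truncation-with-total-length suffix could look (the helper name and suffix format are illustrative; the exact format used in the PR may differ):

package main

import "fmt"

// truncateSampleSQL keeps only the first maxSQLLength bytes of the sample SQL
// and appends the original total length, so readers of the summary table can
// see how much of the statement was cut off.
func truncateSampleSQL(sql string, maxSQLLength int) string {
	if len(sql) <= maxSQLLength {
		return sql
	}
	// Copy the prefix so the memory backing the original string can be released.
	prefix := string([]byte(sql[:maxSQLLength]))
	return fmt.Sprintf("%s(len:%d)", prefix, len(sql))
}

func main() {
	fmt.Println(truncateSampleSQL("select * from tt where id in (1,2,3,4,5)", 16))
	// prints: select * from tt(len:40)
}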

Contributor

@crazycs520 crazycs520 left a comment


LGTM

Contributor

@lonng lonng left a comment


LGTM

@lonng
Contributor

lonng commented Dec 26, 2019

/merge

@sre-bot sre-bot added the status/can-merge Indicates a PR has been approved by a committer. label Dec 26, 2019
@sre-bot
Contributor

sre-bot commented Dec 26, 2019

/run-all-tests

@sre-bot sre-bot merged commit fd3ada6 into pingcap:master Dec 26, 2019
@djshow832
Contributor Author

/run-cherry-picker

@sre-bot
Contributor

sre-bot commented Dec 30, 2019

cherry pick to release-3.0 in PR #14285

Labels
status/can-merge (Indicates a PR has been approved by a committer.), type/usability