[store/tikv] support batch coprocessor for TiFlash #16030

Merged
merged 19 commits into from
Apr 9, 2020

Member

@hanfei1991 hanfei1991 commented Apr 2, 2020

What problem does this PR solve?

Send batch coprocessor requests, each covering multiple regions, to the TiFlash engine.

Problem Summary:

What is changed and how it works?

What's Changed:
Previously, TiDB sent requests to TiFlash one region per request. Most queries sent to TiFlash involve a large number of regions, so this approach has a huge overhead. It also means TiFlash cannot decide its runtime concurrency by itself.

How it Works:

  • First, we squash multiple copTasks into one large batchCopTask and encode it as the protocol message coprocessor.BatchRequest.
  • Then, the request is served over a gRPC streaming call.
  • Notably, TiFlash handles RegionError and lock errors by itself, and only returns other errors, which cannot be retried.
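
The squashing step above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the `copTask` and `batchCopTask` structs are simplified stand-ins with hypothetical fields, and grouping tasks by store address is an assumption about how regions are combined into one request.

```go
package main

import (
	"fmt"
	"sort"
)

// copTask is a simplified stand-in for a per-region coprocessor task.
// The fields are hypothetical and chosen only for illustration.
type copTask struct {
	regionID  uint64
	storeAddr string
}

// batchCopTask is a simplified stand-in for a task that carries many regions.
type batchCopTask struct {
	storeAddr string
	regionIDs []uint64
}

// squashCopTasks groups per-region tasks by store address (an assumption),
// so that each store gets one batch task instead of one request per region.
func squashCopTasks(tasks []copTask) []batchCopTask {
	byStore := make(map[string]*batchCopTask)
	for _, t := range tasks {
		b, ok := byStore[t.storeAddr]
		if !ok {
			b = &batchCopTask{storeAddr: t.storeAddr}
			byStore[t.storeAddr] = b
		}
		b.regionIDs = append(b.regionIDs, t.regionID)
	}
	batches := make([]batchCopTask, 0, len(byStore))
	for _, b := range byStore {
		batches = append(batches, *b)
	}
	// Sort by store address so the result is deterministic;
	// Go map iteration order is not.
	sort.Slice(batches, func(i, j int) bool {
		return batches[i].storeAddr < batches[j].storeAddr
	})
	return batches
}

func main() {
	tasks := []copTask{
		{regionID: 1, storeAddr: "tiflash-0"},
		{regionID: 2, storeAddr: "tiflash-1"},
		{regionID: 3, storeAddr: "tiflash-0"},
	}
	for _, b := range squashCopTasks(tasks) {
		fmt.Printf("%s: %d regions\n", b.storeAddr, len(b.regionIDs))
	}
}
```

With three region tasks spread over two stores, the sketch produces two batch tasks, which is where the saving over one request per region comes from.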

Related changes

We update kvproto: pingcap/kvproto#586

Side effects

We use the system variable tidb_allow_batch_cop to switch batch cop mode on or off. It has no effect on requests sent to TiKV. For TiFlash, by default it only affects queries such as 'Aggregation' and 'TopN'. The risk is therefore very low.
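
As a rough sketch of the switch described above (not the actual implementation): the `useBatchCop` function, the `storeType` constants, and the `queryKind` enum are all hypothetical names, and modelling the variable as a boolean is an assumption. The sketch only encodes what is stated here: the variable does nothing for TiKV, and for TiFlash it applies to aggregation and TopN queries by default.

```go
package main

import "fmt"

// storeType and queryKind are hypothetical enums used only for this sketch.
type storeType int

const (
	storeTiKV storeType = iota
	storeTiFlash
)

type queryKind int

const (
	kindTableScan queryKind = iota
	kindAggregation
	kindTopN
)

// useBatchCop reports whether a request would go through batch cop mode.
// allowBatchCop stands in for tidb_allow_batch_cop, modelled as a bool
// (an assumption made for illustration).
func useBatchCop(allowBatchCop bool, st storeType, kind queryKind) bool {
	if !allowBatchCop || st != storeTiFlash {
		// The switch is off, or the request targets TiKV: no effect.
		return false
	}
	// By default only aggregation-like and TopN queries are affected.
	return kind == kindAggregation || kind == kindTopN
}

func main() {
	fmt.Println(useBatchCop(true, storeTiFlash, kindAggregation))
	fmt.Println(useBatchCop(true, storeTiKV, kindAggregation))
	fmt.Println(useBatchCop(true, storeTiFlash, kindTableScan))
}
```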

TPC-H 50G benchmark result

Query   Old     New
q1      14s     11.5s
q6      3.7s    2.6s

Release note

@hanfei1991 hanfei1991 requested a review from a team as a code owner April 2, 2020 16:54
@ghost ghost requested review from wshwsh12 and removed request for a team April 2, 2020 16:54
@github-actions github-actions bot added the sig/execution SIG execution label Apr 2, 2020
@hanfei1991 hanfei1991 requested a review from lzmhhh123 April 4, 2020 16:39

codecov bot commented Apr 7, 2020

Codecov Report

Merging #16030 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #16030   +/-   ##
===========================================
  Coverage   80.5758%   80.5758%           
===========================================
  Files           506        506           
  Lines        137077     137077           
===========================================
  Hits         110451     110451           
  Misses        18133      18133           
  Partials       8493       8493           

@wshwsh12 wshwsh12 self-requested a review April 7, 2020 17:02

func (c *CopClient) sendBatch(ctx context.Context, req *kv.Request, vars *kv.Variables) kv.Response {
	if req.KeepOrder || req.Desc {
		return copErrorResponse{errors.New("batch coprocessor cannot prove keep order or desc property")}
Contributor

TiFlash can keep order for a handle scan. But when batch cop requests are enabled, there is no guarantee that the handle column stays in order. So we should stop the planner from satisfying order properties in that case.

Member Author

It's not easy to take allow_batch into account in the planner. For safety and convenience, I will check KeepOrder in executor/builder.go; if it is true, batch_cop is forced to false.

This part of the code remains as protection.
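
A minimal sketch of the two checks discussed here, assuming a simplified request struct. The `request` type, its `BatchCop` field, and the `buildRequest` helper are hypothetical; only the `KeepOrder` and `Desc` field names and the error message come from the snippet under review.

```go
package main

import (
	"errors"
	"fmt"
)

// request is a simplified, hypothetical stand-in for the real request type.
type request struct {
	KeepOrder bool
	Desc      bool
	BatchCop  bool
}

// buildRequest mirrors the builder-side check described above:
// a request that must keep order never uses batch cop.
func buildRequest(keepOrder, desc, allowBatchCop bool) request {
	req := request{KeepOrder: keepOrder, Desc: desc, BatchCop: allowBatchCop}
	if req.KeepOrder {
		req.BatchCop = false
	}
	return req
}

// sendBatch keeps the protective check from the reviewed snippet, so an
// ordered request that still reaches the batch path fails loudly.
func sendBatch(req request) error {
	if req.KeepOrder || req.Desc {
		return errors.New("batch coprocessor cannot prove keep order or desc property")
	}
	return nil
}

func main() {
	req := buildRequest(true, false, true)
	fmt.Println("batch cop enabled:", req.BatchCop)

	// A request that bypassed the builder check is rejected by sendBatch.
	err := sendBatch(request{KeepOrder: true, BatchCop: true})
	fmt.Println("sendBatch error:", err)
}
```

The builder-side check is the primary guard in this sketch; the check inside `sendBatch` only catches requests that slipped past it.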

Contributor

OK, that's another solution.

Contributor

@lzmhhh123 lzmhhh123 left a comment

LGTM, please resolve the conflicts and fix CI.

Contributor

@wshwsh12 wshwsh12 left a comment

Rest LGTM.

Contributor

@wshwsh12 wshwsh12 left a comment

LGTM

Contributor

@lzmhhh123 lzmhhh123 left a comment

LGTM

@lzmhhh123 lzmhhh123 added the status/can-merge and status/LGT2 labels and removed the status/DNM label Apr 9, 2020
Contributor

sre-bot commented Apr 9, 2020

/run-all-tests

Contributor

sre-bot commented Apr 9, 2020

cherry pick to release-4.0 in PR #16226
