scheduler: support the hot-region-scheduler to generate multiple operators at the same time #4931

Merged (4 commits) on Jul 14, 2022

@HunDunDM HunDunDM commented May 12, 2022

What problem does this PR solve?

Issue Number: Ref #4949

What is changed and how does it work?

Allows a solution to contain multiple operators. This expands the search space, so that in some cases a better solution can be found than with a single operator.
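To illustrate why a multi-operator solution can beat any single operator, here is a minimal sketch. It is not the code from this PR: the `region`, `move`, `imbalance`, and `bestSolution` names, the two-store setup, and the load numbers are all hypothetical and chosen only for the example.

```go
package main

import (
	"fmt"
	"math"
)

// region is a hypothetical hot region with a single load value.
type region struct {
	id    int
	store string
	load  float64
}

// move is a hypothetical stand-in for an operator: put one region on dst.
type move struct {
	regionID int
	dst      string
}

// other returns the opposite store in this two-store toy cluster.
func other(store string) string {
	if store == "a" {
		return "b"
	}
	return "a"
}

// imbalance returns |load(a) - load(b)| after applying the given moves.
func imbalance(regions []region, moves []move) float64 {
	target := map[int]string{}
	for _, m := range moves {
		target[m.regionID] = m.dst
	}
	loads := map[string]float64{}
	for _, r := range regions {
		store := r.store
		if dst, ok := target[r.id]; ok {
			store = dst
		}
		loads[store] += r.load
	}
	return math.Abs(loads["a"] - loads["b"])
}

// bestSolution tries every single move and, when allowPair is set, every
// pair made of one move plus a second move in the opposite direction.
func bestSolution(regions []region, allowPair bool) ([]move, float64) {
	var best []move
	bestScore := imbalance(regions, nil) // doing nothing is the baseline
	try := func(candidate []move) {
		if score := imbalance(regions, candidate); score < bestScore {
			best, bestScore = candidate, score
		}
	}
	for _, r := range regions {
		first := move{regionID: r.id, dst: other(r.store)}
		try([]move{first})
		if !allowPair {
			continue
		}
		for _, back := range regions {
			if back.store == r.store {
				continue // the second move must go the other way
			}
			try([]move{first, {regionID: back.id, dst: r.store}})
		}
	}
	return best, bestScore
}

func main() {
	regions := []region{{1, "a", 60}, {2, "a", 40}, {3, "b", 50}, {4, "b", 30}}
	single, singleScore := bestSolution(regions, false)
	pair, pairScore := bestSolution(regions, true)
	fmt.Println("single-move search:", single, singleScore)
	fmt.Println("pair search:", pair, pairScore)
}
```

With these made-up loads, no single move improves on the baseline imbalance of 20, while a pair of opposite moves brings both stores to the same load.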

Check List

Tests

  • Unit test

Release note

Optimized the search strategy of the balance-hot-region-scheduler.

ti-chi-bot commented May 12, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • lhy1024
  • nolouch

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added do-not-merge/needs-linked-issue do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels May 12, 2022
@ti-chi-bot ti-chi-bot requested a review from rleungx May 12, 2022 06:33
@HunDunDM HunDunDM requested review from lhy1024 and removed request for rleungx May 12, 2022 06:34
@HunDunDM HunDunDM force-pushed the multi-op branch 4 times, most recently from dd984d3 to d97bd4f Compare May 12, 2022 19:20
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 16, 2022
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 17, 2022
@HunDunDM HunDunDM marked this pull request as ready for review May 17, 2022 19:15
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 17, 2022

codecov bot commented May 17, 2022

Codecov Report

Merging #4931 (09fd67f) into master (34a4cce) will decrease coverage by 0.24%.
The diff coverage is 88.46%.

@@            Coverage Diff             @@
##           master    #4931      +/-   ##
==========================================
- Coverage   75.87%   75.62%   -0.25%     
==========================================
  Files         311      311              
  Lines       30830    30909      +79     
==========================================
- Hits        23392    23375      -17     
- Misses       5450     5527      +77     
- Partials     1988     2007      +19     
| Flag | Coverage Δ |
|---|---|
| unittests | 75.62% <88.46%> (-0.25%) ⬇️ |

| Impacted Files | Coverage Δ |
|---|---|
| server/schedulers/hot_region.go | 84.17% <88.46%> (+0.65%) ⬆️ |
| pkg/dashboard/keyvisual/input/core.go | 0.00% <0.00%> (-33.34%) ⬇️ |
| server/tso/local_allocator.go | 64.86% <0.00%> (-13.52%) ⬇️ |
| server/schedulers/shuffle_hot_region.go | 55.55% <0.00%> (-10.11%) ⬇️ |
| pkg/dashboard/adapter/manager.go | 79.31% <0.00%> (-6.90%) ⬇️ |
| server/storage/endpoint/meta.go | 64.04% <0.00%> (-4.50%) ⬇️ |
| server/tso/allocator_manager.go | 61.39% <0.00%> (-4.33%) ⬇️ |
| pkg/etcdutil/etcdutil.go | 84.88% <0.00%> (-3.49%) ⬇️ |
| server/tso/tso.go | 67.79% <0.00%> (-3.39%) ⬇️ |
| server/schedule/hbstream/heartbeat_streams.go | 72.72% <0.00%> (-2.03%) ⬇️ |
| ... and 13 more | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

nolouch commented Jun 8, 2022

ptal @lhy1024

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 8, 2022
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 8, 2022
@HunDunDM HunDunDM requested a review from nolouch June 9, 2022 04:09
@@ -497,8 +537,43 @@ func (bs *balanceSolver) solve() []*operator.Operator {
bs.cur.dstStore = dstStore
bs.calcProgressiveRank()
tryUpdateBestSolution()

if searchRevertRegions && (bs.cur.progressiveRank >= -1 && bs.cur.progressiveRank <= 0) &&

Suggested change:
- if searchRevertRegions && (bs.cur.progressiveRank >= -1 && bs.cur.progressiveRank <= 0) &&
+ if searchRevertRegions && (bs.cur.progressiveRank == -1 || bs.cur.progressiveRank == 0) &&


I think we should support rank1 later.
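
For reference, the two forms of the condition discussed in this thread select the same ranks as long as `progressiveRank` is an integer type (an assumption here; the snippet below uses `int64`). The helper names `inRange` and `isMinusOneOrZero` are made up for this sketch:

```go
package main

import "fmt"

// inRange mirrors the range form of the condition from the diff.
func inRange(rank int64) bool {
	return rank >= -1 && rank <= 0
}

// isMinusOneOrZero mirrors the equality form from the suggested change.
func isMinusOneOrZero(rank int64) bool {
	return rank == -1 || rank == 0
}

func main() {
	// Print both results side by side for a few ranks around the boundary.
	for rank := int64(-4); rank <= 2; rank++ {
		fmt.Println(rank, inRange(rank), isMinusOneOrZero(rank))
	}
}
```

Since the two agree for every integer, the choice is mostly about readability and about which form is easier to extend if more ranks are supported later.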

server/schedulers/hot_region.go (outdated review thread, resolved)

@lhy1024 lhy1024 left a comment


When resources are sufficient, the master and multi-op branches behave similarly. If a solution can be found (split.qps-threshold=500), there is almost no scheduling after stabilization; if a solution cannot be found (split.qps-threshold=3000), scheduling continues.

When resources are insufficient, the peak scheduling count of multi-op is higher.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jun 30, 2022

@nolouch nolouch left a comment


rest LGTM

break
}
bs.decorateOperator(currentOp, true, targetLabel, sourceLabel, typ, dim)
ops = append(ops, currentOp)

I wonder if the RandBuckets in waiting_operator will be affected. GetOperator may only get the first operator and miss the second one.
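
The concern can be sketched with a toy queue. This is not PD's RandBuckets or its GetOperator; `op`, `waitingQueue`, `put`, `popFirstOnly`, and `popGroup` are hypothetical names used only to show the failure mode, assuming operators from one scheduling decision are enqueued together:

```go
package main

import "fmt"

// op is a hypothetical stand-in for a scheduling operator.
type op struct{ desc string }

// waitingQueue is a hypothetical FIFO of operator groups.
type waitingQueue struct{ groups [][]op }

// put enqueues all operators from one scheduling decision as one group.
func (q *waitingQueue) put(group ...op) {
	if len(group) == 0 {
		return
	}
	q.groups = append(q.groups, group)
}

// popFirstOnly dequeues a group but returns only its first operator,
// silently dropping the rest. This is the behavior the review worries about.
func (q *waitingQueue) popFirstOnly() (op, bool) {
	if len(q.groups) == 0 {
		return op{}, false
	}
	g := q.groups[0]
	q.groups = q.groups[1:]
	return g[0], true
}

// popGroup dequeues and returns the whole group, so operators that were
// generated together are also handed out together.
func (q *waitingQueue) popGroup() []op {
	if len(q.groups) == 0 {
		return nil
	}
	g := q.groups[0]
	q.groups = q.groups[1:]
	return g
}

func main() {
	q := &waitingQueue{}
	q.put(op{"move hot peer"}, op{"revert peer"})
	first, _ := q.popFirstOnly()
	fmt.Println("first only:", first.desc, "remaining groups:", len(q.groups))

	q.put(op{"move hot peer"}, op{"revert peer"})
	fmt.Println("whole group:", q.popGroup())
}
```

If the second operator only makes sense together with the first, a consumer that behaves like `popFirstOnly` would leave the solution half applied, which is why the dequeue path needs to treat the operators as a unit.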

server/schedulers/hot_region.go (outdated review thread, resolved)
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 13, 2022
Signed-off-by: HunDunDM <[email protected]>
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 14, 2022
Signed-off-by: HunDunDM <[email protected]>
@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jul 14, 2022

nolouch commented Jul 14, 2022

/merge

@ti-chi-bot

@nolouch: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot

This pull request has been accepted and is ready to merge.

Commit hash: 28204c0

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Jul 14, 2022
@ti-chi-bot ti-chi-bot merged commit 8ab5c16 into tikv:master Jul 14, 2022
@HunDunDM HunDunDM deleted the multi-op branch July 14, 2022 07:45
lhy1024 added a commit to lhy1024/pd that referenced this pull request Aug 3, 2022
…ple operators at the same time (tikv#4931)"

This reverts commit 8ab5c16.

Signed-off-by: lhy1024 <[email protected]>

Conflicts:
	server/schedulers/hot_region_test.go
ti-chi-bot pushed a commit that referenced this pull request Aug 4, 2022

lhy1024 commented Sep 9, 2022

/run-build-arm64 comment=true

Labels
release-note Denotes a PR that will be considered when it comes time to generate release notes. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
5 participants