kv/client: support reconnect regions when no event received too long (#1591,#1674,#1686) #1668

Merged

ti-srebot
Contributor

@ti-srebot ti-srebot commented Apr 14, 2021

cherry-pick #1591 to release-4.0
You can switch your code base to this Pull Request by using git-extras:

# In ticdc repo:
git pr https://github.com/pingcap/ticdc/pull/1668

After applying your modifications, you can push your changes to this PR via:

git push git@github.com:ti-srebot/ticdc.git pr/1668:release-4.0-fd90b40f37e4

What problem does this PR solve?

Solves https://github.com/pingcap/ticdc/issues/1586 (this is the master PR; the release-5.0 PR is https://github.com/pingcap/ticdc/pull/1589).

NOTE (needs discussion):

  1. The check only works after the region is initialized. Is this reasonable?
  2. The reconnect interval (15m) is hard-coded. We need to discuss whether to add an option for it to the server config file.

After discussion:

  1. Keep the check after the region is initialized, because the incremental scan phase can take a long time.
  2. We don't need to add the interval to the server config file, since the check only runs in the real-time data phase.

What is changed and how it works?

  • Reuse the kv event aliveness check in the lock resolver tick
  • If a region doesn't receive any event for more than 15 minutes, force the gRPC stream to reconnect
  • Only kv client v1 is updated; kv client v2 will be updated later

Check List

Tests

  • Unit test
  • Integration test

Release note

  • Force the gRPC stream to reconnect if no event (including resolved ts event) is received for more than 15 minutes.

@ti-srebot
Contributor Author

/run-all-tests

@ti-srebot ti-srebot added component/kv-client TiKV kv log client component. status/ptal Could you please take a look? size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. type/4.0-cherry-pick labels Apr 14, 2021
@ti-srebot ti-srebot requested review from zier-one and overvenus April 14, 2021 13:00
@ti-chi-bot ti-chi-bot requested a review from amyangfei April 14, 2021 13:00
@ti-srebot ti-srebot added this to the v4.0.13 milestone Apr 14, 2021
@sre-bot

sre-bot commented Apr 14, 2021

@amyangfei amyangfei force-pushed the release-4.0-fd90b40f37e4 branch from 3dbacff to 4efc35c Compare April 28, 2021 03:06
@ti-chi-bot ti-chi-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 28, 2021
@amyangfei amyangfei changed the base branch from release-4.0 to release-4.0-pending April 28, 2021 03:07
@ti-chi-bot ti-chi-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 28, 2021
@amyangfei
Contributor

/run-all-tests

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Apr 28, 2021
@amyangfei
Contributor

/run-all-tests

@amyangfei
Contributor

/run-all-tests

@codecov-commenter

Codecov Report

❗ No coverage uploaded for pull request base (release-4.0-pending@6602c6a).
The diff coverage is n/a.

@@                   Coverage Diff                    @@
##             release-4.0-pending      #1668   +/-   ##
========================================================
  Coverage                       ?   50.6473%           
========================================================
  Files                          ?        152           
  Lines                          ?      15835           
  Branches                       ?          0           
========================================================
  Hits                           ?       8020           
  Misses                         ?       6978           
  Partials                       ?        837           

@zier-one
Contributor

/lgtm

@ti-chi-bot
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • amyangfei
  • leoppro

To complete the pull request process, please ask the reviewers in the list to review by writing /cc @reviewer in a comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to a committer in the list by writing /assign @committer in a comment, so that they can help you merge it.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by writing /lgtm in a comment.
Reviewer can cancel approval by writing /lgtm cancel in a comment.

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: f9c5cf9

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Apr 29, 2021
@amyangfei
Contributor

/run-all-tests

@amyangfei
Contributor

/run-kafka-tests

@ti-chi-bot ti-chi-bot removed the status/can-merge Indicates a PR has been approved by a committer. label Apr 29, 2021
@amyangfei
Contributor

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: f3db866

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Apr 29, 2021
@amyangfei
Contributor

Cherry-picked #1674 together with this PR; closes #1679.

@amyangfei
Contributor

/run-unit-tests

@amyangfei
Contributor

/run-integration-tests

@amyangfei
Contributor

/run-all-tests

@amyangfei
Contributor

/run-kafka-tests

@amyangfei amyangfei changed the title from "kv/client: support reconnect regions when no event received too long (#1591)" to "kv/client: support reconnect regions when no event received too long (#1591,#1674,#1686)" Apr 29, 2021
@ti-chi-bot ti-chi-bot removed the status/can-merge Indicates a PR has been approved by a committer. label Apr 29, 2021
@amyangfei
Contributor

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 53bb12c

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Apr 29, 2021
@ti-chi-bot
Member

@ti-srebot: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@amyangfei
Contributor

/run-all-tests

@amyangfei
Contributor

/run-leak-tests

@amyangfei amyangfei merged commit 8158b03 into pingcap:release-4.0-pending Apr 29, 2021
Labels
component/kv-client TiKV kv log client component. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. status/ptal Could you please take a look?
6 participants