etcd (ticdc): add grpc keepalive params and add timeout for check pd version ctx. #9106
This pull request has been accepted and is ready to merge. Commit hash: 2bf6322
@asddongmen: Your PR was out of date, I have automatically updated it for you. At the same time I will also trigger all tests for you: /run-all-tests
In response to a cherrypick label: new pull request created to branch
Signed-off-by: ti-chi-bot <[email protected]>
What problem does this PR solve?
Issue Number: close #8808
What is changed and how it works?
The changes are small: this PR adds gRPC keepalive parameters to the etcd client and a timeout to the context used for the PD version check.
In 30 test runs so far, changefeed delay ranged from zero to 5 minutes, with an average of about 1 minute. The delay occurs because some CDC nodes restart their capture after an etcd session disconnect. Increasing capture-session-ttl slightly reduces the probability of this happening, but cannot eliminate it.
This PR does not completely solve CDC delays when PD leader I/O hang errors are injected. Test results are currently inconsistent: 5 out of 10 runs showed the problem and 5 did not.
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note