owner(ticdc): Add backoff mechanism into changefeed restart logic #4262

Merged

zhaoxinyu (Contributor) commented Jan 10, 2022

What problem does this PR solve?

Issue Number: close #3329 close #3987

What is changed and how it works?

Before this optimization:
When errors occurred in a changefeed, a simple backoff was used to restart it:
if three errors occurred within two minutes, the changefeed stayed in the "error" state and was not restarted immediately.

After this optimization:
The changefeed restart logic now uses an exponential backoff mechanism, which works as follows:

  1. When a changefeed in the "error" state needs to be restarted, the restart interval grows from InitialInterval up to MaxInterval. If the time elapsed since the backoff started exceeds MaxElapsedTime, the changefeed is switched to the "failed" state and is not restarted until it is manually resumed.
  2. If the changefeed has been running normally for a period of time and then encounters an error, the backoff is reset and the restart interval starts again from InitialInterval.

Check List

Tests

  • Unit test
  • Integration test

Release note

Add exponential backoff mechanism for restarting a changefeed.

ti-chi-bot (Member) commented Jan 10, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • asddongmen
  • overvenus

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 10, 2022
@asddongmen asddongmen added area/ticdc Issues or PRs related to TiCDC. component/owner Owner component. labels Jan 10, 2022
@zhaoxinyu zhaoxinyu requested a review from overvenus January 10, 2022 09:13
@zhaoxinyu zhaoxinyu changed the title from "owner(ticdc): add backoff machanism into changefeed restart logic" to "owner(ticdc): Add backoff mechanism into changefeed restart logic" Jan 10, 2022
@overvenus overvenus added needs-cherry-pick-release-4.0 Should cherry pick this PR to release-4.0 branch. needs-cherry-pick-release-5.0 Should cherry pick this PR to release-5.0 branch. needs-cherry-pick-release-5.1 Should cherry pick this PR to release-5.1 branch. needs-cherry-pick-release-5.2 Should cherry pick this PR to release-5.2 branch. needs-cherry-pick-release-5.3 Should cherry pick this PR to release-5.3 branch. needs-cherry-pick-release-5.4 Should cherry pick this PR to release-5.4 branch. labels Jan 10, 2022
@overvenus overvenus added the type/bugfix This PR fixes a bug. label Jan 10, 2022
codecov-commenter commented Jan 11, 2022

Codecov Report

Merging #4262 (a731daf) into master (08da001) will increase coverage by 0.5104%.
The diff coverage is 52.2665%.

Flag Coverage Δ
cdc 59.4631% <64.6959%> (+0.8183%) ⬆️
dm 52.4802% <37.8669%> (+0.2257%) ⬆️


@@               Coverage Diff                @@
##             master      #4262        +/-   ##
================================================
+ Coverage   55.1722%   55.6826%   +0.5104%     
================================================
  Files           485        495        +10     
  Lines         59829      60843      +1014     
================================================
+ Hits          33009      33879       +870     
- Misses        23484      23545        +61     
- Partials       3336       3419        +83     

@Rustin170506 (Member)

/run-kafka-integration-test

@zhaoxinyu (Contributor Author)

/run-kafka-integration-test

@zhaoxinyu (Contributor Author)

/run-kafka-integration-test

@overvenus (Member)

/run-kafka-integration-test
@ti-chi-bot ti-chi-bot merged commit 58c7cc3 into pingcap:master Jan 14, 2022
ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request Jan 14, 2022
@ti-chi-bot (Member)

In response to a cherrypick label: new pull request created: #4335.

ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request Jan 14, 2022
@ti-chi-bot (Member)

In response to a cherrypick label: new pull request created: #4336.

ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request Jan 14, 2022
@ti-chi-bot (Member)

In response to a cherrypick label: new pull request created: #4337.

ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request Jan 14, 2022
@ti-chi-bot (Member)

In response to a cherrypick label: new pull request created: #4338.

ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request Jan 14, 2022
@ti-chi-bot (Member)

In response to a cherrypick label: new pull request created: #4339.

@ti-chi-bot (Member)

In response to a cherrypick label: new pull request created: #4340.

Labels
area/ticdc Issues or PRs related to TiCDC. component/owner Owner component. needs-cherry-pick-release-4.0 Should cherry pick this PR to release-4.0 branch. needs-cherry-pick-release-5.0 Should cherry pick this PR to release-5.0 branch. needs-cherry-pick-release-5.1 Should cherry pick this PR to release-5.1 branch. needs-cherry-pick-release-5.2 Should cherry pick this PR to release-5.2 branch. needs-cherry-pick-release-5.3 Should cherry pick this PR to release-5.3 branch. needs-cherry-pick-release-5.4 Should cherry pick this PR to release-5.4 branch. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/bugfix This PR fixes a bug.
Development

Successfully merging this pull request may close these issues.

Need a better way for changefeed retry on some errors such as CDC:ErrJSONCodecRowTooLarge
9 participants