dm: fix validator deadlock and enhance retry #9522

Merged (3 commits) on Aug 11, 2023

Conversation

D3Hunter
Contributor

@D3Hunter D3Hunter commented Aug 8, 2023

What problem does this PR solve?

Issue Number: close #9257

What is changed and how it works?

  • Fix a deadlock between the error-processing routine and the worker routines that occurs when the error channel is full
  • Remove the object lock taken in fillResult
  • Improve retry by reusing the resumable-error checks from syncer
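
The diff itself is not shown in this conversation, so the following is only a minimal sketch of the general pattern behind the first item, not the code in this PR. It assumes workers report errors over a small buffered channel while the error-processing routine stops after the first error and waits for the workers to exit. The names `reportErr` and `runWorkers` are hypothetical. With a plain `errCh <- err`, workers that find the channel full would block forever and `wg.Wait()` would never return; selecting on `ctx.Done()` gives them a way out.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// reportErr hands err to the error-processing routine. It returns false if
// ctx is cancelled before the send completes, so a worker never blocks
// forever on a full channel. (Hypothetical helper, not from the PR.)
func reportErr(ctx context.Context, errCh chan<- error, err error) bool {
	select {
	case errCh <- err:
		return true
	case <-ctx.Done():
		return false
	}
}

// runWorkers starts n (n >= 1) workers that each report one error into a
// channel of capacity 1. The error processor handles the first error,
// cancels the context, and waits for every worker to exit.
func runWorkers(n int) (delivered, dropped int64) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	errCh := make(chan error, 1)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if reportErr(ctx, errCh, fmt.Errorf("worker %d failed", id)) {
				atomic.AddInt64(&delivered, 1)
			} else {
				atomic.AddInt64(&dropped, 1)
			}
		}(i)
	}

	<-errCh   // error-processing routine: take the first error
	cancel()  // tell the remaining workers to stop
	wg.Wait() // returns because blocked senders observe ctx.Done()
	return delivered, dropped
}

func main() {
	delivered, dropped := runWorkers(8)
	fmt.Println("delivered:", delivered, "dropped:", dropped)
}
```

The exact split between delivered and dropped errors depends on scheduling; the point of the sketch is only that every worker exits and the wait completes.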

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

dm: fix validator deadlock and enhance retry

@ti-chi-bot ti-chi-bot bot added the needs-cherry-pick-release-6.5 (Should cherry pick this PR to release-6.5 branch.), release-note (Denotes a PR that will be considered when it comes time to generate release notes.), and needs-cherry-pick-release-7.1 (Should cherry pick this PR to release-7.1 branch.) labels Aug 8, 2023
@D3Hunter D3Hunter added the area/dm (Issues or PRs related to DM.) label Aug 8, 2023
@ti-chi-bot ti-chi-bot bot added the size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) label Aug 8, 2023
@D3Hunter
Contributor Author

D3Hunter commented Aug 8, 2023

/retest

@D3Hunter D3Hunter requested a review from lance6716 August 9, 2023 02:25
@D3Hunter
Contributor Author

@lance6716 @GMHDBJD

@D3Hunter D3Hunter requested a review from GMHDBJD August 10, 2023 07:13
Contributor

@lance6716 lance6716 left a comment

will review later

@@ -199,7 +204,7 @@ type DataValidator struct {

// fields in this field block are guarded by stateMutex
Contributor

seems we can update the comment about which members this mutex guards

Contributor Author

which field do you mean? the range is not changed

if err == context.Canceled {
	return false
}

Contributor

As its name suggests, should we check that it's a DB error, e.g. an error instance of mysql.MySQLError?

Contributor Author

No. As the comment says, this is a blacklist check: we only filter out known non-retriable errors.

This method checks whether a DB operation can be retried, which covers not just DB errors but network/connection errors too.

Contributor

@GMHDBJD GMHDBJD left a comment

LGTM

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm (Indicates a PR needs 1 more LGTM.) and approved labels Aug 10, 2023
@ti-chi-bot ti-chi-bot bot added the lgtm label Aug 11, 2023
@ti-chi-bot
Contributor

ti-chi-bot bot commented Aug 11, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: GMHDBJD, lance6716

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot removed the needs-1-more-lgtm (Indicates a PR needs 1 more LGTM.) label Aug 11, 2023
@ti-chi-bot
Contributor

ti-chi-bot bot commented Aug 11, 2023

[LGTM Timeline notifier]

Timeline:

  • 2023-08-10 08:29:47.091462944 +0000 UTC m=+187751.640478932: ☑️ agreed by GMHDBJD.
  • 2023-08-11 02:23:58.002139264 +0000 UTC m=+252202.551155250: ☑️ agreed by lance6716.

@lance6716
Contributor

/retest

@ti-chi-bot ti-chi-bot bot merged commit 4738442 into master Aug 11, 2023
ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request Aug 11, 2023
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-6.5: #9544.

@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-7.1: #9545.

@D3Hunter D3Hunter deleted the validator-deadlock branch August 11, 2023 03:40
ti-chi-bot bot pushed a commit that referenced this pull request Aug 11, 2023
ti-chi-bot bot pushed a commit that referenced this pull request Aug 13, 2023
3AceShowHand pushed a commit to 3AceShowHand/tiflow that referenced this pull request Aug 29, 2023
Labels

  • approved
  • area/dm: Issues or PRs related to DM.
  • lgtm
  • needs-cherry-pick-release-6.5: Should cherry pick this PR to release-6.5 branch.
  • needs-cherry-pick-release-7.1: Should cherry pick this PR to release-7.1 branch.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

dm validator should retry on transient error and there should be no deadlock when bad thing happens
4 participants