Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

owner(cdc): asynchronously close the ddl sink #4249

Closed

Conversation

3AceShowHand
Copy link
Contributor

@3AceShowHand 3AceShowHand commented Jan 7, 2022

What problem does this PR solve?

Issue Number: close #4241, #4384

when closing the ddlSink with Kafka as the sink, it would block the owner or processorManager for a period of time, which had a negative impact on all running changefeeds.

  • For Kafka, this would happen when all brokers were shut down.

What is changed and how it works?

At the owner side:

  • close theddlSink asynchronsly
  • for changefeed initialization, introduce changeFeedRunningStatus, at the moment for changeFeedClosed, changeFeedRunning, changeFeedClosing.

changeFeedClose -> changeFeedRunning if initialize the changefeed.

changeFeedRunning -> changefeedClosing if close the changefeed

changeFeedClosing -> changeFeedClosed if the downstream sink fully closed.

without introducing changeFeedClosing, a closing changefeed may be reinitializing at the same time, also may be closing multiple times.

At the processorManager side:

  • initialize and close processor's sinkManager in an asynchronous way.
  • Like above, introduce processorRunningStatus, at the moment for processorClosed, processorInitializing, processorRunning, processorClosing.

processorClosed -> processorInitializing, if try to initialize the sink
processorInitializing -> processorRunning, if the sinkManager is successfully initialized.
processorRunning -> processorClosing, if try to close the processor.
processorClosing -> processorClosed if the sinkManager is fully closed, not matter success or not.

  • TablePipeline should not be created before the status is processorRunning.

Check List

Tests

  • Manually test
  • create 2 changefeed, 1 for MySQL, 1 for Kafka with open protocol, list changefeed show that two works fine.
  • kill -9 the Kafka broker, and wait for 1 minute, logs show that owner receives an error and try to close the Kafka changefeed.
  • During the Kafka closing period, which would cost another 1 minute, MySQL Changefeed still works normally.

Code changes

Side effects

Related changes

  • Need to cherry-pick to the release branch

Release note

fix the owner may be blocked when closing the changefeed with Kafka sink which has a bad network connection between Kafka brokers.

@ti-chi-bot
Copy link
Member

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added do-not-merge/needs-linked-issue release-note Denotes a PR that will be considered when it comes time to generate release notes. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 7, 2022
@codecov-commenter
Copy link

codecov-commenter commented Jan 7, 2022

Codecov Report

Merging #4249 (7443202) into master (e469fb3) will decrease coverage by 0.4806%.
The diff coverage is 48.9864%.

Flag Coverage Δ
cdc 59.1842% <46.0869%> (-1.0032%) ⬇️
dm 52.0540% <59.0909%> (-0.0496%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

@@               Coverage Diff                @@
##             master      #4249        +/-   ##
================================================
- Coverage   55.8077%   55.3270%   -0.4807%     
================================================
  Files           495        495                
  Lines         61246      61422       +176     
================================================
- Hits          34180      33983       -197     
- Misses        23614      24038       +424     
+ Partials       3452       3401        -51     

@3AceShowHand 3AceShowHand force-pushed the kafka-4241-out-of-brokers branch from 4c54774 to f8ee1dd Compare January 10, 2022 05:49
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 11, 2022
@3AceShowHand 3AceShowHand changed the title Kafka 4241 out of brokers owner(cdc): asynchronously close the ddl sink Jan 11, 2022
@3AceShowHand
Copy link
Contributor Author

/run-all-tests

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. release-note-none Denotes a PR that doesn't merit a release note. labels Jan 11, 2022
@3AceShowHand
Copy link
Contributor Author

/run-all-tests

2 similar comments
@3AceShowHand
Copy link
Contributor Author

/run-all-tests

@3AceShowHand
Copy link
Contributor Author

/run-all-tests

@ti-chi-bot ti-chi-bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 14, 2022
@3AceShowHand 3AceShowHand force-pushed the kafka-4241-out-of-brokers branch from 8283ed5 to 1e05755 Compare January 15, 2022 04:06
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 15, 2022
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 26, 2022
@ti-chi-bot ti-chi-bot added needs-cherry-pick-release-5.4 Should cherry pick this PR to release-5.4 branch. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Feb 10, 2022
@ti-chi-bot
Copy link
Member

@3AceShowHand: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 23, 2022
@Rustin170506
Copy link
Member

It looks like we don't care if the close is successful, so can we just add a close timeout? That way the impact on the owner and processor is minimal?

@3AceShowHand
Copy link
Contributor Author

It looks like we don't care if the close is successful, so can we just add a close timeout? That way the impact on the owner and processor is minimal?

Such a good idea, I will do this in another new PR. #5186

@3AceShowHand
Copy link
Contributor Author

closed by #5186

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-cherry-pick-release-5.4 Should cherry pick this PR to release-5.4 branch. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

mysql changefeed blocked by abnormal kafka changefeed
7 participants