fake(engine): add cached checkpoint in job master #6599

Merged
merged 10 commits into from
Aug 4, 2022

Conversation

amyangfei
Contributor

What problem does this PR solve?

Issue Number: close #6598

What is changed and how it works?

Cache the checkpoint in the job master. If some workers are offline, keep their previously cached values.
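The idea can be illustrated with a minimal Go sketch. It is not the code from this PR: the names `WorkerID`, `Checkpoint`, `CachedCheckpoint`, `Update`, and `Snapshot` are hypothetical and chosen only for illustration, and the sketch assumes a checkpoint is a single integer tick and that the cache is accessed from one goroutine (no locking).

```go
package main

import (
	"fmt"
	"sort"
)

// WorkerID and Checkpoint are hypothetical types for this sketch,
// not the types used in the actual job master.
type WorkerID string

type Checkpoint struct {
	Tick int64
}

// CachedCheckpoint keeps the last checkpoint reported by each worker.
type CachedCheckpoint struct {
	byWorker map[WorkerID]Checkpoint
}

func NewCachedCheckpoint() *CachedCheckpoint {
	return &CachedCheckpoint{byWorker: make(map[WorkerID]Checkpoint)}
}

// Update merges checkpoints reported by online workers into the cache.
// Workers missing from reported (for example, offline ones) keep their old values.
func (c *CachedCheckpoint) Update(reported map[WorkerID]Checkpoint) {
	for id, cp := range reported {
		c.byWorker[id] = cp
	}
}

// Snapshot returns a copy of the cache so callers cannot mutate it.
func (c *CachedCheckpoint) Snapshot() map[WorkerID]Checkpoint {
	out := make(map[WorkerID]Checkpoint, len(c.byWorker))
	for id, cp := range c.byWorker {
		out[id] = cp
	}
	return out
}

func main() {
	cache := NewCachedCheckpoint()

	// Both workers are online and report a checkpoint.
	cache.Update(map[WorkerID]Checkpoint{
		"worker-1": {Tick: 10},
		"worker-2": {Tick: 7},
	})

	// worker-2 goes offline, so only worker-1 reports.
	cache.Update(map[WorkerID]Checkpoint{
		"worker-1": {Tick: 15},
	})

	snap := cache.Snapshot()
	ids := make([]WorkerID, 0, len(snap))
	for id := range snap {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	for _, id := range ids {
		fmt.Printf("%s tick=%d\n", id, snap[id].Tick)
	}
}
```

In this sketch, `worker-2` keeps tick 7 after going offline while `worker-1` advances to 15, so a checkpoint assembled from the cache still contains an entry for every worker seen so far.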

Check List

Tests

  • Unit test
  • Integration test

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

None

@amyangfei amyangfei added the area/engine Issues or PRs related to Dataflow Engine. label Aug 3, 2022
@ti-chi-bot
Member

ti-chi-bot commented Aug 3, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • CharlesCheung96
  • maxshuang

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/needs-triage-completed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 3, 2022
@amyangfei
Contributor Author

/run-check-issue-triage-complete

@amyangfei amyangfei added the status/ptal Could you please take a look? label Aug 3, 2022
@maxshuang
Contributor

@amyangfei Could you explain more about this unstable test? I don't really get the point.

@amyangfei
Contributor Author

amyangfei commented Aug 4, 2022

> @amyangfei Could you explain more about this unstable test? I don't really get the point.

The root cause was explained in the original issue. @maxshuang

@codecov-commenter

codecov-commenter commented Aug 4, 2022

Codecov Report

Merging #6599 (91e43db) into master (2ea67ba) will increase coverage by 0.0023%.
The diff coverage is 88.6792%.

❗ Current head 91e43db differs from pull request most recent head 8dc9f4f. Consider uploading reports for the commit 8dc9f4f to get more accurate results

Flag Coverage Δ
cdc 65.9174% <71.4285%> (+0.0493%) ⬆️
dm 51.8930% <ø> (-0.1487%) ⬇️
engine 62.7706% <92.3664%> (+0.2345%) ⬆️


@@               Coverage Diff                @@
##             master      #6599        +/-   ##
================================================
+ Coverage   59.2138%   59.2161%   +0.0023%     
================================================
  Files           780        780                
  Lines         88405      88312        -93     
================================================
- Hits          52348      52295        -53     
+ Misses        31415      31378        -37     
+ Partials       4642       4639         -3     

@ti-chi-bot ti-chi-bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 4, 2022
@maxshuang
Contributor

https://github.com/pingcap/tiflow/runs/7666668553?check_suite_focus=true
Another occurrence of this issue.
It seems this case became unstable after PR #6436.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Aug 4, 2022
@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Aug 4, 2022
@amyangfei
Contributor Author

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: b1e4733

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Aug 4, 2022
@maxshuang
Contributor

After an offline discussion with amyangfei and CharlesCheung96, we need to investigate whether the newly merged PR causes jobs to be dispatched to an offline executor.

@ti-chi-bot
Member

@amyangfei: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot merged commit 5bd84ae into pingcap:master Aug 4, 2022
Labels
area/engine Issues or PRs related to Dataflow Engine. release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. status/ptal Could you please take a look?
Projects
None yet
Development

Successfully merging this pull request may close these issues.

node failure test is not stable
5 participants