Fix mqbblp::RecoveryManager: Clear sync peer promptly #589

Open
wants to merge 4 commits into
base: main

Conversation

kaikulimu
Collaborator

@kaikulimu kaikulimu commented Jan 28, 2025

See internal ticket 178057288

Clearing the sync peer promptly prevents this failure condition from triggering when the sync peer node goes down after the sync with that peer has already completed. Otherwise, we would call onPartitionPrimarySyncStatus(partitionId, -1 /* status */); and later fail this assert, because the primary is already active.
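
As a rough illustration of the failure mode described above, here is a minimal sketch. It is not the actual mqbblp code: `SyncContext`, `peerDownReportsFailure`, and the `clearPeerPromptly` flag are hypothetical names made up for this example, and the real RecoveryManager logic may differ.

```cpp
#include <string>

// Hypothetical stand-in for per-partition primary sync state.  This is NOT
// the real RecoveryManager_PrimarySyncContext; it only models the two pieces
// of state discussed in this PR.
struct SyncContext {
    bool        inProgress = false;
    std::string syncPeer;  // empty means "no sync peer"

    // Begin a primary sync with the specified 'peer'.
    void start(const std::string& peer)
    {
        inProgress = true;
        syncPeer   = peer;
    }

    // Mark the sync as complete.  If 'clearPeerPromptly' is true, also
    // forget the peer right away (the behavior this PR aims for).
    void complete(bool clearPeerPromptly)
    {
        inProgress = false;
        if (clearPeerPromptly) {
            syncPeer.clear();
        }
    }
};

// Return true if a "node down" event for 'downNode' would be treated as a
// primary sync failure, i.e. would lead to a failure status being reported.
bool peerDownReportsFailure(const SyncContext& ctx,
                            const std::string& downNode)
{
    return !ctx.syncPeer.empty() && ctx.syncPeer == downNode;
}
```

In this sketch, a context that completes without clearing the peer still reports a failure when that peer later goes down, even though the sync finished successfully. Clearing the peer on completion makes the later "node down" event a no-op for the sync logic.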

@kaikulimu kaikulimu requested a review from a team as a code owner January 28, 2025 20:23
@kaikulimu kaikulimu requested a review from chrisbeard January 28, 2025 20:23
@kaikulimu
Collaborator Author

The IncoreCSL unit test failure can be ignored. It will be fixed in another branch.

@kaikulimu kaikulimu requested review from dorjesinpo and removed request for chrisbeard January 30, 2025 16:08
@kaikulimu kaikulimu assigned dorjesinpo and unassigned chrisbeard Jan 30, 2025
@kaikulimu kaikulimu requested review from chrisbeard and removed request for dorjesinpo February 4, 2025 16:32
@kaikulimu kaikulimu assigned chrisbeard and unassigned dorjesinpo Feb 4, 2025
@kaikulimu kaikulimu requested a review from dorjesinpo February 4, 2025 16:33
@kaikulimu kaikulimu assigned dorjesinpo and unassigned chrisbeard Feb 4, 2025
Collaborator

@dorjesinpo dorjesinpo left a comment

This logs source twice (instead of primarySyncPeer): https://bbgithub.dev.bloomberg.com/BMQ/blazingmq-mirror/blob/85fa14b429195474314a4fc1675b5bc923099a6e/src/groups/mqb/mqbblp/mqbblp_storagemanager.cpp#L813

Can we document/comment somewhere how RecoveryManager_PrimarySyncContext::primarySyncInProgress() and RecoveryManager_PrimarySyncContext::syncPeer() relate? What are the scenarios, if any, in which one is 0 and the other is not?

@dorjesinpo dorjesinpo assigned kaikulimu and unassigned dorjesinpo Feb 5, 2025
@kaikulimu
Collaborator Author

@dorjesinpo Added some explanation. Hope that clears things up.

@kaikulimu kaikulimu changed the title Fix mqbblp::RecoveryManager: Clear sync peer promptly Fix mqbblp::RecoveryManager: Clear sync peer promptly [178057288] Feb 5, 2025
@kaikulimu kaikulimu changed the title Fix mqbblp::RecoveryManager: Clear sync peer promptly [178057288] Fix mqbblp::RecoveryManager: Clear sync peer promptly Feb 5, 2025
3 participants