Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: adjust locking in signing_shares avoid needing one main RecursiveMutex and instead using smaller regular Mutexes #6410

Conversation

PastaPastaPasta
Copy link
Member

What was done?

Refactor the locking / mutex usage in signing_shares to focus on smaller scopes / avoid recursive mutex

How Has This Been Tested?

Ran functional tests lots of times with tsan / enable-debug

Breaking Changes

None

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

@PastaPastaPasta PastaPastaPasta added this to the 22.1 milestone Nov 19, 2024
Copy link

@UdjinM6 UdjinM6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

linter complains

@PastaPastaPasta PastaPastaPasta changed the title refactor: adjust locking in signing_shares avoid needing one main Rec ursiveMutex and instead using smaller regular Mutexes refactor: adjust locking in signing_shares avoid needing one main RecursiveMutex and instead using smaller regular Mutexes Nov 19, 2024
…ursiveMutex and instead using smaller regular Mutexes
@PastaPastaPasta PastaPastaPasta force-pushed the refactor-mutexes-signing-shares branch from c8362b3 to 6da2f28 Compare November 19, 2024 23:08
@PastaPastaPasta
Copy link
Member Author

feature_llmq_rotation.py failed in tsan CI; gonna run that locally a bit.

std::thread workThread;
CThreadInterrupt workInterrupt;

SigShareMap<CSigShare> sigShares GUARDED_BY(cs);
std::unordered_map<uint256, CSignedSession, StaticSaltedHasher> signedSessions GUARDED_BY(cs);
Mutex cs_sigShares;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it became a bit over-complicated IMO, 7 mutexes in just one class. Does it indeed improve performance by any benchmark? Maybe you can localize some specific data that affect performance and put only that one by its own mutex?

It looks like DEADLOCKS ARE WELCOME every time you change something a bit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bitcoin has tried to move away from using recursive mutexes as much as possible; and generally we've tried to follow. I think if you tried to do this with 1 non-recursive mutex, you'd have a very bad time. Instead we introduce mutex for the actual data structures they need to guard, and use annotations to give us decent assurance there are no issues

Copy link
Collaborator

@knst knst Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in theory it sounds good, but how to proof validness of each piece of data, if there no common lock? Is there any dependency between members? Is it true that each piece of member can be changed without side effect to other calculations in neighbors function?

Maybe it will work, but that's a nightmare to proof that there is jo hidden dependencies between members and review that we can't get invalid state now.

Yes, bitcoin is moving away from RecursiveMutex, but it's not like "give own mutex to each variable in the header", at least most times

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe collect logs Enter: lock contention.... started / completed at least from some node to narrow down scope?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree... Please see 6417 as a much more minimal alternative

@UdjinM6 UdjinM6 removed this from the 22.1 milestone Nov 20, 2024
PastaPastaPasta added a commit that referenced this pull request Nov 21, 2024
397a157 refactor: introduce cs_pendingSigns (pasta)

Pull request description:

  ## Issue being fixed or feature implemented
  Much more minimal version of #6410; data from testnet while MNs were doing lots of instantsend locking showed minimal contention over signing_shares cs, however, the contention that did exist, most was a result of AsyncSign conflicting with something else. This should resolve the vast majority of contention in signing_shares while minimizing the complication / risk of introducing dead locks compared to 6410

  ## What was done?
  introduce cs_pendingSigns

  ## How Has This Been Tested?
  Built locally

  ## Breaking Changes
  None

  ## Checklist:
    _Go over all the following points, and put an `x` in all the boxes that apply._
  - [x] I have performed a self-review of my own code
  - [ ] I have commented my code, particularly in hard-to-understand areas
  - [ ] I have added or updated relevant unit/integration/functional/e2e tests
  - [ ] I have made corresponding changes to the documentation
  - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  knst:
    utACK 397a157
  UdjinM6:
    utACK 397a157

Tree-SHA512: aac741c274449cf9c22625ac95e964d275dd1454647eaa2fc064c90b44a215e5509c9f9b8aceccbd49002edc21cbea883900f202204aa2138c104f4989ae5e26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants