Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix lock order debugging and potential deadlocks #1612

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions src/main.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5267,13 +5267,24 @@ bool ProcessMessages(CNode* pfrom)
return fOk;
}

// Note: this function requires a lock on cs_main before calling. (See below comments.)
bool SendMessages(CNode* pto, bool fSendTrickle)
{
// Some comments and TODOs in order...
// 1. This function never returns anything but true... (try to find a return other than true).
// 2. The try lock inside this function causes a potential deadlock due to a lock order reversal in main.
// 3. The reason for the interior lock is vacated by 1. So the below is commented out, and moved to
// the ThreadMessageHandler2 in net.cpp.
// 4. We need to research why we never return false at all, and subordinately, why we never consume
// the value of this function.

/*
// Treat lock failures as send successes in case the caller disconnects
// the node based on the return value.
TRY_LOCK(cs_main, lockMain);
if(!lockMain)
return true;
*/

// Don't send anything until we get their version message
if (pto->nVersion == 0)
Expand Down
20 changes: 17 additions & 3 deletions src/net.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2072,9 +2072,23 @@ void ThreadMessageHandler2(void* parg)

// Send messages
{
TRY_LOCK(pnode->cs_vSend, lockSend);
if (lockSend)
SendMessages(pnode, pnode == pnodeTrickle);
// Having the outer cs_main TRY_LOCK here with reversed logic
// has the same effect as the original TRY_LOCK in Sendmessages,
// which was if !lockMain return true immediately. (Note
// that the return value of SendMessages was never consumed.
// and this is the only place in the code where SendMessages is
// called. This is to eliminate a potential deadlock condition
// due to logic order reversal.

TRY_LOCK(cs_main, lockMain);
if(lockMain)
{
TRY_LOCK(pnode->cs_vSend, lockSend);
if (lockSend)
{
SendMessages(pnode, pnode == pnodeTrickle);
}
}
}

if (fShutdown)
Expand Down
1 change: 0 additions & 1 deletion src/sync.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -93,7 +93,6 @@ static void potential_deadlock_detected(const std::pair<void*, void*>& mismatch,
}
LogPrintf(" %s", i.second.ToString());
}
assert(false);
}

static void push_lock(void* c, const CLockLocation& locklocation, bool fTry)
Expand Down