komodod with mining on sometimes appears to hang #554

Closed
dimxy opened this issue Jul 18, 2022 · 2 comments

@dimxy
Collaborator

dimxy commented Jul 18, 2022

komodod sometimes hangs and its RPCs become unresponsive. The process does not crash; it stays in memory and the only option is to kill it.
Backtrace analysis with gdb suggests a deadlock:
threads 23 and 24 are each waiting on a different critical section (cs), and each appears to already hold the one the other is waiting for:

Thread 23 (processing incoming messages) is waiting on pqueue->ControlMutex in CCheckQueueControl::CCheckQueueControl(), and it should already hold cs_main, taken in ActivateBestChain():

Thread 23 (Thread 0x7fcc36ffd700 (LWP 7827)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fcced9e7025 in __GI___pthread_mutex_lock (
    mutex=mutex@entry=0x561e336a61c8 <scriptcheckqueue+264>)
    at ../nptl/pthread_mutex_lock.c:80
#2  0x0000561e3254c186 in boost::posix::pthread_mutex_lock (
    m=0x561e336a61c8 <scriptcheckqueue+264>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/thread/pthread/pthread_helpers.hpp:79
#3  boost::mutex::lock (this=0x561e336a61c8 <scriptcheckqueue+264>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/thread/pthread/mutex.hpp:61
#4  CCheckQueueControl<CScriptCheck>::CCheckQueueControl (
    pqueueIn=0x561e336a60c0 <scriptcheckqueue>, this=0x7fcc36ffa980) at checkqueue.h:203
#5  ConnectBlock (block=..., state=..., pindex=<optimized out>, 
    pindex@entry=0x7fcc2417f210, view=..., fJustCheck=fJustCheck@entry=false, 
    fCheckPOW=fCheckPOW@entry=true) at main.cpp:3543
#6  0x0000561e32554fd3 in ConnectTip (pblock=0x7fcc36ffbda0, pindexNew=0x7fcc2417f210, 
    state=...) at main.cpp:4273
#7  ActivateBestChainStep (fSkipdpow=fSkipdpow@entry=false, state=..., 
    pindexMostWork=pindexMostWork@entry=0x7fcc2417f210, pblock=<optimized out>)
    at main.cpp:4551
#8  0x0000561e32556d99 in ActivateBestChain (fSkipdpow=fSkipdpow@entry=false, state=..., 
    pblock=pblock@entry=0x7fcc36ffbda0) at main.cpp:4612
#9  0x0000561e32557608 in ProcessNewBlock (from_miner=from_miner@entry=false, 
    height=height@entry=0, state=..., pfrom=pfrom@entry=0x7fcc2400d8a0, 
    pblock=pblock@entry=0x7fcc36ffbda0, fForceProcessing=<optimized out>, dbp=0x0)
    at main.cpp:5818
#10 0x0000561e3255d5cc in ProcessMessage (pfrom=pfrom@entry=0x7fcc2400d8a0, 
    strCommand="block", vRecv=..., nTimeReceived=nTimeReceived@entry=1657804006280505)
    at main.cpp:8001
#11 0x0000561e3255f472 in ProcessMessages (pfrom=0x7fcc2400d8a0) at main.cpp:8248
#12 0x0000561e32561b86 in boost::detail::function::function_invoker1<bool (*)(CNode*), bool, CNode*>::invoke (function_ptr=..., a0=<optimized out>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/function/function_template.hpp:100
#13 0x0000561e325d55bf in boost::function1<bool, CNode*>::operator() (a0=<optimized out>, 
    this=<optimized out>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/function/function_template.hpp:764
#14 boost::signals2::detail::call_with_tuple_args<bool>::m_invoke<boost::function<bool (CNode*)>, 0u, CNode*&>(boost::function<bool (CNode*)>&, boost::signals2::detail::unsigned_meta_array<0u>, std::tuple<CNode*&> const&, boost::disable_if<boost::is_void<boost::function<bool (CNode*)>::result_type>, void>::type*) const (args=std::tuple containing = {...}, 
    func=..., this=<optimized out>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/signals2/detail/variadic_slot_invoker.hpp:98
#15 boost::signals2::detail::call_with_tuple_args<bool>::operator()<boost::function<bool (CNode*)>, CNode*&, 1ul>(boost::function<bool (CNode*)>&, std::tuple<CNode*&> const&, mpl_::size_t<1ul>) const (args=std::tuple containing = {...}, func=..., this=<optimized out>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/signals2/detail/variadic_slot_invoker.hpp:90
#16 boost::signals2::detail::variadic_slot_invoker<bool, CNode*>::operator()<boost::shared_ptr<boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >(boost::shared_ptr<boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > const&) const (connectionBody=..., this=0x7fcc36ffcb30)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/signals2/detail/variadic_slot_invoker.hpp:134
#17 boost::signals2::detail::slot_call_iterator_t<boost::signals2::detail::variadic_slot_invoker<bool, CNode*>, std::_List_iterator<boost::shared_ptr<boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >, boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> >::dereference() const (this=0x7fcc36ffc6f0)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/signals2/detail/slot_call_iterator.hpp:110
#18 boost::iterators::iterator_core_access::dereference<boost::signals2::detail::slot_call_iterator_t<boost::signals2::detail::variadic_slot_invoker<bool, CNode*>, std::_List_iterator<boost::shared_ptr<boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >, boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >(boost::signals2::detail::slot_call_iterator_t<boost::signals2::detail::variadic_slot_invoker<bool, CNode*>, std::_List_iterator<boost::shared_ptr<boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >, boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > const&) (f=...)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/iterator/iterator_facade.hpp:550
#19 boost::iterators::detail::iterator_facade_base<boost::signals2::detail::slot_call_iterator_t<boost::signals2::detail::variadic_slot_invoker<bool, CNode*>, std::_List_iterator<boost::shared_ptr<boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >, boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> >, bool, boost::iterators::single_pass_traversal_tag, bool const&, long, false, false>::operator*() const (
    this=0x7fcc36ffc6f0)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/iterator/iterator_facade.hpp:656
#20 CombinerAll::operator()<boost::signals2::detail::slot_call_iterator_t<boost::signals2::detail::variadic_slot_invoker<bool, CNode*>, std::_List_iterator<boost::shared_ptr<boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >, boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >(boost::signals2::detail::slot_call_iterator_t<boost::signals2::detail::variadic_slot_invoker<bool, CNode*>, std::_List_iterator<boost::shared_ptr<boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >, boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> >, boost::signals2::detail::slot_call_iterator_t<boost::signals2::detail::variadic_slot_invoker<bool, CNode*>, std::_List_iterator<boost::shared_ptr<boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >, boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> >) const (last=..., first=..., this=<optimized out>) at net.h:111
#21 boost::signals2::detail::combiner_invoker<bool>::operator()<CombinerAll, boost::signals2::detail::slot_call_iterator_t<boost::signals2::detail::variadic_slot_invoker<bool, CNode*>, std::_List_iterator<boost::shared_ptr<boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >, boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >(CombinerAll&, boost::signals2::detail::slot_call_iterator_t<boost::signals2::detail::variadic_slot_invoker<bool, CNode*>, std::_List_iterator<boost::shared_ptr<boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >, boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> >, boost::signals2::detail::slot_call_iterator_t<boost::signals2::detail::variadic_slot_invoker<bool, CNode*>, std::_List_iterator<boost::shared_ptr<boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> > >, boost::signals2::detail::connection_body<std::pair<boost::signals2::detail::slot_meta_group, boost::optional<int> >, boost::signals2::slot<bool (CNode*), boost::function<bool (CNode*)> >, boost::signals2::mutex> >) const (
    last=..., first=..., combiner=..., this=<optimized out>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/signals2/detail/result_type_wrapper.hpp:53
#22 boost::signals2::detail::signal_impl<bool (CNode*), CombinerAll, int, std::less<int>, boost::function<bool (CNode*)>, boost::function<bool (boost::signals2::connection const&, CNode*)>, boost::signals2::mutex>::operator()(CNode*) (args#0=<optimized out>, 
    this=0x561e35623a60)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/signals2/detail/signal_template.hpp:247
#23 boost::signals2::signal<bool (CNode*), CombinerAll, int, std::less<int>, boost::function<bool (CNode*)>, boost::function<bool (boost::signals2::connection const&, CNode*)>, boost::signals2::mutex>::operator()(CNode*) (args#0=<optimized out>, 
    this=0x561e336f8118 <g_signals+24>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/signals2/detail/signal_template.hpp:722
#24 ThreadMessageHandler () at net.cpp:1604
#25 0x0000561e324f0f45 in TraceThread<void (*)()> (name=<optimized out>, 
    func=0x561e325d4f0b <ThreadMessageHandler()>) at ./util.h:272
#26 0x0000561e324eae99 in boost::_bi::list2<boost::_bi::value<char const*>, boost::_bi::value<void (*)()> >::operator()<void (*)(char const*, void (*)()), boost::_bi::list0> (
    a=<synthetic pointer>..., f=<optimized out>, this=<optimized out>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/bind/bind.hpp:319
#27 boost::_bi::bind_t<void, void (*)(char const*, void (*)()), boost::_bi::list2<boost::_bi::value<char const*>, boost::_bi::value<void (*)()> > >::operator() (this=<optimized out>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/bind/bind.hpp:1294
#28 boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(char const*, void (*)()), boost::_bi::list2<boost::_bi::value<char const*>, boost::_bi::value<void (*)()> > > >::run (
    this=<optimized out>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/thread/detail/thread.hpp:120
#29 0x0000561e32baa43c in thread_proxy ()
#30 0x00007fcced9e46db in start_thread (arg=0x7fcc36ffd700) at pthread_create.c:463
#31 0x00007fccecd2461f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 24 (mining thread) is waiting on cs_main, and it should already hold pqueue->ControlMutex, taken in ConnectBlock():

Thread 24 (Thread 0x7fcc367fc700 (LWP 7828)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fcced9e70f4 in __GI___pthread_mutex_lock (
    mutex=mutex@entry=0x561e336a5a20 <cs_main>) at ../nptl/pthread_mutex_lock.c:115
#2  0x0000561e324f58b9 in boost::posix::pthread_mutex_lock (m=0x561e336a5a20 <cs_main>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/thread/pthread/pthread_helpers.hpp:79
#3  boost::recursive_mutex::lock (this=0x561e336a5a20 <cs_main>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/thread/pthread/recursive_mutex.hpp:108
#4  AnnotatedMixin<boost::recursive_mutex>::lock (this=0x561e336a5a20 <cs_main>)
    at sync.h:76
#5  boost::unique_lock<AnnotatedMixin<boost::recursive_mutex> >::lock (this=0x7fcc367f6740)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/thread/lock_types.hpp:346
#6  CMutexLock<AnnotatedMixin<boost::recursive_mutex> >::Enter (nLine=<optimized out>, 
    pszFile=<optimized out>, pszName=<optimized out>, this=0x7fcc367f6740) at sync.h:132
#7  CMutexLock<AnnotatedMixin<boost::recursive_mutex> >::CMutexLock (this=0x7fcc367f6740, 
    mutexIn=..., pszName=<optimized out>, pszFile=<optimized out>, nLine=<optimized out>, 
    fTry=<optimized out>) at sync.h:153
#8  0x0000561e3251c94e in GetSpendHeight (inputs=...) at main.cpp:2764
#9  0x0000561e32526afd in ContextualCheckInputs (tx=..., state=..., inputs=..., 
    fScriptChecks=<optimized out>, flags=flags@entry=513, 
    cacheStore=cacheStore@entry=false, txdata=..., consensusParams=..., 
    consensusBranchId=1991772603, pvChecks=0x7fcc367f6b70) at main.cpp:2884
#10 0x0000561e3254d440 in ConnectBlock (block=..., state=..., pindex=<optimized out>, 
    pindex@entry=0x7fcc367f71d0, view=..., fJustCheck=fJustCheck@entry=true, 
    fCheckPOW=fCheckPOW@entry=true) at main.cpp:3676
#11 0x0000561e32560be9 in TestBlockValidity (state=..., block=..., 
    pindexPrev=0x7fcc248a1e30, fCheckPOW=fCheckPOW@entry=true, 
    fCheckMerkleRoot=fCheckMerkleRoot@entry=false) at main.cpp:5853
#12 0x0000561e325b58cb in <lambda(std::vector<unsigned char, std::allocator<unsigned char> >)>::operator()(std::vector<unsigned char, std::allocator<unsigned char> >) const (
    __closure=0x7fcc1875c480, soln=std::vector of length 1344, capacity 1344 = {...})
    at miner.cpp:1564
#13 0x0000561e325b6829 in std::_Function_handler<bool(std::vector<unsigned char, std::allocator<unsigned char> >), BitcoinMiner(CWallet*)::<lambda(std::vector<unsigned char, std::allocator<unsigned char> >)> >::_M_invoke(const std::_Any_data &, std::vector<unsigned char, std::allocator<unsigned char> > &&) (__functor=..., __args#0=...)
    at /usr/include/c++/7/bits/std_function.h:302
#14 0x0000561e325c2fee in std::function<bool (std::vector<unsigned char, std::allocator<unsigned char> >)>::operator()(std::vector<unsigned char, std::allocator<unsigned char> >) const (this=this@entry=0x7fcc367f8240, __args#0=std::vector of length 0, capacity 0)
    at /usr/include/c++/7/bits/std_function.h:706
#15 0x0000561e325befd3 in BitcoinMiner (pwallet=<optimized out>) at miner.cpp:1628
#16 0x0000561e325bfd0c in boost::_bi::list1<boost::_bi::value<CWallet*> >::operator()<void (*)(CWallet*), boost::_bi::list0> (a=<synthetic pointer>..., f=<optimized out>, 
    this=<optimized out>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/bind/bind.hpp:259
#17 boost::_bi::bind_t<void, void (*)(CWallet*), boost::_bi::list1<boost::_bi::value<CWallet*> > >::operator() (this=<optimized out>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/bind/bind.hpp:1294
#18 boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(CWallet*), boost::_bi::list1<boost::_bi::value<CWallet*> > > >::run (this=<optimized out>)
    at /home/ubuntu/komodo/depends/x86_64-unknown-linux-gnu/share/../include/boost/thread/detail/thread.hpp:120
#19 0x0000561e32baa43c in thread_proxy ()
#20 0x00007fcced9e46db in start_thread (arg=0x7fcc367fc700) at pthread_create.c:463
#21 0x00007fccecd2461f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The root cause appears to be that the two threads enter ConnectBlock() without synchronisation and so deadlock each other. A lock on the cs_main critical section seems to be missing before the TestBlockValidity() call in the ValidBlock lambda in miner.cpp; taking it there would provide the needed synchronisation.
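
To make the inversion explicit, here is a minimal sketch that replays the lock sequences inferred from the two backtraces and checks whether any pair of locks is taken in opposite orders. LockOrderLog, ReportedOrder and ProposedOrder are hypothetical names invented for this illustration; nothing here is komodod code. Re-acquiring a lock the same thread already holds is ignored, on the assumption that cs_main is recursive (the thread 24 backtrace shows boost::recursive_mutex).

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of komodod: records the order in which each
// thread takes named locks and reports pairs taken in opposite orders.
class LockOrderLog {
public:
    // Record that `thread` acquired `lock` while still holding its earlier locks.
    void Acquire(const std::string& thread, const std::string& lock) {
        std::vector<std::string>& held = held_[thread];
        // Recursive re-entry cannot block, so it adds no ordering constraint.
        if (std::find(held.begin(), held.end(), lock) != held.end())
            return;
        for (const std::string& earlier : held)
            edges_.emplace(earlier, lock);  // `earlier` was taken before `lock`
        held.push_back(lock);
    }

    // True if some pair of locks was taken as a-then-b and also as b-then-a.
    bool HasInversion() const {
        for (const auto& e : edges_)
            if (edges_.count(std::make_pair(e.second, e.first)) > 0)
                return true;
        return false;
    }

private:
    std::map<std::string, std::vector<std::string>> held_;
    std::set<std::pair<std::string, std::string>> edges_;
};

// Lock sequences as the issue infers them from the two backtraces.
inline LockOrderLog ReportedOrder() {
    LockOrderLog log;
    log.Acquire("thread23", "cs_main");       // ActivateBestChain()
    log.Acquire("thread23", "ControlMutex");  // CCheckQueueControl ctor
    log.Acquire("thread24", "ControlMutex");  // ConnectBlock() via TestBlockValidity()
    log.Acquire("thread24", "cs_main");       // GetSpendHeight()
    return log;
}

// Same sequences with the proposed lock on cs_main taken first by the miner.
inline LockOrderLog ProposedOrder() {
    LockOrderLog log;
    log.Acquire("thread23", "cs_main");
    log.Acquire("thread23", "ControlMutex");
    log.Acquire("thread24", "cs_main");       // proposed outer lock in miner.cpp
    log.Acquire("thread24", "ControlMutex");
    log.Acquire("thread24", "cs_main");       // recursive re-entry, ignored
    return log;
}
```

Calling ReportedOrder().HasInversion() flags the cs_main / ControlMutex pair, while ProposedOrder().HasInversion() does not, because both threads then take cs_main before ControlMutex.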

There is a related issue: #483

@dimxy
Collaborator Author

dimxy commented Jan 10, 2023

More explanation on the issue:
The deadlock occurs because two threads, one calling ProcessMessages() and the other a miner thread calling TestBlockValidity(), acquire cs_main and pqueue->ControlMutex in opposite orders.
To prevent this, LOCK(cs_main) should be taken before the TestBlockValidity() call in miner.cpp, which is currently not done.
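
The effect of taking the outer lock first can be sketched with standard-library mutexes. This is a simplified model under stated assumptions, not the actual komodod change: cs_main_model, control_mutex_model and every function name below are invented for the illustration, std::recursive_mutex stands in for the boost::recursive_mutex behind cs_main, and std::mutex stands in for pqueue->ControlMutex.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::recursive_mutex cs_main_model;  // stands in for cs_main
std::mutex control_mutex_model;      // stands in for pqueue->ControlMutex
int blocks_checked = 0;              // guarded by control_mutex_model

// Models ConnectBlock(): takes the control mutex, then re-enters cs_main
// the way GetSpendHeight() does in the thread 24 backtrace.
void ConnectBlockModel() {
    std::lock_guard<std::mutex> control(control_mutex_model);
    std::lock_guard<std::recursive_mutex> inner(cs_main_model);
    ++blocks_checked;
}

// Models the message-handler path: cs_main first, then ConnectBlock().
void MessageHandlerModel(int rounds) {
    for (int i = 0; i < rounds; ++i) {
        std::lock_guard<std::recursive_mutex> outer(cs_main_model);
        ConnectBlockModel();
    }
}

// Models the miner path with the proposed fix: the outer lock below plays
// the role of LOCK(cs_main) before TestBlockValidity(). Without it, this
// thread would take the control mutex first and could deadlock the handler.
void MinerModel(int rounds) {
    for (int i = 0; i < rounds; ++i) {
        std::lock_guard<std::recursive_mutex> outer(cs_main_model);
        ConnectBlockModel();
    }
}

// Runs both paths concurrently and returns the number of completed checks.
int RunBoth(int rounds) {
    blocks_checked = 0;
    std::thread handler(MessageHandlerModel, rounds);
    std::thread miner(MinerModel, rounds);
    handler.join();
    miner.join();
    return blocks_checked;
}
```

Because both paths now acquire the recursive lock before the control mutex, neither thread can hold one lock while waiting for the other, so RunBoth(rounds) returns with every round completed on both threads.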

@dimxy
Collaborator Author

dimxy commented Mar 29, 2023

Fixed in #559.

dimxy closed this as completed Mar 29, 2023