lnd takes ages to start up #1648
Comments
I just wrote a simple reconnecting script using the list of inactive channels and the ConnectPeer gRPC call, as I think this may be relevant to this issue, especially if lnd doesn't do its connection logic in parallel (?). A reasonable timeout for both the RPC call and the startup reconnection would be useful. |
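For illustration, a minimal sketch of such a reconnect loop against lnd's gRPC interface using the Go lnrpc package; TLS cert and macaroon setup is omitted, and the 10-second per-call timeout is an arbitrary choice, not necessarily what the original script used:

```go
package reconnect

import (
	"context"
	"log"
	"time"

	"github.com/lightningnetwork/lnd/lnrpc"
)

// ReconnectInactive asks lnd for all channels whose peer is currently
// disconnected and tries to reconnect to each of them. The caller supplies
// an already-authenticated lnrpc.LightningClient (TLS cert and macaroon
// setup is omitted here).
func ReconnectInactive(ctx context.Context, client lnrpc.LightningClient) error {
	chans, err := client.ListChannels(ctx, &lnrpc.ListChannelsRequest{
		InactiveOnly: true,
	})
	if err != nil {
		return err
	}

	for _, ch := range chans.Channels {
		// Look up an address to dial for this peer from the graph.
		info, err := client.GetNodeInfo(ctx, &lnrpc.NodeInfoRequest{
			PubKey: ch.RemotePubkey,
		})
		if err != nil || info.Node == nil || len(info.Node.Addresses) == 0 {
			continue
		}

		// Bound each ConnectPeer call so a single unreachable peer
		// can't stall the whole loop (10s is an arbitrary choice).
		callCtx, cancel := context.WithTimeout(ctx, 10*time.Second)
		_, err = client.ConnectPeer(callCtx, &lnrpc.ConnectPeerRequest{
			Addr: &lnrpc.LightningAddress{
				Pubkey: ch.RemotePubkey,
				Host:   info.Node.Addresses[0].Addr,
			},
		})
		cancel()
		if err != nil {
			log.Printf("reconnect to %s failed: %v", ch.RemotePubkey, err)
		}
	}
	return nil
}
```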
Okay, I've found |
There is some problem with reconnecting to peers. Dunno why, but it reconnects to some and never to others. This peer is a good example: 02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b I lose the connection to it at least daily and it never reconnects. |
We'll try to connect to them, but then exponentially back off if a connection fails. There's a flaw right now though, where we won't reset the backoff once the peer is known to be "stable". |
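Not lnd's actual code, just a sketch of the backoff-with-reset idea described above; the names and durations are made up for illustration:

```go
package peerbackoff

import "time"

const (
	initialBackoff = 5 * time.Second
	maxBackoff     = 1 * time.Hour
	// If a connection survives this long, the peer is considered
	// "stable" and its backoff should start over.
	stableThreshold = 10 * time.Minute
)

// nextBackoff doubles the previous backoff after a failed connection
// attempt, capping it at maxBackoff.
func nextBackoff(prev time.Duration) time.Duration {
	if prev == 0 {
		return initialBackoff
	}
	next := prev * 2
	if next > maxBackoff {
		next = maxBackoff
	}
	return next
}

// backoffAfterDisconnect is the missing piece the comment refers to: if the
// peer stayed connected past stableThreshold, start again from the initial
// backoff instead of continuing from the previously accumulated one.
func backoffAfterDisconnect(prev, connectedFor time.Duration) time.Duration {
	if connectedFor >= stableThreshold {
		return initialBackoff
	}
	return nextBackoff(prev)
}
```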
I'm running on latest master (commit=ad7f87ef18af2e26ec80b9d6d5be7086515fea8a) and the problem supposedly fixed by #1628 hasn't improved. 10+ minutes of uptime and still 0 peers. |
Sounds like an issue with peer bootstrapping, which is not altered by that PR. |
Oh yeah peer bootstrapping is #1658. |
What do you see in the logs immediately after start up? |
If you check out #1658, is the start up snappier? Basically right now we'll block the server on finishing the handshake with each peer, this instead makes all that async so we can continue to seek out and accept connections from other peers. |
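Roughly the difference being described, in a simplified form that is not the actual lnd code: the old path completes each peer's handshake before moving on, while the async path spawns a goroutine per peer so one slow or unreachable peer no longer delays the rest:

```go
package startup

import "sync"

// Peer stands in for a persistent peer the server wants to reconnect to.
type Peer struct{ Addr string }

// connectAndHandshake dials the peer and runs the full handshake; for an
// unreachable peer this can block until the dial times out.
func connectAndHandshake(p Peer) error {
	// ... dial, handshake, init message exchange ...
	return nil
}

// startSerial is the old behavior: each slow or unreachable peer delays
// every peer after it, so startup time grows with the number of peers.
func startSerial(peers []Peer) {
	for _, p := range peers {
		_ = connectAndHandshake(p)
	}
}

// startAsync is the behavior after the change: all handshakes run
// concurrently, so the server can keep seeking out and accepting other
// connections while slow peers time out in the background.
func startAsync(peers []Peer) {
	var wg sync.WaitGroup
	for _, p := range peers {
		wg.Add(1)
		go func(p Peer) {
			defer wg.Done()
			_ = connectAndHandshake(p)
		}(p)
	}
	wg.Wait()
}
```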
Now that #1658 is merged, this should be resolved. Let us know if you're having any further issues in this area, and we'll re-open the issue! Thanks. |
I don't think this issue has been fully resolved. I was unable to try the new version until today:
Now there are hundreds of lines like the following (see the timestamps), ending with the RPC thread starting. So it is already busy for a full minute, even before touching the P2P network:
Then there's another flood of messages; at this point we're almost 4 minutes into uptime:
Listening on port 9735 only starts about 10 minutes after startup, and we immediately start receiving inbound connections.
So I expect the issue is not related to P2P timeouts, because it happens even before the node tries to talk to any peers. |
Possibly related to chain rescans at startup, which we are working on improving. Will keep the issue open to track this! |
All the chain rescans are async now, so this would probably point to something else slowing down startup. |
It just seems like txindex is off. Performance improvements for nodes without it are incoming. |
Should be resolved by the upcoming revival of the height hint cache! |
Hi @slush0, we just merged the height hint cache. Can you give the current master a spin now? The first restart will still take some time; the second one, however, should be much faster, as the rescan information will now be cached. |
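Conceptually, the height hint cache just remembers how far each confirmation/spend rescan already got, so a later restart can resume from that height instead of rescanning from the channel's original start height. A rough illustration of the idea, not lnd's actual implementation:

```go
package heighthint

import "sync"

// Cache maps a watched outpoint (or script) identifier to the last block
// height that was already scanned for it without finding anything.
type Cache struct {
	mu    sync.Mutex
	hints map[string]uint32
}

func NewCache() *Cache {
	return &Cache{hints: make(map[string]uint32)}
}

// Commit records that everything up to height has been scanned for id.
func (c *Cache) Commit(id string, height uint32) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if height > c.hints[id] {
		c.hints[id] = height
	}
}

// StartHeight returns the height a rescan for id can safely resume from,
// falling back to the given default (e.g. the channel's funding height) on
// the very first run, which is why the first restart is still slow.
func (c *Cache) StartHeight(id string, fallback uint32) uint32 {
	c.mu.Lock()
	defer c.mu.Unlock()
	if h, ok := c.hints[id]; ok && h > fallback {
		return h
	}
	return fallback
}
```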
Also we recently merged #2211, which should speed up the initial rescan if a channel was closed while your node was offline. |
Background
lnd needs around 10-15 minutes after startup to actually connect to peers. There's no unexpected activity in the meantime (no network traffic or high CPU load). It does eventually start talking to peers, but it looks like it is stuck on some timeouts first (maybe waiting, serially, for timeouts to unreachable peers?).
After this time, the node comes up normally with 300+ peers.
I started to notice this behavior after the node began maintaining a large number of peers, but that might be a coincidence.
Your environment
Linux server01 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 (2018-05-07) x86_64 GNU/Linux
Steps to reproduce
Any restart of lnd.
Expected behaviour
After a restart, lnd should open connections to peers and start communicating.
Actual behaviour
The log fills with lines like this, but no other activity can be seen.
getinfo: