No outbound peer connections when restarting & Sluggish startup #1569

Closed
juscamarena opened this issue Jul 17, 2018 · 9 comments
Labels
backend Related to the node backend software/interface (e.g. btcd, bitcoin-core) bug Unintended code behaviour P2 should be fixed if one has time

Comments

@juscamarena
Contributor

Background

Describe your issue here.

Your environment

  • version of lnd 0.4.2-beta commit=6dff599d213be4182a70c8eab4cac8a36d3c15e4 but was also running into something similar on previous commits
  • which operating system (uname -a on *Nix)
    Ubuntu x86_64 GNU/Linux
  • version of btcd, bitcoind, or other backend
    btcd version 0.12.0-beta
  • any other relevant environment details

Steps to reproduce

Tell us how to reproduce this issue. Please provide stacktraces and links to code in question.

Expected behaviour

Tell us what should happen

Outbound connections

Actual behaviour

Tell us what happens instead

Here's what profiling shows:
https://paste.ubuntu.com/p/JxjrqWzrdw/
https://paste.ubuntu.com/p/QJ5k3rjGzn/

In previous commits LND would usually take 20ish minutes to start up, and eventually it would connect. This time it refuses to connect outbound, even though I left it running overnight.

After another restart I was able to connect to some peers, but it doesn't auto-connect to any others.

2018-07-17 08:27:55.520 is where I restart, and any connections made following that I made manually.
LND logs
https://paste.ubuntu.com/p/4MgWjJqRB4/
Another profiler view from this point http://paste.ubuntu.com/p/tNXp4PX7jC/

@Roasbeef Roasbeef added the P3 might get fixed, nice to have label Jul 17, 2018
@Roasbeef
Member

I've seen this a bit myself. I think the issue is that during a long-running rescan, btcd doesn't prioritize regular RPC requests, so certain sub-systems are unable to fully start up. The profiles you've provided should be pretty helpful though; we'll take a look at those and see what the next steps to solve this issue are.

@Roasbeef Roasbeef added bug Unintended code behaviour backend Related to the node backend software/interface (e.g. btcd, bitcoin-core) labels Jul 17, 2018
@Roasbeef
Member

Aside from the invoice stuff which #1578 should address, I think this is the most relevant section of the profiles:

1 @ 0x42e7aa 0x42e85e 0x4061a2 0x405e5b 0x833302 0x833583 0xb37539 0xb36da1 0xb358fc 0x45bd51
#	0x833301	github.com/lightningnetwork/lnd/vendor/github.com/btcsuite/btcd/rpcclient.receiveFuture+0x41				/home/ubuntu/gocode/src/github.com/lightningnetwork/lnd/vendor/github.com/btcsuite/btcd/rpcclient/infrastructure.go:797
#	0x833301	github.com/lightningnetwork/lnd/vendor/github.com/btcsuite/btcd/rpcclient.FutureGetBlockVerboseResult.Receive+0x41	/home/ubuntu/gocode/src/github.com/lightningnetwork/lnd/vendor/github.com/btcsuite/btcd/rpcclient/chain.go:119
#	0x833582	github.com/lightningnetwork/lnd/vendor/github.com/btcsuite/btcd/rpcclient.(*Client).GetBlockVerbose+0x42		/home/ubuntu/gocode/src/github.com/lightningnetwork/lnd/vendor/github.com/btcsuite/btcd/rpcclient/chain.go:154
#	0xb37538	github.com/lightningnetwork/lnd/chainntnfs/btcdnotify.(*BtcdNotifier).confDetailsManually+0x1a8				/home/ubuntu/gocode/src/github.com/lightningnetwork/lnd/chainntnfs/btcdnotify/btcd.go:564
#	0xb36da0	github.com/lightningnetwork/lnd/chainntnfs/btcdnotify.(*BtcdNotifier).historicalConfDetails+0x90			/home/ubuntu/gocode/src/github.com/lightningnetwork/lnd/chainntnfs/btcdnotify/btcd.go:480
#	0xb358fb	github.com/lightningnetwork/lnd/chainntnfs/btcdnotify.(*BtcdNotifier).notificationDispatcher+0x98b			/home/ubuntu/gocode/src/github.com/lightningnetwork/lnd/chainntnfs/btcdnotify/btcd.go:308

Basically, if we're blocked here waiting to fetch this block, then most of the other sub-systems aren't able to properly start up.

@Roasbeef Roasbeef added P2 should be fixed if one has time and removed P3 might get fixed, nice to have labels Jul 18, 2018
@Roasbeef
Member

Roasbeef commented Jul 18, 2018

If you could give the two linked PR's a spin that would be super helpful. I think both of these in combo should resolve these slow start up issues. Thanks!

@juscamarena
Contributor Author

juscamarena commented Jul 18, 2018

Running both PR's now and can now addinvoice quickly after startup.

Just started running it but seems stalled a bit with no peers.
Here's the profile: http://paste.ubuntu.com/p/3q3h5T5JFm/

I've also tried manually connecting to peers as shown in the logs below but seems it didn't stay connected and nothing else is being printed in the logs:
https://paste.ubuntu.com/p/Rhd4YdT33S/

Without the patches lnd did eventually start connecting outbound yesterday. Will follow up and see how long it takes now.

@Roasbeef
Member

In that trace, we see the same issue of the RPC client waiting on btcd to return a block:

1 @ 0x42e7aa 0x42e85e 0x4061a2 0x405e5b 0x833302 0x833583 0xb37529 0xb36d91 0xb358ec 0x45bd51
#	0x833301	github.com/lightningnetwork/lnd/vendor/github.com/btcsuite/btcd/rpcclient.receiveFuture+0x41				/home/ubuntu/gocode/src/github.com/lightningnetwork/lnd/vendor/github.com/btcsuite/btcd/rpcclient/infrastructure.go:797
#	0x833301	github.com/lightningnetwork/lnd/vendor/github.com/btcsuite/btcd/rpcclient.FutureGetBlockVerboseResult.Receive+0x41	/home/ubuntu/gocode/src/github.com/lightningnetwork/lnd/vendor/github.com/btcsuite/btcd/rpcclient/chain.go:119
#	0x833582	github.com/lightningnetwork/lnd/vendor/github.com/btcsuite/btcd/rpcclient.(*Client).GetBlockVerbose+0x42		/home/ubuntu/gocode/src/github.com/lightningnetwork/lnd/vendor/github.com/btcsuite/btcd/rpcclient/chain.go:154
#	0xb37528	github.com/lightningnetwork/lnd/chainntnfs/btcdnotify.(*BtcdNotifier).confDetailsManually+0x1a8				/home/ubuntu/gocode/src/github.com/lightningnetwork/lnd/chainntnfs/btcdnotify/btcd.go:564
#	0xb36d90	github.com/lightningnetwork/lnd/chainntnfs/btcdnotify.(*BtcdNotifier).historicalConfDetails+0x90			/home/ubuntu/gocode/src/github.com/lightningnetwork/lnd/chainntnfs/btcdnotify/btcd.go:480
#	0xb358eb	github.com/lightningnetwork/lnd/chainntnfs/btcdnotify.(*BtcdNotifier).notificationDispatcher+0x98b			/home/ubuntu/gocode/src/github.com/lightningnetwork/lnd/chainntnfs/btcdnotify/btcd.go:308

@Roasbeef
Member

So it would appear that the issue might actually be with btcd. btcd supports the same profiling (even the same arg) that lnd does, so I think a set of profiles or goroutine dumps from btcd while lnd is starting up would allow us to get to the bottom of the issue at hand.
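For anyone wanting to grab such a dump programmatically, here is a rough sketch. It assumes btcd's profiling server is enabled and listening on localhost port 6060 (both are placeholder values, not taken from this thread) and that it exposes the standard Go `net/http/pprof` endpoints. The helper names `goroutineDumpURL` and `fetchDump` are made up for this example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// goroutineDumpURL builds the standard net/http/pprof goroutine endpoint.
// debug=1 requests the aggregated text form ("N @ 0x..." followed by frames).
func goroutineDumpURL(host string, port int) string {
	return fmt.Sprintf("http://%s:%d/debug/pprof/goroutine?debug=1", host, port)
}

// fetchDump downloads the dump at url and copies it to w.
func fetchDump(url string, w io.Writer) error {
	client := &http.Client{Timeout: 30 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	_, err = io.Copy(w, resp.Body)
	return err
}

func main() {
	// Host and port are assumptions; use whatever the profiling server
	// was actually started with.
	url := goroutineDumpURL("localhost", 6060)
	if err := fetchDump(url, os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, "fetch failed:", err)
	}
}
```

Taking the dump while lnd is still stuck in startup is the point, since that is when the stalled handler on the btcd side should be visible.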

@Roasbeef
Member

Related PR btcsuite/btcd#1228

@Roasbeef
Member

With that PR merged, can you try updating your btcd and giving it another shot? The other two related PRs have also been merged (async invoice ntfn, async rescan).

@juscamarena
Contributor Author

I've applied @wpaulino's patch to make use of txindex and not manually scan all the way back to the channel opening time. That seems to have made everything snappy. I haven't tried the btcd patch, but I can say for sure the cause was the ghost pendingchannels: with many of them, it took ages to rescan for a txid that won't ever appear. Is there any way to fix the wrong balance, or do I have to manually rescan everything like I did last time?

I applied this patch: https://pastebin.com/z4b2WMYk
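As a rough illustration of why the index lookup helps so much for a txid that never confirms, here is a toy in-memory model. It is not the patch above and does not use btcd's API; `block`, `buildChain`, `findByScan`, and `findByIndex` are hypothetical names invented for the sketch. A manual scan has to inspect every block from the starting height to the tip before giving up, while an index lookup answers in one step.

```go
package main

import "fmt"

// block is a toy stand-in for a chain block: a height and its txids.
type block struct {
	height int
	txids  []string
}

// buildChain creates n blocks, each holding one txid "tx<height>", plus a
// txid -> height map playing the role of a transaction index.
func buildChain(n int) ([]block, map[string]int) {
	chain := make([]block, 0, n)
	index := make(map[string]int, n)
	for h := 0; h < n; h++ {
		txid := fmt.Sprintf("tx%d", h)
		chain = append(chain, block{height: h, txids: []string{txid}})
		index[txid] = h
	}
	return chain, index
}

// findByScan walks the chain from startHeight to the tip looking for txid.
// It also reports how many blocks it had to inspect.
func findByScan(chain []block, txid string, startHeight int) (height, inspected int, found bool) {
	for _, b := range chain {
		if b.height < startHeight {
			continue
		}
		inspected++
		for _, id := range b.txids {
			if id == txid {
				return b.height, inspected, true
			}
		}
	}
	return 0, inspected, false
}

// findByIndex answers the same question with a single map lookup.
func findByIndex(index map[string]int, txid string) (int, bool) {
	h, ok := index[txid]
	return h, ok
}

func main() {
	chain, index := buildChain(1000)

	// A "ghost" txid that will never appear forces a full scan to the tip.
	_, inspected, found := findByScan(chain, "ghost-txid", 100)
	fmt.Println("scan:  found =", found, "blocks inspected =", inspected)

	_, ok := findByIndex(index, "ghost-txid")
	fmt.Println("index: found =", ok)
}
```

With many ghost pending channels, the scan cost is paid once per channel on every startup, which matches the slowdown described in this thread.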
