-
Notifications
You must be signed in to change notification settings - Fork 926
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: drains a channel to crash the daemon #2688
test: drains a channel to crash the daemon #2688
Conversation
So much to unpack here! First, "spendable_msat" is unreliable, as you've discovered. It's too simplistic for a funder, since we have to pay for tx fees for the enlarged tx as well (known issue, but fairly easy to fix). Secondly, even when I change riskfactor to 0, it can't find a route. That's because we consider our own fee. I did that because we want to prioritize cheap channels over expensive ones, even though we're not paying it ourselves. But correctness comes first, so I think I need to change that (it's hard to have both, unfortunately.). Let me fix those two first, then get to the Real Issue. |
|
First hop is always free (the last hop is free only if it is the first hop), where hop refers to a channel. The routing algorithm weighs the first hop as part of the cost of the route, but it should not. (fixing this would have also fixed #2119 without weakening the route randomization |
Ok, I got nerd sniped and spent almost the whole day making spendable_msat accurate. Then I fixed routing. Still didn't get to the crash, that's tomorrow's job! But trust me, everything else (work-related) is on hold until this is fixed... |
Great. But how can you make spendable accurate without knowing the dynamic parameters (fees, risk, fuzz, htlc fee ...) in advance? |
|
In a certain way (depending on the viewpoint) opposite of spendable (receivable) is accurate, as the |
@rustyrussell What do you think about passing the exceeded amounts into the exception so they can be used by upper layers like this?: #2691 |
OK, how's this? I spent a lot of time fixing spendable_msat; it's not perfect but it's much better than it was. Then I actually fixed the bug you found. Finally, I explained what was happening with your test. Phew! |
@rustyrussell Wow, nice you tracked it down so fast and precisely! I was not really able to understand what was going wrong, only that it was going wrong. I will fixup some Travis complaints so we can get this polished and finished quickly. I can then also update my Michael |
We have the
|
Is there a way we can easily edit: did it with |
495cc5e
to
ee019f8
Compare
Rebasing is fine: I hit end of day Friday here and had to push what I had. Good point about the test breakage, just needs to put common/htlc_trim.o into the Makefile. |
@cdecker this is now green for tests. I will rebuild |
|
It's a choice though; we could choose not to send the final tx which would drive the channel into this state. |
8e0ccfe
to
958748a
Compare
@rustyrussell Hm, I made a funny testcase where I initially send just |
Yes, spendable_msat still isn't perfect, especially in corner cases. But to be honest, the complexity of calculating that field is getting out of hand; it now really belongs in a plugin. I had to stop somewhere :) |
958748a
to
ce0f903
Compare
|
Yes, I think adding that feedback for local failures is an excellent idea.
If the amount would create a new output (ie. htlc output is not trimmed), the funder needs to pay more onchain fees. This means there's a corner case, if the funder can't afford that: we can't send an htlc which would not be trimmed.
A plugin could do everything we're doing here, basically, but I'm not proposing a plugin. What I'm saying is that if we didn't have the spendable_msat field already, I would not add all this code and would tell people to write a plugin.
There are no HTLC fees if the HTLC is small enough to be trimmed.
That's whatever the peer tells us in the channel_announce message. The max field is optional, so if it doesn't tell us anything we just use the total channel size (we have no idea what their reserve is). |
ce0f903
to
c6bd7fe
Compare
@rustyrussell okay, thanks for all the clarification. We can consider this PR ready, as it fixes the crash, improves on route finding with the correct amount and adds tests. I did another rebase to solve conflicts, lets wait for Travis and we need a review from @cdecker . We can finish #2691 in the aftermath. Update: We have a test that does not complete after rebase: |
c6bd7fe
to
6ebcf68
Compare
Update: rebased again, now |
f8ef455
to
cc9e115
Compare
Consider this? diff --git a/channeld/full_channel.c b/channeld/full_channel.c
index c6aadc3a3..ee14eb310 100644
--- a/channeld/full_channel.c
+++ b/channeld/full_channel.c
@@ -540,7 +543,10 @@ static enum channel_add_err add_htlc(struct channel *channel,
adding,
removing,
channel->funder);
- if (htlc_fee) *htlc_fee = fee; /* set fee output pointer if given */
+ /* set fee output pointer if given (they want max) */
+ if (htlc_fee && amount_sat_greater(fee, *htlc_fee))
+ *htlc_fee = fee;
+
if (amount_msat_less_sat(balance, fee)) {
status_trace("Funder could not afford own fee %s with %s above reserve",
type_to_string(tmpctx,
@@ -556,7 +562,9 @@ static enum channel_add_err add_htlc(struct channel *channel,
adding,
removing,
!channel->funder);
- if (htlc_fee) *htlc_fee = fee; /* set fee output pointer if given */
+ /* set fee output pointer if given (they want max) */
+ if (htlc_fee && amount_sat_greater(fee, *htlc_fee))
+ *htlc_fee = fee;
if (amount_msat_less_sat(balance, fee)) {
status_trace("Funder could not afford peer's fee %s with %s above reserve",
type_to_string(tmpctx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, that was a lot to unpack, thanks to @rustyrussell for the detailed analysis, and the fixes 👍
ACK cc9e115
tests/test_gossip.py
Outdated
spendable_l2_bak = spendable_l2 | ||
while spendable_l1_bak == spendable_l1 or spendable_l2_bak == spendable_l2: | ||
spendable_l1 = l1.rpc.listpeers()['peers'][0]['channels'][0]['spendable_msat'] | ||
spendable_l2 = l2.rpc.listpeers()['peers'][0]['channels'][0]['spendable_msat'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might want to reduce the RPC pressure by using wait_for
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but its not in the final test after refactor. The following commit fixes it:
1aea063 pytest: extract separate tests that spendable_msat is accurate.
If you still insist I can rewrite the history (test has been moved also).
Okay, I will do another Update with the feedback. hopefully we can get this merged. |
Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
This is where payment tests should go. Also mark it xfail for the moment, and remove developer-only tag (propagating gossip is only 60 seconds, which is OK). Signed-off-by: Rusty Russell <[email protected]>
Turns out we needed more comprehensive testing; we ended up with three separate tests. To avoid changing test_channel_drainage as we fix spendable_msat, I substituted raw numbers there. The first is a variation of the existing tests, testing we can't exceed spendable_msat, and we can pay it, both ways. The second is with a larger amount, which triggers a different problem. The final is with a giant channel, which tests our 2^32-1 msat cap. Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
Take into account the fee we'd have to pay if we're the funder, and also drop to 0 if the amount is less than the smallest HTLC the peer will accept. Signed-off-by: Rusty Russell <[email protected]>
This means there's now a semantic difference between the default `fromid` and setting `fromid` explicitly to our own node_id. In the default case, it means we don't charge ourselves fees on the route. This means we can spend the full channel balance. We still want to consider the pricing of local channels, however: there's a *reason* to discount one over another, and that is to bias things. So we add the first-hop fee to the *risk* value instead. Signed-off-by: Rusty Russell <[email protected]>
The current calculation ignores them, which is unrealistic. Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
Subtracting both arbitrarily reduces our capacity, even for ourselves since the routing logic uses this maximum. I also changed 'advertise' to 'advertize', since we use american spelling. Signed-off-by: Rusty Russell <[email protected]>
cc9e115
to
6c222bc
Compare
We track whether each change is affordable as we go; test_channel_drainage got us so close that the difference mattered; we hit an assert when we tried to commit the tx and realized we couldn't afford it. We should not be trying to add an HTLC if it will result in the funder being unable to afford it on either the local *or remote* commitments. Note the test still "fails" because it refuses to send the final payment. Signed-off-by: Rusty Russell <[email protected]>
Remove gratuitous prints, add explanations of what's going on, and demonstrate that we can add a final trimmed HTLC but not a non-trimmed one. Signed-off-by: Rusty Russell <[email protected]>
6c222bc
to
c69beb6
Compare
ACK c69beb6 Will wait for @rustyrussell to give it one more round and let him merge 👍 |
Test to reproducible crash a remote channeld
This adds a complicated test that demonstrates howto crash a remote
channeld
when intentionally draining a to a technical minimal value 'twice'.[l1] -> [l2]
setup with 1million satoshi.l1
sendingl2
the exact maximum value (976559200msat
) required to get around routing and HTLC fee checks. The steps are commented in source.13440800msat
htlc_fee:10860sat
=> second drain amount:2580800msat
).l2
is ac-lightning
node and tries to send a normal payment of i.e.100000sat
(100k sat). This will immideately crash thechanneld
with the attached stacktrace.l2
sends a smaller initial payment back, i.e. just10000sat
(10ksat), this will unlock the broken state. The channel is then normally operable again.l2
is a LND node, and will reject 'bigger' payments in this channel until a small enough one is being made first.Notes
l2
crashing will result in a unilateral close.drain
plugin, simply drain a channel 'twice' and try to use it afterwards. The drain plugin will split the amount in chunks to fit inbound capacities and estimate required HTLC for the last chunk.Stacktrace