Parameterize ChannelManager by a Router #1812

valentinewallace
Contributor

@valentinewallace valentinewallace commented Oct 28, 2022

This will be used in upcoming work to fetch routes on-the-fly for trampoline payments.

It's all boilerplate.

Based on #1811
Based on #1862
Based on #1923
Based on #1928
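
For readers unfamiliar with the pattern, the following is a minimal sketch of what parameterizing a manager type by a router can look like. The `Route`, `Router`, `FixedRouter`, and `Manager` definitions are simplified stand-ins invented for illustration; they are not LDK's actual types or signatures.

```rust
use std::ops::Deref;
use std::sync::Arc;

/// Hypothetical, simplified route: just a list of hop names.
#[derive(Debug, Clone, PartialEq)]
pub struct Route {
    pub hops: Vec<String>,
}

/// Hypothetical router interface (an assumption, not LDK's real `Router` trait).
pub trait Router {
    fn find_route(&self, destination: &str) -> Result<Route, String>;
}

/// Toy router that always routes through one fixed intermediate hop.
pub struct FixedRouter {
    pub via: String,
}

impl Router for FixedRouter {
    fn find_route(&self, destination: &str) -> Result<Route, String> {
        if destination.is_empty() {
            return Err("empty destination".to_string());
        }
        Ok(Route { hops: vec![self.via.clone(), destination.to_string()] })
    }
}

/// The manager is generic over any pointer-like `R` whose target implements
/// `Router`, so callers may hand it `&FixedRouter`, `Arc<FixedRouter>`, etc.
pub struct Manager<R: Deref>
where
    R::Target: Router,
{
    router: R,
}

impl<R: Deref> Manager<R>
where
    R::Target: Router,
{
    pub fn new(router: R) -> Self {
        Manager { router }
    }

    /// Looks up a route on the fly instead of requiring the caller to supply one.
    pub fn route_to(&self, destination: &str) -> Result<Route, String> {
        self.router.find_route(destination)
    }
}

/// Example wiring: share one router behind an `Arc`.
pub fn route_via_shared_router(destination: &str) -> Result<Route, String> {
    let router = Arc::new(FixedRouter { via: "hub".to_string() });
    let manager = Manager::new(Arc::clone(&router));
    manager.route_to(destination)
}
```

Bounding on `Deref` rather than on `Router` directly is what makes the change mostly boilerplate: every place that names the manager type gains one more generic parameter, but callers keep choosing how the router is owned.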

@valentinewallace valentinewallace force-pushed the 2022-10-chanman-router-param branch from 2007c48 to 8c2e3f0 Compare October 28, 2022 15:45
@jkczyz jkczyz self-requested a review October 28, 2022 16:45
@valentinewallace valentinewallace force-pushed the 2022-10-chanman-router-param branch from 8c2e3f0 to 2a8d727 Compare October 28, 2022 20:25
@valentinewallace valentinewallace marked this pull request as draft October 28, 2022 20:28
@codecov-commenter

codecov-commenter commented Oct 28, 2022

Codecov Report

Base: 90.75% // Head: 90.75% // Decreases project coverage by -0.00% ⚠️

Coverage data is based on head (04e31f1) compared to base (7d84a45).
Patch coverage: 84.14% of modified lines in pull request are covered.

❗ Current head 04e31f1 differs from pull request most recent head 9f1e473. Consider uploading reports for the commit 9f1e473 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1812      +/-   ##
==========================================
- Coverage   90.75%   90.75%   -0.01%     
==========================================
  Files          96       96              
  Lines       50082    50186     +104     
  Branches    50082    50186     +104     
==========================================
+ Hits        45453    45544      +91     
- Misses       4629     4642      +13     
Impacted Files Coverage Δ
lightning-block-sync/src/init.rs 91.11% <ø> (ø)
lightning/src/ln/payment_tests.rs 98.73% <ø> (ø)
lightning/src/ln/peer_handler.rs 55.82% <ø> (ø)
lightning/src/ln/priv_short_conf_tests.rs 96.54% <ø> (ø)
lightning/src/ln/reorg_tests.rs 100.00% <ø> (ø)
lightning/src/util/test_utils.rs 71.52% <52.17%> (-1.57%) ⬇️
lightning/src/util/ser_macros.rs 86.98% <60.00%> (-0.30%) ⬇️
lightning-background-processor/src/lib.rs 95.45% <100.00%> (+0.02%) ⬆️
lightning-invoice/src/payment.rs 89.48% <100.00%> (ø)
lightning-invoice/src/utils.rs 97.76% <100.00%> (+0.74%) ⬆️
... and 15 more

@valentinewallace valentinewallace force-pushed the 2022-10-chanman-router-param branch 3 times, most recently from 988459a to e5a793d Compare November 1, 2022 01:12
@valentinewallace valentinewallace marked this pull request as ready for review November 1, 2022 15:45
@valentinewallace valentinewallace force-pushed the 2022-10-chanman-router-param branch from e5a793d to e1e7053 Compare November 4, 2022 15:14
@jkczyz
Contributor

jkczyz commented Nov 7, 2022

Something that just came to mind, do we care that a custom router may need to do network I/O? This would be the case for delegating to a server for routing.

@valentinewallace
Contributor Author

> Something that just came to mind, do we care that a custom router may need to do network I/O? This would be the case for delegating to a server for routing.

Oof. I guess with trampoline, we're moving towards a world where beefy LSP nodes handle routing, which arguably makes the use case less important. It does seem like a downside, though.

@TheBlueMatt
Collaborator

Right, so I think it may actually be okay here - if we're doing this processing in process_pending_htlc_forwards (which I think is where we'll end up?), we can expose an async version that starts by fetching the routes it thinks it's gonna need, and then goes into common sync code to actually send. This is gonna imply some changes, though: it means we'll want to take the Router, in one form or another, as an argument to process_pending_htlc_forwards (and, I guess, send), as we'll have to be able to accept both a sync and a non-sync version.
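
The two-phase idea described above (fetch routes asynchronously first, then run shared synchronous sending code) could be sketched roughly as follows. Everything here is an assumption made for illustration: `Route`, `Router`, `PrefetchedRouter`, and both `process_pending_forwards*` functions are hypothetical and do not reflect how LDK ended up implementing this.

```rust
use std::collections::HashMap;
use std::future::Future;

/// Hypothetical, simplified route.
#[derive(Debug, Clone, PartialEq)]
pub struct Route {
    pub hops: Vec<String>,
}

/// Hypothetical synchronous router interface.
pub trait Router {
    fn find_route(&self, destination: &str) -> Option<Route>;
}

/// A synchronous router backed entirely by routes fetched ahead of time.
pub struct PrefetchedRouter {
    pub routes: HashMap<String, Route>,
}

impl Router for PrefetchedRouter {
    fn find_route(&self, destination: &str) -> Option<Route> {
        self.routes.get(destination).cloned()
    }
}

/// Common synchronous path: "sends" to every pending destination for which the
/// given router has a route, and returns the destinations that were sent to.
pub fn process_pending_forwards<R: Router>(router: &R, pending: &[String]) -> Vec<String> {
    pending
        .iter()
        .filter(|dest| router.find_route(dest.as_str()).is_some())
        .cloned()
        .collect()
}

/// Async entry point: all awaiting happens up front, then the sync path runs.
pub async fn process_pending_forwards_async<F, Fut>(fetch_route: F, pending: &[String]) -> Vec<String>
where
    F: Fn(String) -> Fut,
    Fut: Future<Output = Option<Route>>,
{
    // Phase 1 (async): fetch every route we expect to need, e.g. from a server.
    let mut routes = HashMap::new();
    for dest in pending {
        if let Some(route) = fetch_route(dest.clone()).await {
            routes.insert(dest.clone(), route);
        }
    }
    // Phase 2 (sync): reuse the same code the synchronous entry point uses.
    process_pending_forwards(&PrefetchedRouter { routes }, pending)
}
```

The async function can be driven by any executor. The trade-off of this shape is that the prefetch step has to guess which routes will be needed before the synchronous code runs.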

@TheBlueMatt
Collaborator

Needs rebase - I assume we're going to make the Router required explicitly now?

@valentinewallace
Contributor Author

Yeah, although do we want to put off landing this until retries is closer?

@TheBlueMatt
Collaborator

Yea, that sounds good to me, just checking to make sure you're not waiting on something here.

@valentinewallace
Contributor Author

Moved two more commits over from #1916

Comment on lines 75 to 84
-pub struct TestRouter {}
+pub struct TestRouter<'a> {
+    pub network_graph: Arc<NetworkGraph<&'a TestLogger>>,
+}
+
+impl<'a> TestRouter<'a> {
+    pub fn new(network_graph: Arc<NetworkGraph<&'a TestLogger>>) -> Self {
+        Self { network_graph }
+    }
+}
Contributor

Any reason why we don't use DefaultRouter like elsewhere? Could possibly type alias TestRouter to it.

Contributor Author

It would complicate the test utils a bit because we'd also need to have a scorer. Can file an issue for follow-up if you prefer.

Comment on lines +505 to +509
Arc<DefaultRouter<
Arc<NetworkGraph<Arc<L>>>,
Arc<L>,
Arc<Mutex<ProbabilisticScorer<Arc<NetworkGraph<Arc<L>>>, Arc<L>>>>
>>,
Contributor

What's the reason for specifying a router if we don't specify, say, a chain::Watch as ChainMonitor? More generally, when should we specify a parameterization here vs leaving the choice to the user?

Contributor Author

I don't see an issue with parameterizing by ChainMonitor and taking a Persist in Simple*ChanMan instead. If a good candidate for parameterization exists, seems like we should use it.

Contributor

Hmm... I think the idea for this type alias is to simplify using Arcs, not necessarily to define reasonable defaults. Then again, it's already parameterized with a KeysManager. 🤷‍♂️ At the very least the docs should be updated accordingly. Just seems any such alias should be defined at a higher level if we are going to be opinionated. Or have another alias like DefaultArcChannelManager or the like.

Collaborator

Ah, I'd always interpreted it as a Simple manager that picks defaults for things you Definitely want to Just Use, and leaves the rest up to you. If we are just adding Arcs I'm not really sure it's worth defining a public type alias?

Contributor

I guess the docs aren't super clear. It says "Defining these type aliases prevents issues such as overly long function definitions." And then goes on to state that KeysManager was chosen as a concrete type. But we don't do the same with ChainMonitor. We can define the aliases for whatever reason we want, of course. Just looking for some consistency. Changing it may break users, though, if they are relying on it for their own parameterization, which is probably unlikely and can be easily fixed with their own alias.

Contributor Author

Updated the docs. Browsing the git blame, I think the ChainMonitor not being specified may be a legacy thing and could be updated for consistency (probably not in this PR though)
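
To make the trade-off in this thread concrete, here is a minimal sketch of an "opinionated" alias that both adds `Arc`s and pins a concrete router, leaving only the logger type open. All names (`Logger`, `Router`, `VecLogger`, `DefaultRouter`, `Manager`, `SimpleArcManager`, `describe`) are simplified stand-ins assumed for illustration; LDK's real types take more parameters.

```rust
use std::ops::Deref;
use std::sync::{Arc, Mutex};

/// Hypothetical logger interface.
pub trait Logger {
    fn log(&self, msg: &str);
}

/// Hypothetical router interface.
pub trait Router {
    fn name(&self) -> &'static str;
}

/// Logger that records lines in memory.
pub struct VecLogger {
    pub lines: Mutex<Vec<String>>,
}

impl Logger for VecLogger {
    fn log(&self, msg: &str) {
        self.lines.lock().unwrap().push(msg.to_string());
    }
}

/// Stand-in default router, itself generic over a logger pointer.
pub struct DefaultRouter<L: Deref>
where
    L::Target: Logger,
{
    pub logger: L,
}

impl<L: Deref> Router for DefaultRouter<L>
where
    L::Target: Logger,
{
    fn name(&self) -> &'static str {
        self.logger.log("name() called");
        "default"
    }
}

/// Fully generic manager: the user picks both the router and the logger.
pub struct Manager<R: Deref, L: Deref>
where
    R::Target: Router,
    L::Target: Logger,
{
    pub router: R,
    pub logger: L,
}

/// Opinionated alias: wraps everything in `Arc` and fixes the router type,
/// so only the logger remains a parameter.
pub type SimpleArcManager<L> = Manager<Arc<DefaultRouter<Arc<L>>>, Arc<L>>;

/// With the alias, a function signature stays short.
pub fn describe<L: Logger>(manager: &SimpleArcManager<L>) -> &'static str {
    manager.logger.log("describe() called");
    manager.router.name()
}
```

An alias that only added `Arc`s would keep `R` as a parameter; pinning `DefaultRouter` is the opinionated part being debated here.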

Comment on lines 254 to 249
pub fn set_next_update_ret(&self, next_ret: chain::ChannelMonitorUpdateStatus) {
    self.update_rets.lock().unwrap().push_back(next_ret);
}
Contributor

Is the change in semantics here intentional? Previously, only calling set_next_update_ret would result in first returning Completed and then repeatedly returning next_ret. Now, it results in repeatedly returning next_ret only.

Contributor Author

Not really intentional. The intention was to make this util support the tests added in #1916 while still supporting existing tests.

@valentinewallace valentinewallace force-pushed the 2022-10-chanman-router-param branch from dc8cfea to fde0555 Compare January 3, 2023 16:35
@TheBlueMatt
Collaborator

Needs rebase, sadly, should be super trivial, though.

This will be used in upcoming work to fetch routes on-the-fly for payment
retries, which will no longer be the responsibility of InvoicePayer.
@valentinewallace valentinewallace force-pushed the 2022-10-chanman-router-param branch from fde0555 to 902b70c Compare January 3, 2023 20:39
@valentinewallace
Contributor Author

Rebased

@TheBlueMatt
Collaborator

Oops, CI is mad, I think it's a conflict with #1934

@valentinewallace valentinewallace force-pushed the 2022-10-chanman-router-param branch from 902b70c to 04e31f1 Compare January 4, 2023 16:10
@valentinewallace valentinewallace force-pushed the 2022-10-chanman-router-param branch from 04e31f1 to 9f1e473 Compare January 4, 2023 17:50
-pub fn set_next_update_ret(&self, next_ret: Option<chain::ChannelMonitorUpdateStatus>) {
-    *self.next_update_ret.lock().unwrap() = next_ret;
+/// Queue an update status to return.
+pub fn set_update_ret(&self, next_ret: chain::ChannelMonitorUpdateStatus) {
Contributor

Could you rename this push_update_ret? Would make the call sites more sensible, IMO.

Contributor Author

Yeah, I'd rather do that in a follow-up though, because there are a lot of call sites.

valentinewallace and others added 6 commits January 5, 2023 11:23
Useful in upcoming work for payment retries.
This is useful in the type serialization definition macros to avoid
writing or reading a field at all, simply using a static value on
each reload.
.. to disambiguate from check_encoded_tlv_order
@valentinewallace valentinewallace force-pushed the 2022-10-chanman-router-param branch from 9f1e473 to 19516c0 Compare January 5, 2023 16:29
Contributor

@jkczyz jkczyz left a comment

Happy to merge as is and do any follow-ups later.

Collaborator

@TheBlueMatt TheBlueMatt left a comment

First comment can just be fixed in a later PR, second one needs a followup.

@@ -6574,6 +6606,11 @@ pub struct ChannelManagerReadArgs<'a, M: Deref, T: Deref, K: Deref, F: Deref, L:
/// used to broadcast the latest local commitment transactions of channels which must be
/// force-closed during deserialization.
pub tx_broadcaster: T,
/// The router which will be used in the ChannelManager in the future for finding routes
/// on-the-fly for trampoline payments. Absent in private nodes that don't support forwarding.
Collaborator

This last sentence isn't true anymore.

-/// If this is set to Some(), after the next return, we'll always return this until update_ret
-/// is changed:
-pub next_update_ret: Mutex<Option<chain::ChannelMonitorUpdateStatus>>,
+/// The queue of update statuses we'll return. If none are queued, ::Completed will always be
Collaborator

This makes our tests very brittle - if something changes which causes us to persist an extra monitor because of some commitment-update-ordering or batching changes, suddenly we'll run out of monitor update results and the test behavior will change. Instead, we should consider tracking whether monitor update status results were set at all, and if they were, panic if we try to get a result and no results are available, and also panic on Drop if results were not completely used.

Contributor Author

> Instead, we should consider tracking whether monitor update status results were set at all, and if they were, panic if we try to get a result and no results are available

To confirm, the goal with this would be to force tests to explicitly adapt to e.g. a new monitor update rather than having their behavior silently change?

Collaborator

Right, a test must either (a) never call set_update_ret or (b) call it exactly the same number of times as there are persistence events.

Contributor Author

Hmm, so that causes 30 test failures. They don't look too difficult, but I hadn't realized how deep this refactor was and I think I can rewrite the tests to avoid needing it more easily than complete it. Thoughts on reverting this commit?

Collaborator

I'm okay with that, too. I mean I do think this commit cleaned up the test util here, and the above suggestions would clean it up even more, but if we really don't want to go down that path we can walk it back.

Contributor Author

Okay, I'll probably go through with the refactor in parallel to #1916 then. May not be able to follow up on this until sometime next week.
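
The stricter test utility proposed in this thread could look roughly like the sketch below: return `Completed` if nothing was ever queued, panic if results were queued but have run out, and panic on `Drop` if queued results were left unused. `UpdateStatus`, `StrictStatusQueue`, `push_update_ret`, and `next_update_ret` are hypothetical names chosen for illustration (the enum is a stand-in for `chain::ChannelMonitorUpdateStatus`); this is not the code that was merged.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Stand-in for `chain::ChannelMonitorUpdateStatus` (assumed variants).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum UpdateStatus {
    Completed,
    InProgress,
    PermanentFailure,
}

pub struct StrictStatusQueue {
    /// `None` until a test queues its first status, `Some(queue)` afterwards.
    queued: Mutex<Option<VecDeque<UpdateStatus>>>,
}

impl StrictStatusQueue {
    pub fn new() -> Self {
        StrictStatusQueue { queued: Mutex::new(None) }
    }

    /// Queue a status to be returned by a later persistence call.
    pub fn push_update_ret(&self, status: UpdateStatus) {
        self.queued
            .lock()
            .unwrap()
            .get_or_insert_with(VecDeque::new)
            .push_back(status);
    }

    /// Called once per persistence event.
    pub fn next_update_ret(&self) -> UpdateStatus {
        // The lock guard is released at the end of this statement, so a panic
        // below cannot poison the mutex.
        let next = match self.queued.lock().unwrap().as_mut() {
            // Case (a): the test never queued anything, so always succeed.
            None => return UpdateStatus::Completed,
            // Case (b): the test opted in, so every call must have a result.
            Some(queue) => queue.pop_front(),
        };
        next.expect("more persistence events occurred than statuses were queued")
    }
}

impl Drop for StrictStatusQueue {
    fn drop(&mut self) {
        // Avoid a double panic if the test is already unwinding.
        if std::thread::panicking() {
            return;
        }
        if let Ok(slot) = self.queued.get_mut() {
            if let Some(queue) = slot.as_ref() {
                assert!(queue.is_empty(), "queued update statuses were never used");
            }
        }
    }
}
```

With this shape, a change that adds or removes a monitor persistence makes affected tests fail loudly instead of silently altering their behavior, which is the goal confirmed above.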

@TheBlueMatt TheBlueMatt merged commit b79ff71 into lightningdevkit:main Jan 5, 2023
@TheBlueMatt TheBlueMatt mentioned this pull request Jan 5, 2023
k0k0ne pushed a commit to bitlightlabs/rust-lightning that referenced this pull request Sep 30, 2024
0.0.114 - Mar 3, 2023 - "Faster Async BOLT12 Retries"

API Updates
===========

 * `InvoicePayer` has been removed and its features moved directly into
   `ChannelManager`. As such it now requires a simplified `Router` and supports
   `send_payment_with_retry` (and friends). `ChannelManager::retry_payment` was
   removed in favor of the automated retries. Invoice payment utilities in
   `lightning-invoice` now call the new code (lightningdevkit#1812, lightningdevkit#1916, lightningdevkit#1929, lightningdevkit#2007, etc).
 * `Sign`/`BaseSign` has been renamed `ChannelSigner`, with `EcdsaChannelSigner`
   split out in anticipation of future schnorr/taproot support (lightningdevkit#1967).
 * The catch-all `KeysInterface` was split into `EntropySource`, `NodeSigner`,
   and `SignerProvider`. `KeysManager` implements all three (lightningdevkit#1910, lightningdevkit#1930).
 * `KeysInterface::get_node_secret` is now `KeysManager::get_node_secret_key`
   and is no longer required for external signers (lightningdevkit#1951, lightningdevkit#2070).
 * A `lightning-transaction-sync` crate has been added which implements keeping
   LDK in sync with the chain via an esplora server (lightningdevkit#1870). Note that it can
   only be used on nodes that *never* ran a previous version of LDK.
 * `Score` is updated in `BackgroundProcessor` instead of via `Router` (lightningdevkit#1996).
 * `ChainAccess::get_utxo` (now `UtxoAccess`) can now be resolved async (lightningdevkit#1980).
 * BOLT12 `Offer`, `InvoiceRequest`, `Invoice` and `Refund` structs as well as
   associated builders have been added. Such invoices cannot yet be paid due to
   missing support for blinded path payments (lightningdevkit#1927, lightningdevkit#1908, lightningdevkit#1926).
 * A `lightning-custom-message` crate has been added to make combining multiple
   custom messages into one enum/handler easier (lightningdevkit#1832).
 * `Event::PaymentPathFailure` is now generated for failure to send an HTLC
   over the first hop on our local channel (lightningdevkit#2014, lightningdevkit#2043).
 * `lightning-net-tokio` no longer requires an `Arc` on `PeerManager` (lightningdevkit#1968).
 * `ChannelManager::list_recent_payments` was added (lightningdevkit#1873).
 * `lightning-background-processor` `std` is now optional in async mode (lightningdevkit#1962).
 * `create_phantom_invoice` can now be used in `no-std` (lightningdevkit#1985).
 * The required final CLTV delta on inbound payments is now configurable (lightningdevkit#1878).
 * bitcoind RPC error code and message are now surfaced in `block-sync` (lightningdevkit#2057).
 * Get `historical_estimated_channel_liquidity_probabilities` was added (lightningdevkit#1961).
 * `ChannelManager::fail_htlc_backwards_with_reason` was added (lightningdevkit#1948).
 * Macros which implement serialization using TLVs or straight writing of struct
   fields are now public (lightningdevkit#1823, lightningdevkit#1976, lightningdevkit#1977).

Backwards Compatibility
=======================

 * Any inbound payments with a custom final CLTV delta will be rejected by LDK
   if you downgrade prior to receipt (lightningdevkit#1878).
 * `Event::PaymentPathFailed::network_update` will always be `None` if an
   0.0.114-generated event is read by a prior version of LDK (lightningdevkit#2043).
 * `Event::PaymentPathFailed::all_paths_removed` will always be false if an
   0.0.114-generated event is read by a prior version of LDK. Users who rely on
   it to determine payment retries should migrate to `Event::PaymentFailed`, in
   a separate release prior to upgrading to LDK 0.0.114 if downgrading is
   supported (lightningdevkit#2043).

Performance Improvements
========================

 * Channel data is now stored per-peer and channel updates across multiple
   peers can be operated on simultaneously (lightningdevkit#1507).
 * Routefinding is roughly 1.5x faster (lightningdevkit#1799).
 * Deserializing a `NetworkGraph` is roughly 6x faster (lightningdevkit#2016).
 * Memory usage for a `NetworkGraph` has been reduced substantially (lightningdevkit#2040).
 * `KeysInterface::get_secure_random_bytes` is roughly 200x faster (lightningdevkit#1974).

Bug Fixes
=========

 * Fixed a bug where a delay in processing a `PaymentSent` event longer than the
   time taken to persist a `ChannelMonitor` update, when occurring immediately
   prior to a crash, may result in the `PaymentSent` event being lost (lightningdevkit#2048).
 * Fixed spurious rejections of rapid gossip sync data when the graph has been
   updated by other means between gossip syncs (lightningdevkit#2046).
 * Fixed a panic in `KeysManager` when the high bit of `starting_time_nanos`
   is set (lightningdevkit#1935).
 * Resolved an issue where the `ChannelManager::get_persistable_update_future`
   future would fail to wake until a second notification occurs (lightningdevkit#2064).
 * Resolved a memory leak when using `ChannelManager::send_probe` (lightningdevkit#2037).
 * Fixed a deadlock on some platforms at least when using async `ChannelMonitor`
   updating (lightningdevkit#2006).
 * Removed debug-only assertions which were reachable in threaded code (lightningdevkit#1964).
 * In some cases when payment sending fails on our local channel, retries no
   longer take the same path again (and thus never succeed) (lightningdevkit#2014).
 * Retries for spontaneous payments have been fixed (lightningdevkit#2002).
 * Return an `Err` if `lightning-persister` fails to read the directory listing
   rather than panicking (lightningdevkit#1943).
 * `peer_disconnected` will now never be called without `peer_connected` (lightningdevkit#2035).

Security
========

0.0.114 fixes several denial-of-service vulnerabilities which are reachable from
untrusted input from channel counterparties or in deployments accepting inbound
connections or channels. It also fixes a denial-of-service vulnerability in rare
cases in the route finding logic.
 * The number of pending un-funded channels as well as peers without funded
   channels is now limited to avoid denial of service (lightningdevkit#1988).
 * A second `channel_ready` message received immediately after the first could
   lead to a spurious panic (lightningdevkit#2071). This issue was introduced with 0conf
   support in LDK 0.0.107.
 * A division-by-zero issue was fixed in the `ProbabilisticScorer` if the amount
   being sent (including previous-hop fees) is equal to a channel's capacity
   while walking the graph (lightningdevkit#2072). The division-by-zero was introduced with
   historical data tracking in LDK 0.0.112.

In total, this release features 130 files changed, 21457 insertions, 10113
deletions in 343 commits from 18 authors, in alphabetical order:
 * Alec Chen
 * Allan Douglas R. de Oliveira
 * Andrei
 * Arik Sosman
 * Daniel Granhão
 * Duncan Dean
 * Elias Rohrer
 * Jeffrey Czyz
 * John Cantrell
 * Kurtsley
 * Matt Corallo
 * Max Fang
 * Omer Yacine
 * Valentine Wallace
 * Viktor Tigerström
 * Wilmer Paulino
 * benthecarman
 * jurvis