
Use EffectiveCapacity in Score trait #1456

Merged
merged 6 commits into from
May 20, 2022

Conversation

@jkczyz (Contributor) commented Apr 27, 2022

Pass the EffectiveCapacity to Score::channel_penalty_msat so that scorers can penalize based on the type of channel (e.g., direct channels) and any other knowledge about it. For direct channels, the channel liquidity is known with certainty. Use this knowledge in ProbabilisticScorer by either penalizing with the per-hop penalty or u64::max_value depending on the amount.

Additionally, distinguish a channel's maximum HTLC from its capacity such that multiple HTLCs not exceeding the maximum HTLC value can be used for a single payment. Previously, the total value of all HTLCs for a payment was not allowed to exceed the channel's maximum HTLC limit.

Additionally, change EffectiveCapacity to prefer the channel's capacity over its maximum HTLC limit, but still use the latter for route finding. This allows for more accurate scoring.

Closes #1386
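The scoring idea in the description can be sketched as follows. This is a hedged, minimal illustration: the enum variants, the `channel_penalty_msat` signature, and the `BASE_PENALTY_MSAT` value are simplified stand-ins, not LDK's actual API.

```rust
// Sketch: with certain knowledge of a direct channel's liquidity, either
// charge the flat per-hop penalty or rule the channel out entirely.
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum EffectiveCapacity {
    /// Liquidity is known with certainty, e.g. for one of our direct channels.
    ExactLiquidity { liquidity_msat: u64 },
    /// Only the total capacity (from the funding transaction) is known.
    Total { capacity_msat: u64 },
    /// Nothing is known about the channel.
    Unknown,
}

const BASE_PENALTY_MSAT: u64 = 500; // hypothetical per-hop penalty

fn channel_penalty_msat(amount_msat: u64, capacity: EffectiveCapacity) -> u64 {
    match capacity {
        // Exact knowledge: the payment either fits or it cannot possibly fit.
        EffectiveCapacity::ExactLiquidity { liquidity_msat } => {
            if amount_msat > liquidity_msat { u64::max_value() } else { BASE_PENALTY_MSAT }
        }
        // Otherwise a probability-based penalty would apply (elided here).
        _ => BASE_PENALTY_MSAT,
    }
}

fn main() {
    let direct = EffectiveCapacity::ExactLiquidity { liquidity_msat: 1_000 };
    println!("{}", channel_penalty_msat(500, direct));
    println!("{}", channel_penalty_msat(2_000, direct));
}
```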

@TheBlueMatt (Collaborator)

Additionally, distinguish a channel's maximum HTLC from its capacity such that multiple HTLCs not exceeding the maximum HTLC value can be used for a single payment. Previously, the total value of all HTLCs for a payment was not allowed to exceed the channel's maximum HTLC limit.

Hmm, I don't think we should do this - not only is it a bit "rude", but it's also not clear that it will work in the general case - at least for our codebase (I haven't checked any others), the htlc_maximum_msat is set to the channel's max-htlc-in-flight, which is a total, not a per-htlc limit. For most channels on the network today (but not by default in LDK) it's the same limit, but either way, I don't think we should do this.

@jkczyz (Contributor, Author) commented Apr 28, 2022

Hmm, I don't think we should do this - not only is it a bit "rude", but it's also not clear that it will work in the general case - at least for our codebase (I haven't checked any others), the htlc_maximum_msat is set to the channel's max-htlc-in-flight, which is a total, not a per-htlc limit. For most channels on the network today (but not by default in LDK) it's the same limit, but either way, I don't think we should do this.

FYI, this can be reverted with a one-line change to the first commit. I think we still want the other changes there (i.e., re-defining effective capacity when the channel capacity is known for scoring purposes and accounting for direction so the correct maximum HTLC value is used).

Interestingly, I noticed that LDK uses one-tenth the channel capacity for max_htlc_value_in_flight_msat and the smaller of this and nine-tenths of the channel capacity for the announced HTLC maximum (i.e., always one-tenth).

fn get_holder_max_htlc_value_in_flight_msat(channel_value_satoshis: u64) -> u64 {
    channel_value_satoshis * 1000 / 10 //TODO
}

pub fn get_announced_htlc_max_msat(&self) -> u64 {
    return cmp::min(
        // Upper bound by capacity. We make it a bit less than full capacity to prevent attempts
        // to use full capacity. This is an effort to reduce routing failures, because in many cases
        // channel might have been used to route very small values (either by honest users or as DoS).
        self.channel_value_satoshis * 1000 * 9 / 10,
        self.holder_max_htlc_value_in_flight_msat
    );
}

@TheBlueMatt (Collaborator)

Interestingly, I noticed that LDK uses one-tenth the channel capacity for max_htlc_value_in_flight_msat and the smaller of this and nine-tenths of the channel capacity for the announced HTLC maximum (i.e., always one-tenth).

Yes, we do that because it's probably a better security stance, though I think most folks may not want that; we have a PR out now to make it configurable.

@TheBlueMatt (Collaborator)

Needs rebase already sorry :( Agree we should do all the other stuff, just noting that I don't think we should make the max change.

@jkczyz jkczyz force-pushed the 2022-04-effective-capacity branch from a546076 to 9d3fdbc Compare April 28, 2022 22:53
@codecov-commenter commented Apr 28, 2022

Codecov Report

Merging #1456 (cbc5e4f) into main (36817e0) will increase coverage by 0.03%.
The diff coverage is 96.67%.

❗ Current head cbc5e4f differs from pull request most recent head f259daf. Consider uploading reports for the commit f259daf to get more accurate results

@@            Coverage Diff             @@
##             main    #1456      +/-   ##
==========================================
+ Coverage   90.90%   90.93%   +0.03%     
==========================================
  Files          77       75       -2     
  Lines       42201    42032     -169     
==========================================
- Hits        38364    38223     -141     
+ Misses       3837     3809      -28     
Impacted Files Coverage Δ
lightning-invoice/src/payment.rs 92.75% <ø> (-0.16%) ⬇️
lightning/src/routing/router.rs 92.44% <80.43%> (-0.09%) ⬇️
lightning/src/routing/scoring.rs 95.28% <98.96%> (-0.92%) ⬇️
lightning/src/routing/network_graph.rs 90.03% <100.00%> (+0.15%) ⬆️
lightning/src/ln/peer_handler.rs 53.54% <0.00%> (-3.53%) ⬇️
lightning-net-tokio/src/lib.rs 75.15% <0.00%> (-0.76%) ⬇️
lightning/src/util/events.rs 33.21% <0.00%> (-0.69%) ⬇️
lightning/src/chain/chainmonitor.rs 97.63% <0.00%> (-0.29%) ⬇️
... and 16 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@vincenzopalazzo (Contributor) left a comment

LGTM!

@TheBlueMatt (Collaborator)

Hmm, so this is a pretty big performance regression - we end up going from 63-65ms for generate_mpp_routes_with_default_scorer to 72-76ms across three runs each. generate_routes_with_probabilistic_scorer goes from 34-39ms to 41-43ms. A few immediate ideas that may be worth trying, though these are just initial thoughts:

  • We could cache the channel capacity in the liquidity map to avoid re-calculating it across path runs
  • IIRC, in the past I'd noted that not inserting into the liquidity map at all, unless we're deducting from a channel after picking it for a path, was faster; maybe try that as well
  • Calculate the liquidity and htlc_max at the same time instead of from two callsites

@jkczyz (Contributor, Author) commented May 3, 2022

Hmm, so this is a pretty big performance regression - we end up going from 63-65ms for generate_mpp_routes_with_default_scorer to 72-76ms across three runs each. generate_routes_with_probabilistic_scorer goes from 34-39ms to 41-43ms. A few immediate ideas that may be worth trying, though these are just initial thoughts:

Locally, I'm seeing a narrower difference. I'm pretty consistently seeing 54-56ms for generate_routes_with_probabilistic_scorer before and 57-59 ms after. For generate_mpp_routes_with_default_scorer, I'm seeing 84-85ms before and 86-88ms after.

  • We could cache the channel capacity in the liquidity map to avoid re-calculating it across path runs

Ah, are these called more than once per candidate across path runs? Somehow I was under the impression we would only call channel_penalty_msat once per hop.

Another alternative would be to precompute both effective_capacity and available_liquidity in DirectedChannelInfo upon creation, which may remove some redundancy in addition to caching those values.

  • IIRC, in the past I'd noted that not inserting into the liquidity map at all, unless we're deducting from a channel after picking it for a path, was faster; maybe try that as well

We'd still need to compute available_liquidity, though, right? So it would just save inserting into the map, IIUC?

  • Calculate the liquidity and htlc_max at the same time instead of from two callsites

Looks like htlc_maximum_msat is only called from available_liquidity. Are you seeing another place?

@TheBlueMatt (Collaborator)

Ah, are these called more than once per candidate across path runs? Somehow I was under the impression we would only call channel_penalty_msat once per hop.

That is true, but if we're mpp'ing we may run the whole thing again.

We'd still need to compute available_liquidity, though, right? So it would just save inserting into the map, IIUC?

Yea, I was surprised when I saw it, I may have done something else wrong when I did it.

Looks like htlc_maximum_msat is only called from available_liquidity. Are you seeing another place?

In the current PR we call $candidate.available_liquidity and $candidate.effective_capacity separately, both of which have a branch tree over a similar set of variables, a good chunk of which may well be conditional moves. Doing them at the same time may help because they can effectively run in parallel.

@jkczyz (Contributor, Author) commented May 3, 2022

Interestingly, if I simply inline effective_capacity, I see 54-55ms for generate_routes_with_probabilistic_scorer (i.e., the low end prior to the PR) and 90-92ms for generate_mpp_routes_with_default_scorer (i.e., higher than without inlining, though the baseline was a little higher this time, too, oddly).

@TheBlueMatt (Collaborator)

Hmm, maybe LLVM is trying to be too clever (or the branch predictor gets mis-trained on the first run, or it's speculatively executing...) and running the calculation even though it's in the map on the mpp side?

@jkczyz (Contributor, Author) commented May 3, 2022

Another alternative would be to precompute both effective_capacity and available_liquidity in DirectedChannelInfo upon creation, which may remove some redundancy in addition to caching those values.

Ended up trying this alternative with results nearly at or better than baseline. 48-50ms for generate_routes_with_probabilistic_scorer and 84-86ms for generate_mpp_routes_with_default_scorer.
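The precomputation alternative can be sketched roughly as below. This is a hedged illustration, not LDK's actual `DirectedChannelInfo`: the constructor signature and field names are assumptions, and the `min` rule for the directed HTLC maximum is a simplification.

```rust
// Sketch: compute both cached values once, in a single branch tree, when the
// directed view is built, so later callers never re-branch over the same inputs.
use std::cmp;

#[allow(dead_code)]
struct DirectedChannelInfo {
    htlc_maximum_msat: u64,
    effective_capacity_msat: Option<u64>,
}

impl DirectedChannelInfo {
    fn new(capacity_msat: Option<u64>, advertised_htlc_max_msat: Option<u64>) -> Self {
        // One match covers all combinations of known/unknown inputs.
        let (htlc_max, capacity) = match (advertised_htlc_max_msat, capacity_msat) {
            // Both known: route-find with the smaller, score with the capacity.
            (Some(h), Some(c)) => (cmp::min(h, c), Some(c)),
            (Some(h), None) => (h, None),
            (None, Some(c)) => (c, Some(c)),
            (None, None) => (0, None),
        };
        Self { htlc_maximum_msat: htlc_max, effective_capacity_msat: capacity }
    }
}

fn main() {
    let d = DirectedChannelInfo::new(Some(1_000_000), Some(250_000));
    println!("{} {:?}", d.htlc_maximum_msat, d.effective_capacity_msat);
}
```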

@TheBlueMatt (Collaborator) left a comment

Hmm, so with the current PR rebased on upstream git (compared to upstream git) on my (rather dated, but otherwise running 0 other processes) Xeon E5-2687Wv3, on upstream I get:

test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench:  64,197,720 ns/iter (+/- 16,127,172)
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench:  73,048,099 ns/iter (+/- 18,632,756)
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench:  64,070,166 ns/iter (+/- 16,472,992)
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench:  72,745,462 ns/iter (+/- 18,781,611)

test routing::router::benches::generate_routes_with_probabilistic_scorer     ... bench:  34,797,920 ns/iter (+/- 8,054,483)
test routing::router::benches::generate_routes_with_probabilistic_scorer     ... bench:  39,555,016 ns/iter (+/- 9,120,697)
test routing::router::benches::generate_routes_with_probabilistic_scorer     ... bench:  34,395,637 ns/iter (+/- 7,866,909)
test routing::router::benches::generate_routes_with_probabilistic_scorer     ... bench:  39,116,608 ns/iter (+/- 8,948,311)

here I get:

test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench:  70,664,369 ns/iter (+/- 25,250,649)
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench:  79,796,083 ns/iter (+/- 28,318,648)
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench:  78,950,337 ns/iter (+/- 28,135,917)
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench:  76,527,245 ns/iter (+/- 27,194,683)

test routing::router::benches::generate_routes_with_probabilistic_scorer     ... bench:  39,876,164 ns/iter (+/- 12,172,601)
test routing::router::benches::generate_routes_with_probabilistic_scorer     ... bench:  40,562,187 ns/iter (+/- 12,510,269)
test routing::router::benches::generate_routes_with_probabilistic_scorer     ... bench:  44,109,397 ns/iter (+/- 13,310,276)
test routing::router::benches::generate_routes_with_probabilistic_scorer     ... bench:  43,525,739 ns/iter (+/- 13,309,311)

Some of the other scorers see a much larger regression, but that makes sense: we're now doing more work than we were for no benefit for, e.g., the zero scorer, where previously some of that work could be optimized out, but it's now calculated earlier so it's less clear to LLVM that it's unused.

I still don't feel great about this result.

.unwrap_or(EffectiveCapacity::Unknown);

// TODO: Replace with Option::zip when MSRV is 1.46
let htlc_maximum_msat = match (htlc_maximum_msat, capacity_msat) {
Collaborator:

IMO this 8-line branching list would be cleaner as a single 6-line match. Further, can we get the effective_capacity calculation into the same match?
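For context on the `// TODO: Replace with Option::zip` in the quoted snippet, the manual match below is the MSRV-compatible equivalent of `Option::zip`; this is a generic illustration, not the PR's actual code.

```rust
// The match form pairs two Options only when both are Some, exactly
// like Option::zip (stable since Rust 1.46).
fn zip_manually(a: Option<u64>, b: Option<u64>) -> Option<(u64, u64)> {
    match (a, b) {
        // Both present: pair them up.
        (Some(x), Some(y)) => Some((x, y)),
        // Either missing: the pair is unavailable.
        _ => None,
    }
}

fn main() {
    assert_eq!(zip_manually(Some(1), Some(2)), Some((1, 2)));
    assert_eq!(Some(1u64).zip(Some(2u64)), Some((1, 2)));
    assert_eq!(zip_manually(None, Some(2)), None);
    println!("ok");
}
```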

Contributor (Author):

Somewhat meh on how this looks. Seems a bit messier but may also have helped improve performance on generate_mpp_routes_with_default_scorer. Haven't verified that, though.

Collaborator:

Heh, I guess I'm just wired differently, this is infinitely more readable to me.

/// Returns the [`EffectiveCapacity`] of the channel in the direction.
///
/// This is either the total capacity from the funding transaction, if known, or the
/// `htlc_maximum_msat` for the direction as advertised by the gossip network, if known,
/// whichever is smaller.
/// whichever is larger.
Collaborator:

Note that we're back to smaller.

Contributor (Author):

Nope, EffectiveCapacity is still the larger of the two. So if a user has the capacity through chain::Access, we'll have a more accurate liquidity upper bound. The scorer will use EffectiveCapacity but the routing algorithm will still use AvailableLiquidity.

Collaborator:

I guess it's neither now? It just returns EffectiveCapacity::Total and ignores the htlc_max. I guess we should drop the "whichever is larger" line entirely?

Contributor (Author):

Oh, my mistake, you're right. Let me re-word.

@jkczyz jkczyz force-pushed the 2022-04-effective-capacity branch from 16e189c to 8e1aeb5 Compare May 4, 2022 05:47
@jkczyz (Contributor, Author) left a comment

Hmm, so with the current PR rebased on upstream git (compared to upstream git) on my (rather dated, but otherwise running 0 other processes) Xeon E5-2687Wv3, on upstream I get:

The variance on my machine seems much less, FWIW. The 3ms range is probably closer to 2.5ms but I've been rounding up for consistency. Throwing away outliers it's closer to a 2ms range or less when using 5 runs.

I still don't feel great about this result.

Anyhow, with the fixups I'm seeing similar results as before for generate_routes_with_probabilistic_scorer (48-50ms) and improved results for generate_mpp_routes_with_probabilistic_scorer (82-84ms).

@TheBlueMatt (Collaborator) commented May 5, 2022

The variance on my machine seems much less, FWIW. The 3ms range is probably closer to 2.5ms but I've been rounding up for consistency. Throwing away outliers it's closer to a 2ms range or less when using 5 runs.

Yea, I'm on a dual-CPU machine so it may be NUMA acting up; I tried the latest version on my laptop and it seems to be more consistent than my old NUMA workstation implies. That said, when I re-tried my previous suggestion of replacing the available_channel_liquidities map with a simple "liquidity_used_per_channel" map, things go wayyyy faster. Given you're swapping the type in it already, it seems like it'd be worth it to do now. I benchmarked the following patch and my probabilistic_scorer mpp runs went from ~60ms to ~50ms.

@@ -915,10 +908,7 @@ where L::Target: Logger {
                        // - for first and last hops early in get_route
                        if $src_node_id != $dest_node_id {
                                let short_channel_id = $candidate.short_channel_id();
-                               let available_liquidity = available_channel_liquidities
-                                       .entry((short_channel_id, $src_node_id < $dest_node_id))
-                                       .or_insert_with(|| $candidate.available_liquidity());
-                               let available_liquidity_msat = available_liquidity.as_msat();
+                               let initial_liq = $candidate.htlc_maximum_msat();
 
                                // It is tricky to subtract $next_hops_fee_msat from available liquidity here.
                                // It may be misleading because we might later choose to reduce the value transferred
@@ -927,7 +917,12 @@ where L::Target: Logger {
                                // fees caused by one expensive channel, but then this channel could have been used
                                // if the amount being transferred over this path is lower.
                                // We do this for now, but this is a subject for removal.
-                               if let Some(available_value_contribution_msat) = available_liquidity_msat.checked_sub($next_hops_fee_msat) {
+                               if let Some(mut available_value_contribution_msat) = initial_liq.checked_sub($next_hops_fee_msat) {
+                                       let used_liquidity_msat = if let Some(used) = available_channel_liquidities
+                                               .get(&(short_channel_id, $src_node_id < $dest_node_id)) {
+                                                       available_value_contribution_msat = available_value_contribution_msat.saturating_sub(*used);
+                                                       *used
+                                               } else { 0 };
 
                                        // Routing Fragmentation Mitigation heuristic:
                                        //
@@ -1075,7 +1070,7 @@ where L::Target: Logger {
 
                                                        let channel_usage = ChannelUsage {
                                                                amount_msat: amount_to_transfer_over_msat,
-                                                               inflight_htlc_msat: available_liquidity.in_use_liquidity_msat,
+                                                               inflight_htlc_msat: used_liquidity_msat,
                                                                effective_capacity: $candidate.effective_capacity(),
                                                        };
                                                        let channel_penalty_msat = scorer.channel_penalty_msat(
@@ -1461,11 +1456,9 @@ where L::Target: Logger {
                                for (prev_hop, (hop, _)) in prev_hop_iter.zip(payment_path.hops.iter()) {
                                        let spent_on_hop_msat = value_contribution_msat + hop.next_hops_fee_msat;
                                        let available_liquidity_msat = available_channel_liquidities
-                                               .get_mut(&(hop.candidate.short_channel_id(), *prev_hop < hop.node_id))
-                                               .unwrap()
-                                               .reduce_by(spent_on_hop_msat)
-                                               .as_msat();
-                                       if available_liquidity_msat == 0 {
+                                               .entry((hop.candidate.short_channel_id(), *prev_hop < hop.node_id))
+                                               .or_insert(spent_on_hop_msat);
+                                       if *available_liquidity_msat == hop.candidate.htlc_maximum_msat() {
                                                // If this path used all of this channel's available liquidity, we know
                                                // this path will not be selected again in the next loop iteration.
                                                prevented_redundant_path_selection = true;
@@ -1477,8 +1470,8 @@ where L::Target: Logger {
                                        // Decrease the available liquidity of a hop in the middle of the path.
                                        let victim_scid = payment_path.hops[(payment_path.hops.len()) / 2].0.candidate.short_channel_id();
                                        log_trace!(logger, "Disabling channel {} for future path building iterations to avoid duplicates.", victim_scid);
-                                       available_channel_liquidities.get_mut(&(victim_scid, false)).map(|l| l.exhaust());
-                                       available_channel_liquidities.get_mut(&(victim_scid, true)).map(|l| l.exhaust());
+					*available_channel_liquidities.entry((victim_scid, false)).or_insert(u64::max_value()) = u64::max_value();
+					*available_channel_liquidities.entry((victim_scid, true)).or_insert(u64::max_value()) = u64::max_value();
                                }
 
                                // Track the total amount all our collected paths allow to send so that we:

@jkczyz (Contributor, Author) commented May 5, 2022

I don't think the following part reproduces the same behavior. It only accounts for either used liquidity previously stored or spent_on_hop_msat. Don't we need to sum spent_on_hop_msat with what was stored (i.e., what reduce_by was doing)?

@@ -1461,11 +1456,9 @@ where L::Target: Logger {
                                for (prev_hop, (hop, _)) in prev_hop_iter.zip(payment_path.hops.iter()) {
                                        let spent_on_hop_msat = value_contribution_msat + hop.next_hops_fee_msat;
                                        let available_liquidity_msat = available_channel_liquidities
-                                               .get_mut(&(hop.candidate.short_channel_id(), *prev_hop < hop.node_id))
-                                               .unwrap()
-                                               .reduce_by(spent_on_hop_msat)
-                                               .as_msat();
-                                       if available_liquidity_msat == 0 {
+                                               .entry((hop.candidate.short_channel_id(), *prev_hop < hop.node_id))
+                                               .or_insert(spent_on_hop_msat);
+                                       if *available_liquidity_msat == hop.candidate.htlc_maximum_msat() {
                                                // If this path used all of this channel's available liquidity, we know
                                                // this path will not be selected again in the next loop iteration.
                                                prevented_redundant_path_selection = true;

@TheBlueMatt (Collaborator)

Oops, yea, you're right that hunk is slightly off. Shouldn't impact the performance at all, though, it's a cold path.

@jkczyz jkczyz force-pushed the 2022-04-effective-capacity branch from 8e1aeb5 to 6c7368e Compare May 5, 2022 13:16
Comment on lines 895 to 900
available_value_contribution_msat = available_value_contribution_msat
.saturating_sub(used_liquidity_msat);
Contributor (Author):

Is it ok that we continue after here when available_value_contribution_msat becomes zero?

Collaborator:

I believe so - it's no different from the subtraction in the previous line succeeding but returning 0.

Collaborator:

i.e., that contributes_sufficient_value will be false and we won't do anything later anyway.

Collaborator:

I'd still prefer we avoid the saturating_sub in the 0 case - 0 is by far the most common and we're already branching on whether we have an entry or not. Doubt it'll show up outside of noise given our benchmarks are so noisy, but the top of this macro is basically the hottest part of the whole router, so we should be as careful as we can.
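The suggestion can be sketched as follows; the function and map names here are illustrative stand-ins, not the router's actual identifiers.

```rust
use std::collections::HashMap;

// Sketch: branch on the map lookup so the common case (no used liquidity
// recorded for this channel) performs no arithmetic at all.
fn remaining_msat(used: &HashMap<u64, u64>, scid: u64, available_msat: u64) -> u64 {
    match used.get(&scid) {
        // Hot path: no entry means nothing used, so skip the subtraction.
        None => available_msat,
        // Entry present: subtract, saturating at zero.
        Some(&used_msat) => available_msat.saturating_sub(used_msat),
    }
}

fn main() {
    let mut used = HashMap::new();
    used.insert(42u64, 300u64);
    println!("{}", remaining_msat(&used, 42, 1_000)); // entry present
    println!("{}", remaining_msat(&used, 43, 1_000)); // hot path, untouched
}
```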

@jkczyz (Contributor, Author) commented May 5, 2022

@TheBlueMatt Care if I squash and rebase?

@TheBlueMatt (Collaborator)

Go for it!

@jkczyz jkczyz force-pushed the 2022-04-effective-capacity branch from 6c7368e to 37f7be0 Compare May 5, 2022 19:40
@@ -399,6 +399,16 @@ impl<'a> CandidateRouteHop<'a> {
}
}

fn htlc_maximum_msat(&self) -> u64 {
    match self {
        CandidateRouteHop::FirstHop { details } => details.outbound_capacity_msat,
Collaborator:

Given we use this for the channel's liquidity, shouldn't this be next_outbound_htlc_limit_msat?

Contributor (Author):

Ah, thanks for catching that. I had written this before that field existed and missed it on rebase.

// bookkept_channels_liquidity_available_msat, which will improve
// the further iterations of path finding. Also don't erase first_hop_targets.
// For every new path, start from scratch, except for used_channel_liquidities, which
// will improve the further iterations of path finding, and first_hop_targets.
Collaborator:

Heh, "will improve the further iterations of path finding" is...nonsense? I mean it was already there but I have no idea what it means. Maybe say something like "which will avoid re-using previously-selected paths in future iterations"?

/// The amount to send through the channel, denominated in millisatoshi.
pub amount_msat: u64,

/// Total amount, denominated in millisatoshi, already allocated to send through the channel.
Collaborator:

Maybe clarify that this means allocated during the current routing search, ie across MPP parts? inflight_htlc_msat makes me think LDK is being super clever and tracking which payments are pending over a channel across different routing attempts.


scorer.payment_path_failed(&[], 42);
assert_eq!(scorer.channel_penalty_msat(42, 1, 1, &source, &target), 1_064);
assert_eq!(scorer.channel_penalty_msat(42, &source, &target, usage), 1_064);
Collaborator:

I'm a bit confused about all these test changes - we went from a capacity of 1msat to a capacity of unknown and the scores don't change? I'm clearly missing something.

Contributor (Author):

This is testing the legacy Scorer configured with 0 for overuse_penalty_msat_per_1024th, so capacity doesn't affect the penalty.

@jkczyz jkczyz force-pushed the 2022-04-effective-capacity branch from 37f7be0 to 6400b71 Compare May 5, 2022 21:32
@jkczyz (Contributor, Author) commented May 5, 2022

Can't quite figure out why test_inconsistent_mpp_params is failing. Compared to before the change the first PathBuildingHop's value_contribution_msat is not reduced by the hop fees for some reason.

@TheBlueMatt (Collaborator)

If I fix #1456 (comment) and drop the last commit, tests pass fine locally.

@jkczyz jkczyz force-pushed the 2022-04-effective-capacity branch from cbc5e4f to ef574a9 Compare May 6, 2022 18:01
@jkczyz jkczyz added this to the 0.0.107 milestone May 16, 2022
.finish()
.field("cltv_expiry_delta", &self.candidate.cltv_expiry_delta());
#[cfg(all(not(feature = "_bench_unstable"), any(test, fuzzing)))]
let debug_struct = debug_struct
Contributor:

what's the purpose of shadowing here?

Contributor (Author):

To avoid needing conditional compilation for debug_struct.finish(), too.
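The pattern can be sketched with a runtime flag standing in for the compile-time `#[cfg(...)]` gate (the type and field names here are illustrative, not LDK's):

```rust
use std::fmt;

// Stand-in for the `#[cfg(...)]` gate in the PR; at compile time the extra
// rebinding line is simply present or absent.
const INCLUDE_FEE: bool = true;

struct Hop { scid: u64, fee_msat: u64 }

impl fmt::Debug for Hop {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut ds = f.debug_struct("Hop");
        let ds = ds.field("scid", &self.scid);
        // Shadowing rebinds `ds` whether or not the extra field is added,
        // so the single `.finish()` below needs no conditional compilation.
        let ds = if INCLUDE_FEE { ds.field("fee_msat", &self.fee_msat) } else { ds };
        ds.finish()
    }
}

fn main() {
    println!("{:?}", Hop { scid: 42, fee_msat: 1_000 });
}
```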

let mut bookkept_channels_liquidity_available_msat = HashMap::with_capacity(network_nodes.len());
// Keep track of how much liquidity has been used in selected channels. Used to determine
// if the channel can be used by additional MPP paths or to inform path finding decisions. It is
// aware of direction, but liquidity used in one direction will not offset any used in the
// opposite direction.
Contributor

if it's direction-aware, why would liquidity used in one direction offset that in the opposite? Or to put it simply, should the "but" be a "so?"

Contributor Author

"will not offset". It is directionally aware only in the sense that htlc_maximum_msat may be different for each direction. See the last sentence, but do let me know if you have a better wording.

Contributor

Yeah, I think saying it's not directionally aware sets the expectation that liquidity in one direction wouldn't offset the other direction, which is, in fact, the case. So a "so" or "hence" or something along these lines, or perhaps even "…aware of direction: liquidity used in one…" Just throwing out ideas.
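The bookkeeping being discussed might be pictured as a map keyed by channel and direction, where spending in one direction leaves the opposite direction's entry untouched (the key shape here is an assumption, not LDK's actual representation):

```rust
use std::collections::HashMap;

fn main() {
    // (short_channel_id, direction) -> liquidity used so far in this search
    let mut used_liquidity_msat: HashMap<(u64, bool), u64> = HashMap::new();
    let scid = 42u64;
    *used_liquidity_msat.entry((scid, true)).or_insert(0) += 1_000;
    *used_liquidity_msat.entry((scid, false)).or_insert(0) += 250;
    // Liquidity used in one direction does not offset the other direction.
    assert_eq!(used_liquidity_msat[&(scid, true)], 1_000);
    assert_eq!(used_liquidity_msat[&(scid, false)], 250);
}
```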

if spent_on_hop_msat == *channel_liquidity_available_msat {
let prev_hop_iter = core::iter::once(&our_node_id)
.chain(payment_path.hops.iter().map(|(hop, _)| &hop.node_id));
for (prev_hop, (hop, _)) in prev_hop_iter.zip(payment_path.hops.iter()) {
Contributor

for the sake of consistency, if we have previous_hop, the other one should probably be current_hop.

Contributor Author

Hmm... prev_hop is named relative to hop. I can see either way but lean towards the shorter version for loop and closure variables.

}
if !prevented_redundant_path_selection {
// If we weren't capped by hitting a liquidity limit on a channel in the path,
// we'll probably end up picking the same path again on the next iteration.
// Decrease the available liquidity of a hop in the middle of the path.
let victim_scid = payment_path.hops[(payment_path.hops.len()) / 2].0.candidate.short_channel_id();
let exhausted = u64::max_value();
Collaborator

nit: I find this kind of thing less readable than the alternative - we don't need to define a constant for a constant, not that its a big deal, though.

Contributor Author

I was inclined to keep the long lines given the rest of the file is already over 100 characters. Figured using a variable would save two lines from wrapping and is a bit more readable. Don't have a strong opinion.

@@ -1242,9 +1267,16 @@ where L::Target: Logger {
short_channel_id: hop.short_channel_id,
})
.unwrap_or_else(|| CandidateRouteHop::PrivateHop { hint: hop });
let capacity_msat = candidate.effective_capacity().as_msat();
let channel_usage = ChannelUsage {
amount_msat: final_value_msat,
Collaborator

Shouldn't this be - aggregate_next_hops_fee?

Contributor Author

Huh, upstream is using final_value_msat. Looks like this was always that way?

8dc7cfa#diff-5ef67b390b7a9555f9e8dd68104b63b838b886bfdb1c0fe25f242a9c36b52d06R1124

Wouldn't doing so result in passing 0 to amount_msat for the first iteration, though? This is for the route hints, so isn't final_value_msat accurate for at least the first iteration (reverse order)? Should aggregate_next_hops_fee_msat be initialized to final_value_msat? Though that wouldn't be right for the hops_fee calculation below. Feel like I'm missing something here.

Collaborator

Oops, sorry, should be `+ aggregate_next_hops_fee`, the same as we do for any other scoring call. The point here is that there are sometimes multiple hops in the last-hop hints. It's not really last-hop hints so much as last-hops hints. In the inner loop here, aggregate_next_hops_fee is always 0 on the first score call, so no change, but if we have multiple hops to get to the destination, we may have to pay a fee to the later hops.

Contributor Author

Ah, actually missed the minus sign on your first comment. But, yeah, all makes sense now. Think it should be updated accordingly, IIUC.
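The agreed-upon behavior can be sketched as a reverse walk over a multi-hop route hint, where each hop is scored on the final value plus the fees owed to all later hops (names and the fee model are illustrative, not LDK's actual code):

```rust
struct HintHop { fee_base_msat: u64, fee_proportional_millionths: u64 }

fn hop_fee_msat(hop: &HintHop, amount_msat: u64) -> u64 {
    hop.fee_base_msat + amount_msat * hop.fee_proportional_millionths / 1_000_000
}

// Returns the amount each hint hop should be scored with, walking from the
// destination backwards: the hop closest to the destination sees
// `final_value_msat` (the aggregate fee is still 0), earlier hops see the
// final value plus fees owed to all later hops.
fn amounts_to_score(hops: &[HintHop], final_value_msat: u64) -> Vec<u64> {
    let mut aggregate_next_hops_fee_msat = 0u64;
    let mut amounts = Vec::new();
    for hop in hops.iter().rev() {
        let amount_msat = final_value_msat + aggregate_next_hops_fee_msat;
        amounts.push(amount_msat);
        aggregate_next_hops_fee_msat += hop_fee_msat(hop, amount_msat);
    }
    amounts
}

fn main() {
    let hops = [
        HintHop { fee_base_msat: 1_000, fee_proportional_millionths: 0 },
        HintHop { fee_base_msat: 2_000, fee_proportional_millionths: 0 },
    ];
    // Destination-side hop is scored at 100_000 msat; the earlier hop must
    // also cover the later hop's 2_000 msat fee.
    assert_eq!(amounts_to_score(&hops, 100_000), vec![100_000, 102_000]);
}
```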

let capacity_msat = candidate.effective_capacity().as_msat();
let channel_usage = ChannelUsage {
amount_msat: final_value_msat,
inflight_htlc_msat: 0,
Collaborator

Shouldn't we look this up? We could have used this last-hop in a previous scan?

Contributor Author

Hmm... I'm a little confused about this more in general, actually. Why do we call the scorer here with this hop and then do so again with the same hop passed to add_entry! below?

Collaborator

We do it so that we can track the aggregate penalty of this path and add it to later calls. It is redundant, yes, but we have to pass the aggregate_next_hops_path_penalty_msat in to add_entry on the next call. That said, I think the order is wrong; I believe we should be doing `add_entry` above this line, not below.

Contributor Author

Done.

@jkczyz jkczyz force-pushed the 2022-04-effective-capacity branch from 28548c4 to be7c875 Compare May 17, 2022 22:48
TheBlueMatt
TheBlueMatt previously approved these changes May 17, 2022
@TheBlueMatt
Collaborator

Looks like this needs rebase now :(

@jkczyz jkczyz force-pushed the 2022-04-effective-capacity branch from 0c87dda to f259daf Compare May 19, 2022 19:09
TheBlueMatt
TheBlueMatt previously approved these changes May 19, 2022
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Feel free to squash!

jkczyz added 5 commits May 19, 2022 14:25
Using EffectiveCapacity in scoring gives more accurate success
probabilities when the maximum HTLC value is less than the channel
capacity. Change EffectiveCapacity to prefer the channel's capacity
over its maximum HTLC limit, but still use the latter for route finding.
When scoring route hints, the amount passed to the scorer should include
any fees needed for subsequent hops. This worked correctly for single-hop
hints since there are no further hops, but not for hops within multi-hop
hints (except the final one).
For route hints, the aggregate next hops path penalty and CLTV delta
should be computed after considering each hop rather than before.
Otherwise, these aggregate values will include values from the current
hop, too.
Scorers could benefit from having the channel's EffectiveCapacity rather
than a u64 msat value. For instance, ProbabilisticScorer can give a more
accurate penalty when given the ExactLiquidity variant. Pass a struct
wrapping the effective capacity, the proposed amount, and any in-flight
HTLC value.
For direct channels, the channel liquidity is known with certainty. Use
this knowledge in ProbabilisticScorer by either penalizing with the
per-hop penalty or u64::max_value depending on the amount.
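Putting the last two commit messages together, the new scoring interface might be sketched as follows. `ChannelUsage` and `EffectiveCapacity::ExactLiquidity` are named in the PR; the penalty constant and the handling of other variants are illustrative only:

```rust
enum EffectiveCapacity {
    // For direct channels the liquidity is known with certainty.
    ExactLiquidity { liquidity_msat: u64 },
    Unknown,
}

struct ChannelUsage {
    amount_msat: u64,
    inflight_htlc_msat: u64,
    effective_capacity: EffectiveCapacity,
}

const BASE_PENALTY_MSAT: u64 = 500; // illustrative per-hop penalty

fn channel_penalty_msat(usage: &ChannelUsage) -> u64 {
    match usage.effective_capacity {
        // Known liquidity: the payment either fits (charge only the
        // per-hop penalty) or cannot succeed (make the hop unusable).
        EffectiveCapacity::ExactLiquidity { liquidity_msat } => {
            if usage.amount_msat + usage.inflight_htlc_msat <= liquidity_msat {
                BASE_PENALTY_MSAT
            } else {
                u64::max_value()
            }
        }
        // The real ProbabilisticScorer models success probability here.
        EffectiveCapacity::Unknown => BASE_PENALTY_MSAT,
    }
}

fn main() {
    let fits = ChannelUsage {
        amount_msat: 1_000,
        inflight_htlc_msat: 0,
        effective_capacity: EffectiveCapacity::ExactLiquidity { liquidity_msat: 2_000 },
    };
    assert_eq!(channel_penalty_msat(&fits), 500);

    let too_big = ChannelUsage { amount_msat: 3_000, ..fits };
    assert_eq!(channel_penalty_msat(&too_big), u64::max_value());
}
```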
@arik-so
Contributor

arik-so commented May 20, 2022

Looks good to me; I'm ready for the squashing.

@jkczyz
Contributor Author

jkczyz commented May 20, 2022

Looks good to me; I'm ready for the squashing.

Already did!

Contributor

@arik-so arik-so left a comment

LGTM

@arik-so arik-so merged commit 0c6974b into lightningdevkit:main May 20, 2022
Successfully merging this pull request may close these issues.

Pass EffectiveCapacity Through to Scorers