Track Channel Liquidity History in ProbabilisticScorer #1625
Conversation
Force-pushed from 10921f9 to ce5bc08.
Tracking of historical data seems like a great idea. IMO we probably need a time component to avoid making our scoring results too dependent on the current/past throughput. Also, while the taken approach seems very efficient, it could get tricky to reason about and could be limiting in the future. I'd therefore argue we should plan ahead and take measures to make updating the parameters (e.g., the number of buckets and the bucket size) easy. So far I mainly had a look at the first commit, so here are some comments regarding it.
lightning/src/routing/scoring.rs
Outdated
// Each time we update our liquidity estimate, we add 32 to the buckets for the current
// min and max liquidity offset positions.
//
// We then decay each bucket by multiplying by 2047/2048. This ensures we can't actually
You're describing here what you are doing with these magic numbers, but could you also go into more detail regarding why they are a good choice? E.g., how did you choose the
Yea, I added a note, but they're just kinda picked by playing with different options and seeing what works...
lightning/src/routing/scoring.rs
Outdated
@@ -750,6 +768,41 @@ impl<L: DerefMut<Target = u64>, T: Time, U: DerefMut<Target = T>> DirectedChanne
    self.set_max_liquidity_msat(max_liquidity_msat);
}

fn update_history_buckets(&mut self) {
    // We have 8 leaky buckets for min and max liquidity. Each bucket tracks the amount of time
Since this is called on every failed/successful payment or probe attempt, it actually doesn't seem to track the amount of time spent in each bucket, but simply a (skewed) average over the last X datapoints? Maybe this method could/should be called on a regular (real-time) basis to provide i.i.d. samples?
Right, I'm a bit torn on it - I imagine there's quite a large distribution in the number of times we test a channel. If we've only ever tried to send one HTLC over a channel, do we want to do something time-based where we think that this channel is always in a given state, or do we want to leave it as a single datapoint?
Ah, needs rebase now that #1617 was merged.
8000? I thought it was 8?
We use 8 buckets, but the exponential decay is currently set so that all current data is only completely forgotten after 8000 other datapoints (though a single individual datapoint is forgotten pretty quickly). ie if we see the liquidity of a channel in the same place 10,000 times in a row it will take 8000 data points to forget that, but if we see it only once, we will decay down to zero after only a few datapoints.
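To make that concrete, here is a small illustrative simulation (not code from the PR) of the add-32-then-decay-by-2047/2048 rule described in the diff comment above. A bucket holding a single datapoint empties after a few dozen updates, while a bucket that has been reinforced many times takes thousands of updates to drain, in the same ballpark as the ~8,000 figure mentioned here.

```rust
// Illustrative simulation only: how long a bucket value survives under
// repeated 2047/2048 decays, for a single datapoint (32) vs. a bucket that
// has saturated near its u16 maximum after many repeated observations.
fn decay(bucket: u16) -> u16 {
    ((bucket as u32) * 2047 / 2048) as u16
}

fn updates_until_forgotten(mut bucket: u16) -> u32 {
    let mut updates = 0;
    while bucket > 0 {
        bucket = decay(bucket);
        updates += 1;
    }
    updates
}

fn main() {
    // One observation: the bucket holds a single "point" (32).
    println!("single datapoint forgotten after {} updates", updates_until_forgotten(32));
    // Many repeated observations: the bucket has saturated near u16::MAX.
    println!("saturated bucket forgotten after {} updates", updates_until_forgotten(u16::MAX));
}
```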
Force-pushed from ce5bc08 to 09a1788.
I wonder if the historical buckets will adapt to rapid changes in network congestion, quickly downgrading the reliability of channel liquidity estimates rather than reflecting a stale view. There is also the question of whether we could clear the buckets based on received gossip or BOLT4 onion message receptions, i.e. can network anomalies falsify the historical tracking? We might also discount new channels displaying compelling routing conditions, so there might be a bonus to introduce based on channel age.
Overall, ACK on the direction, just want to think through the bounds of this approach.
// In total, this allows us to track data for the last 8,000 or so payments across a given
// channel.
//
// These constants are a balance - we try to fit in 2 bytes per bucket to reduce overhead,
So that's 32 new bytes of storage per channel (`min_liquidity_offset_history` / `max_liquidity_offset_history`). If we account for the ~85k public channels, that's ~2.7 MB of storage required if the liquidity history feature is enabled.
Generally, do we have a storage/performance budget for low-grade mobile nodes, or are you thinking this feature only makes sense for high-payment-volume nodes with pretty generous resources?
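For reference, the arithmetic behind that estimate (a sketch assuming two arrays of eight u16 buckets per channel and roughly 85,000 public channels):

```rust
// Back-of-the-envelope check of the storage figure above.
fn main() {
    let bytes_per_channel = 2 /* arrays */ * 8 /* buckets */ * core::mem::size_of::<u16>();
    let public_channels = 85_000usize;
    let total_bytes = bytes_per_channel * public_channels;
    println!(
        "{} bytes per channel, ~{:.1} MB across the graph",
        bytes_per_channel,
        total_bytes as f64 / 1_000_000.0
    ); // prints: 32 bytes per channel, ~2.7 MB across the graph
}
```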
I don't think we currently have a storage budget, no, but 3MB is much less than the full graph in-memory anyway. Ultimately I think we can probably assume we can safely use 200-300MB on most systems, even most phones these days ship with 1-2GB of RAM, more for higher-end devices.
Yeah, sounds like we have plenty of headroom. Though it would be nice to improve our user documentation about features that significantly affect resource usage (maybe this one doesn't qualify). Tracked in an issue: #1641.
It would also help to guide developers toward a clear view of a Lightning node's resource usage before proposing a new PR/feature.
Yea, probably need to run some numbers here and debate the constants selected, but ultimately I hope it generally doesn't. The point of having historical tracking is to avoid looking at the current state of affairs; we already have the explicit min/max tracking for that. The big question is gonna be how we balance the penalties between the two.
Hmm, not quite sure what your question is here? Currently we only update scoring data based on payment/probe success/failure.
Oh, I like that. Feels like maybe it could be a separate PR/scoring penalty, though, no?
Okay, so it's more about whether the min/max tracking itself stays accurate in the face of network anomalies. It might still be interesting in the future to have a heuristic module to capture such events and apply a decaying malus to the liquidity estimates.
So I've reviewed the probabilistic scorer with the idea that it incentivizes node operators to maintain an equilibrium of their inbound/outbound liquidity by being proactive in their liquidity management. I wonder (broadly) whether new data received on our side, like gossip, could be a hint of a change in their liquidity policy that we should react to.
Yep, better to advance piece by piece, once we're good with the historical tracking model.
Force-pushed from 8f63b9f to 116e7b9.
Added a no-learning decay with a default half-life of two weeks. I believe this is now ready for review, modulo the commit message that needs filling out.
Codecov Report
Base: 90.77% // Head: 90.72% // Decreases project coverage by -0.06%.
Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #1625      +/-   ##
==========================================
- Coverage   90.77%   90.72%    -0.06%
==========================================
  Files          80       87        +7
  Lines       44763    46843     +2080
  Branches    44763    46843     +2080
==========================================
+ Hits        40632    42496     +1864
- Misses       4131     4347      +216
First pass with mostly a lot of suggestions trying to give more explanation/rationale in the comments. Will do another pass focusing on the bucket walking tomorrow.
///
/// Note that 16 or more half lives guarantees that all historical data will be wiped.
///
/// Default value: 14 days
Waiting two weeks before considering data to be out-of-date seems a long time to me, given how fast network dynamics might change.
I disagree; the point of the historical tracker is to track history, and to function well when we have limited data for a channel. Data from a month ago is useless in some cases, but in many cases a node that was a sink six months ago is still a sink today. Having some data is better than none. We could consider a different decay model where we decrease the penalty as the data ages rather than decaying the data itself, but I'm not sure that's really cleaner.
Yes, all I was saying is that starting to decay data after two weeks seems like a long time to me, as we may not adapt fast enough if something changes. But, as we don't have any ground truth here, it's hard to argue either way.
Just as a note: what would probably work better than the current model of decaying everything at the same rate would be to account for the velocity of behavior changes in the past, e.g., by keeping track of how often our expectation changes on a per-node basis and decaying faster the higher the deviation is. However, this would probably require keeping more data and would further introduce the problem of how to translate deviation into a corresponding delay. So, we're currently probably better off with the chosen approach.
Hmm, right, so the thinking on the decay here is that we really don't want to ever care about/use the decay here at all, but instead should rely on new data to decay out previous data. Of course if we haven't seen anything in six months we want to forget everything, but if it's not been too long, we should rely on new data to slowly decay the old data, not the timer.
Just as a note: What would probably work better than the current model of decaying all at the same rate would be to account for the velocity of behavior changes in the past
Hmm, clever. I think we can do that pretty well with this bucket tracker, actually, though of course seems like something for a followup. We can determine how much the channel has been changing by calculating what % of our min,max bucket pairs are "impossible" - ie how many of our datapoints are in bucket pairs where the min capacity is larger than the max capacity.
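A rough sketch of that follow-up idea, assuming the bucket layout discussed in this PR (min buckets counted up from the bottom of the liquidity range, max buckets counted down from the top, so a (min, max) pair is contradictory once min_idx + max_idx >= 8); the function name and signature here are hypothetical:

```rust
// Estimate how "unstable" a channel's liquidity has been by measuring what
// fraction of the tracked (min, max) bucket-pair weight is impossible, i.e.
// implies a minimum liquidity above the maximum.
fn impossible_fraction(min_buckets: &[u16; 8], max_buckets: &[u16; 8]) -> f64 {
    let mut total = 0u64;
    let mut impossible = 0u64;
    for (min_idx, min_bucket) in min_buckets.iter().enumerate() {
        for (max_idx, max_bucket) in max_buckets.iter().enumerate() {
            let weight = (*min_bucket as u64) * (*max_bucket as u64);
            total += weight;
            if min_idx + max_idx >= 8 {
                impossible += weight;
            }
        }
    }
    if total == 0 { 0.0 } else { impossible as f64 / total as f64 }
}

fn main() {
    // Toy example: most weight is in consistent pairs, some is contradictory.
    let min = [40u16, 10, 0, 0, 0, 0, 0, 20];
    let max = [30u16, 0, 0, 0, 0, 0, 0, 25];
    println!("impossible fraction: {:.2}", impossible_fraction(&min, &max));
}
```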
Force-pushed from 04d0709 to 4775973.
lightning/src/routing/scoring.rs
Outdated
// there are none, treat it as if we had no data.
let mut is_fully_decayed = true;
for (min_idx, min_bucket) in self.min_liquidity_offset_history.buckets.iter().enumerate() {
    if min_bucket.checked_shr(decays).unwrap_or(0) > 0 { is_fully_decayed = false; }
Even though it's basically just this `checked_shr().unwrap_or(0)` call, could we DRY the decay logic everywhere into a dedicated function, just to make it clear what is happening?
Into what? It's more code to write a function to do `.checked_shr().unwrap_or()`. I'm not quite sure I'm clear on what the ask is here.
Yes, it's more code, but then every instance of `checked_shr().unwrap_or(0)` could be replaced by something much more readable. It could even be worth making a newtype providing `decay()` and `is_decayed()` methods to increase readability?
I don't think moving code away from where it's being used makes anything more readable. Quite the opposite: obscuring what's going on by moving one line behind a newtype barrier makes things substantially more confusing IMO, since now you have to jump to an unrelated part of the file to figure out what's being done when trying to read a single line. I did move the check into a local lambda, though, since it's used in the same place.
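For illustration, a self-contained sketch of that local-lambda approach (hypothetical free-function signature; the real code works on the DirectedChannelLiquidity fields directly):

```rust
// Sketch: keep the decay expression next to its use, but name it once per
// scope so the intent ("has everything fully decayed?") reads clearly.
fn fully_decayed(min_buckets: &[u16; 8], max_buckets: &[u16; 8], decays: u32) -> bool {
    // checked_shr returns None when the shift exceeds the bit width; treat
    // that as a value that has decayed all the way to zero.
    let decayed = |bucket: u16| bucket.checked_shr(decays).unwrap_or(0);
    min_buckets.iter().chain(max_buckets.iter()).all(|b| decayed(*b) == 0)
}
```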
lightning/src/routing/scoring.rs
Outdated
let payment_amt_64th_bucket = amount_msat * 64 / self.capacity_msat;
debug_assert!(payment_amt_64th_bucket <= 64);
if payment_amt_64th_bucket > 64 { return res; }
Why the combination of assert `<=` and return if `>`?
The first is a debug assertion, the second is what happens in prod.
Yes, I just wasn't clear on what we can expect the value to be. Will it always be `<= 64`, or won't it? What would be the condition that would make us return here in the prod case?
Sorry if this is obvious, but I currently still have a hard time wrapping my head around this `64th_bucket` concept. Is it just a translation of the individual buckets into a kind of continuous `u64` space in order to conduct calculations without loss of precision?
Yea, it should always be `<= 64`; I just didn't want to assert that/rely on it in production, because I'm not 100% confident there are no cases where a channel's value changed out from under us and we end up trying to forward an amount greater than the capacity. I'm probably wrong, but didn't want to assert it anyway.
As for the 64th_bucket thing, yea, it's definitely confusing, I don't disagree. It's less about loss of precision and more that it allows us to move where the bucket "is": we have these buckets we store points in, but a bucket spans 1/8th of the total continuous space. As the top comment describes, we really want to compare the amount to different points within each bucket depending on which bucket we're in, so we map buckets into 1/64th of the continuous space and do that comparison by a point within the 64th space. We could instead compare strictly with the edges of the buckets (ie treat "min" as the bottom edge of the bucket and "max" as the top edge), but it's not clear to me that that's any better, and it probably does overestimate probability at least somewhat within each bucket.
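To illustrate the 64th-space idea, here is a minimal sketch; the specific interior comparison point chosen below (one 64th past the bucket's lower edge) is an assumption for the example, not necessarily what the PR settles on:

```rust
// Map amounts and bucket positions into a common 1/64th grid so a payment can
// be compared against a point *inside* each 1/8th bucket rather than its edges.
fn amount_to_64ths(amount_msat: u64, capacity_msat: u64) -> u64 {
    amount_msat * 64 / capacity_msat
}

fn min_bucket_point_64ths(min_bucket_idx: u64) -> u64 {
    // Each 1/8th bucket spans 8/64ths; nudge the comparison point just inside
    // the bucket's lower edge so small payments still pick up some penalty.
    min_bucket_idx * 8 + 1
}

fn main() {
    let capacity_msat = 1_000_000;
    let amount_msat = 200_000; // 1/5th of the channel
    let amt_64ths = amount_to_64ths(amount_msat, capacity_msat); // 12
    // A min-liquidity datapoint in bucket 2 implies a comparison point of
    // 17/64ths, so this amount is treated as below the channel's minimum there.
    assert!(amt_64ths <= min_bucket_point_64ths(2));
}
```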
Ah, thank you for the clarification. The added comment regarding each bucket moving the comparison point forward finally made me understand the 64th bucket approach. I understand the issue regarding the interval borders but wow this is kinda unintuitive.
Yea, it was unintuitive to me until I wrote tests that weren't working right 😂
Force-pushed from 9896f67 to d4365a4.
Force-pushed from d4365a4 to 3dd53a8.
Some more minor comments. I think this would be ready for a round with a second reviewer (@jkczyz).
Force-pushed from 400faf3 to 0c00089.
Squashed the old fixups because there were way too many, and added one new one to rename some vars.
Force-pushed from 0c00089 to 55eb934.
Force-pushed from 55eb934 to 6c7f568.
Addressed all feedback and (finally) wrote a full commit message for the new penalty commit. I think this is ready to land, pending squash.
I think I almost get the penalty calculation lol
@@ -750,6 +810,21 @@ impl<L: DerefMut<Target = u64>, T: Time, U: DerefMut<Target = T>> DirectedChanne
    self.set_max_liquidity_msat(max_liquidity_msat);
}

fn update_history_buckets(&mut self) {
    debug_assert!(*self.min_liquidity_offset_msat <= self.capacity_msat);
    self.min_liquidity_offset_history.track_datapoint(
It would seem cleaner to me to pass in the `min_liquidity_offset` raw and do the bucket calculation inside `track_datapoint` (similar for max_liquidity).
The bucket tracker doesn't know what the capacity is, either, though. I guess we could pass both the offset and the capacity in, but that, too, feels like a layer violation?
That doesn't feel like as much of a layer violation to me (since the name of the var is `min_liquidity_offset_history`), but I don't feel strongly.
lightning/src/routing/scoring.rs
Outdated
// If we used the middle of each bucket we'd never assign any penalty at all when
// sending less than 1/16th of a channel's capacity, or 1/8th if we used the top of the
// bucket.
let mut total_valid_points_tracked = 0;
Could invalid points be tracked? Trying to understand the use of "valid" here, it looks like we sum all points without verifying them
There can be points in, for example, both the max-0 and min-1 buckets, but obviously we shouldn't consider those sets of points together, as that implies a max below the min, which is nonsense. It's mentioned at the top of the comment; let me know how it could be made more clear.
lightning/src/routing/scoring.rs
Outdated
let mut total_valid_points_tracked = 0;
for (min_idx, min_bucket) in self.min_liquidity_offset_history.buckets.iter().enumerate() {
    for max_bucket in self.max_liquidity_offset_history.buckets.iter().take(8 - min_idx) {
        total_valid_points_tracked += (*min_bucket as u64) * (*max_bucket as u64);
Why are we multiplying instead of summing? I thought each bucket was a total amount of points (mod decay)
We have to look at (min, max) bucket pairs, not individual buckets. For each pair, we calculate the probability of being in that state by multiplying the buckets together, then look at the buckets and the amount to determine the success probability if we're in that bucket pair.
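Roughly, the calculation being described, as a floating-point sketch; the bucket-edge interpretation of each pair's liquidity range is an assumption for readability, whereas the PR's code works in integer 1/64th units with interior comparison points as discussed above:

```rust
// Weighted-average success probability across all valid (min, max) bucket
// pairs. Each pair is weighted by the product of its two counts; for each
// pair we derive a success probability from where the amount falls in the
// implied liquidity range, then average.
fn historical_success_probability(
    min_buckets: &[u16; 8], max_buckets: &[u16; 8],
    amount_msat: u64, capacity_msat: u64,
) -> Option<f64> {
    let mut total_weight = 0u64;
    let mut weighted_prob = 0f64;
    for (min_idx, min_bucket) in min_buckets.iter().enumerate() {
        // Only pairs implying min liquidity <= max liquidity are considered.
        for (max_idx, max_bucket) in max_buckets.iter().enumerate().take(8 - min_idx) {
            let weight = (*min_bucket as u64) * (*max_bucket as u64);
            if weight == 0 { continue; }
            // Interpret the pair as a liquidity range; max buckets count down
            // from the top of the channel.
            let min_liq = capacity_msat * min_idx as u64 / 8;
            let max_liq = capacity_msat * (8 - max_idx as u64) / 8;
            let prob = if amount_msat <= min_liq {
                1.0
            } else if amount_msat >= max_liq {
                0.0
            } else {
                (max_liq - amount_msat) as f64 / (max_liq - min_liq) as f64
            };
            total_weight += weight;
            weighted_prob += weight as f64 * prob;
        }
    }
    if total_weight == 0 { None } else { Some(weighted_prob / total_weight as f64) }
}
```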
let required_decays = self.now.duration_since(*self.last_updated).as_secs()
    .checked_div(params.historical_no_updates_half_life.as_secs())
I'm not sure how `no-std` works here, will it never decay?
Correct. We'll probably change that in #1752
Force-pushed from 6c7f568 to 3dce6f8.
LGTM, feel free to squash.
This introduces two new fields to the `ChannelLiquidity` struct - `min_liquidity_offset_history` and `max_liquidity_offset_history`, both arrays of 8 `u16`s. Each entry represents the proportion of time that we spent with the min or max liquidity offset in the given 1/8th of the channel's liquidity range, i.e. the first bucket in `min_liquidity_offset_history` represents the proportion of time we've thought the channel's minimum liquidity is lower than 1/8th of the channel's capacity. Each bucket is stored, effectively, as a fixed-point number with 5 bits for the fractional part, which is incremented by one (i.e. 32 in the raw representation) each time we update our liquidity estimates and decide our estimates are in that bucket. We then decay each bucket by 2047/2048. Thus, memory of a payment sticks around for more than 8,000 data points, though the majority of that memory decays after 1,387 data points.
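A condensed sketch of the tracker this describes. For self-containment the sketch uses the reviewer-suggested signature that takes the raw offset plus the capacity (the PR itself passes a pre-computed bucket index into `track_datapoint`):

```rust
// Eight u16 buckets in a 5-fractional-bit fixed-point representation: the
// bucket the current estimate falls in is bumped by 32 (i.e. 1.0), then every
// bucket is decayed by 2047/2048.
struct HistoricalBucketRangeTracker {
    buckets: [u16; 8],
}

impl HistoricalBucketRangeTracker {
    fn new() -> Self { Self { buckets: [0; 8] } }

    fn track_datapoint(&mut self, liquidity_offset_msat: u64, capacity_msat: u64) {
        debug_assert!(liquidity_offset_msat <= capacity_msat);
        // Map the offset into one of 8 equal buckets, clamping to the top bucket.
        let bucket_idx = ((liquidity_offset_msat * 8 / capacity_msat) as usize).min(7);
        // Add one point to that bucket, then decay every bucket.
        self.buckets[bucket_idx] = self.buckets[bucket_idx].saturating_add(32);
        for bucket in self.buckets.iter_mut() {
            *bucket = ((*bucket as u32) * 2047 / 2048) as u16;
        }
    }
}
```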
Our current `ProbabilisticScorer` attempts to build a model of the current liquidity across the payment channel network. This works fine to ignore channels we *just* tried to pay through, but it fails to remember patterns over longer time horizons. Specifically, there are *many* channels within the network that are very often either fully saturated in one direction, or are regularly rebalanced and rarely saturated. While our model may discover that, when it decays its offsets or if there is a temporary change in liquidity, it effectively forgets the "normal" state of the channel. This causes substantially suboptimal routing in practice, and avoiding discarding older knowledge when new datapoints come in is a potential solution to this. Here, we implement one such design, using the decaying buckets added in the previous commit to calculate a probability of payment success based on a weighted average of recent liquidity estimates for a channel. For each min/max liquidity bucket pair (where the min liquidity is less than the max liquidity), we can calculate the probability that a payment succeeds using our traditional `amount / capacity` formula. From there, we weigh the probability by the number of points in each bucket pair, calculating a total probability for the payment, and assigning a penalty using the same log-probability calculation used for the non-historical penalties.
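The last step, converting that weighted probability into a msat penalty, follows the same negative-log shape as the existing liquidity penalty. A floating-point sketch; the multiplier name here is a placeholder, and the real scorer uses a fixed-point base-10 log approximation:

```rust
// Sketch: penalty = -log10(success probability) scaled by a multiplier.
fn historical_penalty_msat(success_probability: f64, multiplier_msat: u64) -> u64 {
    if success_probability >= 1.0 { return 0; }
    if success_probability <= 0.0 {
        // Certain failure: effectively unbounded; callers would cap this.
        return u64::MAX;
    }
    (-success_probability.log10() * multiplier_msat as f64) as u64
}
```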
This simplifies adding additional half lives in DirectedChannelLiquidity by storing a reference to the full parameter set rather than only the single current half-life.
To avoid scoring based on incredibly old historical liquidity data, we add a new half-life here which is used to (very slowly) decay historical liquidity tracking buckets.
Force-pushed from 3dce6f8 to c8fb859.
Squashed without further changes.
// Ensure the bucket index we pass is in the range [0, 7], even if the liquidity offset
// is zero or the channel's capacity, though the second should generally never happen.
(self.min_liquidity_offset_msat.saturating_sub(1) * 8 / self.capacity_msat)
    .try_into().unwrap_or(32)); // 32 is bogus for 8 buckets, and will be ignored
// Ensure the bucket index we pass is in the range [0, 7], even if the liquidity offset
// is zero or the channel's capacity, though the second should generally never happen.
(self.max_liquidity_offset_msat.saturating_sub(1) * 8 / self.capacity_msat)
    .try_into().unwrap_or(32)); // 32 is bogus for 8 buckets, and will be ignored
Based on #1617, this adds two new commits which serve to track each channel's liquidity history in 8 min and 8 max (kinda) exponentially decaying buckets. I'm really curious to see how this performs, and anticipate it will be a useful part of the overall scorer, but:
There are a number of constants in here which I thought about, but not that hard; possible conversation topics around them include: