Better Encapsulate Scoring and Cache Point Count #3188
Conversation
Codecov Report
Attention: Patch coverage is …

Additional details and impacted files

@@            Coverage Diff             @@
##              main     #3188      +/-   ##
============================================
+ Coverage    89.68%    89.70%    +0.01%
============================================
  Files          124       125        +1
  Lines       102386    102458       +72
  Branches    102386    102458       +72
============================================
+ Hits         91827     91912       +85
+ Misses        7857      7843       -14
- Partials      2702      2703        +1

☔ View full report in Codecov by Sentry.
Force-pushed from 524caf3 to 74bee94 (Compare)
@@ -0,0 +1,499 @@
//! Approximation of log_10 using a lookup table.
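For context on what the new file's doc comment is describing, here is a minimal sketch of a lookup-table log10 approximation. The idea is simply that log10(x) = log2(x) * log10(2): the leading-bit position gives the integer part of log2(x), and a small table of log2(1 + i/16) refines the fractional part. The actual file added in this PR is roughly 500 lines and far more precise; the constants, names, and table size below are illustrative assumptions, not the PR's code.

```rust
/// log10(2), used to convert a base-2 log into a base-10 log.
const LOG10_2: f64 = 0.301029995663981195;

/// Precomputed log2(1 + i/16) for i in 0..16, refining the fractional part.
const FRAC_LOG2_TABLE: [f64; 16] = [
	0.000000, 0.087463, 0.169925, 0.247928, 0.321928, 0.392317, 0.459432, 0.523562,
	0.584963, 0.643856, 0.700440, 0.754888, 0.807355, 0.857981, 0.906891, 0.954196,
];

/// Approximates log10(x) for a u64 (returns 0.0 for x == 0 rather than -inf).
fn approx_log10(x: u64) -> f64 {
	if x == 0 { return 0.0; }
	// The position of the highest set bit gives the integer part of log2(x)...
	let int_log2 = 63 - x.leading_zeros();
	// ...and the four bits just below it index the fractional-part table.
	let frac_idx = if int_log2 >= 4 {
		((x >> (int_log2 - 4)) & 0x0f) as usize
	} else {
		((x << (4 - int_log2)) & 0x0f) as usize
	};
	(int_log2 as f64 + FRAC_LOG2_TABLE[frac_idx]) * LOG10_2
}

fn main() {
	// log10(1000) is 3; this coarse approximation lands within about 0.4%.
	assert!((approx_log10(1000) - 3.0).abs() < 0.02);
}
```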
do you feel like log approximations might wanna go in util, or will we only ever use them for scoring?
Not sure why we'd ever use them elsewhere, and, really, the `util` module is an annoying grab-bag of random stuff that should be avoided where possible IMO.
Basically LGTM
Force-pushed from 9216b85 to 7798c81 (Compare)
LMK if I should squash.
Yup, feel free to squash.
Force-pushed from 7798c81 to 6680b97 (Compare)
Squashed without further changes.
Given that this is a new file that was just introduced in the previous commit, is this even worth a separate commit?
Also note that the implication is that the prior commit would fail CI in isolation.
GitHub didn't show what this is in reference to, but I assume it's the `rustfmt` commit. Yes, it's absolutely worth it, because with it you can verify the move-only change by running `git show --color-moved`, without having to check anything.
@@ -1663,6 +1654,29 @@ mod bucketed_history {
	}
}

pub(super) fn has_datapoints(&self) -> bool {
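As a rough illustration of the accessor shown in this hunk (a simplified sketch only; the real tracker's bucket layout, visibility, and surrounding module differ):

```rust
mod bucketed_history {
	/// Simplified stand-in for the bucket tracker: the buckets are private to
	/// this module, so outside code can only query them through accessors.
	pub(super) struct HistoricalBucketRangeTracker {
		buckets: [u16; 32],
	}

	impl HistoricalBucketRangeTracker {
		pub(super) fn new() -> Self { Self { buckets: [0; 32] } }

		/// True if any bucket is non-zero, i.e. there is at least one datapoint
		/// and historical scoring is worth attempting at all.
		pub(super) fn has_datapoints(&self) -> bool {
			self.buckets.iter().any(|b| *b != 0)
		}
	}
}

fn main() {
	let tracker = bucketed_history::HistoricalBucketRangeTracker::new();
	assert!(!tracker.has_datapoints());
}
```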
Couldn't the introduction of these methods be split into a separate commit from the bucket isolation refactor component?
I mean, sure, but that commit is only +85, it's not exactly huge. Would you like me to change it?
nah, it's fine
Force-pushed from 6680b97 to a5760c8 (Compare)
Rewrote the last commit message but didn't change the diff at all.
Rebased
Force-pushed from a5760c8 to 342e955 (Compare)
In the coming commits we'll isolate historical bucket logic slightly further, allowing us to cache some state. This is the first step towards that, storing the historical liquidity information in a new `HistoricalLiquidityTracker` rather than in the general `ChannelLiquidity`.
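Schematically, the move looks something like the following sketch; the field names and shapes are assumptions for illustration, and the real structs carry more fields.

```rust
use std::time::Duration;

/// Illustrative stand-in for the per-direction bucket storage.
#[derive(Default)]
struct HistoricalBucketRangeTracker {
	buckets: [u16; 32],
}

/// After this commit, the historical liquidity data lives in its own tracker...
#[derive(Default)]
struct HistoricalLiquidityTracker {
	min_liquidity_offset_history: HistoricalBucketRangeTracker,
	max_liquidity_offset_history: HistoricalBucketRangeTracker,
}

/// ...and `ChannelLiquidity` holds the tracker rather than the raw buckets, so
/// bucket invariants (and, later, caches) are managed in one place.
#[derive(Default)]
struct ChannelLiquidity {
	min_liquidity_offset_msat: u64,
	max_liquidity_offset_msat: u64,
	last_updated: Duration,
	liquidity_history: HistoricalLiquidityTracker,
}

fn main() {
	let liquidity = ChannelLiquidity::default();
	// The history buckets are now only reachable through the tracker.
	let _history = liquidity.liquidity_history;
}
```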
In a coming commit we'll cache some additional data in the historical bucket tracker. In order to do so, here we isolate the buckets themselves into the `bucketed_history` module, reducing the possibility of accidentally updating them directly without updating caches.
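The encapsulation being described can be pictured roughly as follows; this is illustrative only, and both the module's real API and the actual cached value differ (here a simple counter stands in for the cache).

```rust
mod bucketed_history {
	/// The buckets and any caches derived from them are private to this module.
	pub struct HistoricalBucketRangeTracker {
		buckets: [u16; 32],
		total_points: u64, // placeholder for a cache a later commit maintains
	}

	impl HistoricalBucketRangeTracker {
		pub fn new() -> Self { Self { buckets: [0; 32], total_points: 0 } }

		/// The only way for code outside this module to modify a bucket, which
		/// guarantees the cached value can never silently go stale.
		pub fn track_datapoint(&mut self, bucket_idx: usize) {
			self.buckets[bucket_idx] = self.buckets[bucket_idx].saturating_add(1);
			self.total_points += 1;
		}

		pub fn total_points(&self) -> u64 { self.total_points }
	}
}

fn main() {
	let mut tracker = bucketed_history::HistoricalBucketRangeTracker::new();
	tracker.track_datapoint(7);
	assert_eq!(tracker.total_points(), 1);
}
```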
Rather than storing the two directions' buckets in `HistoricalMinMaxBuckets` (renamed `DirectedHistoricalLiquidityTracker`), we store a single reference to the `HistoricalLiquidityTracker` as well as the direction bool. This will allow us in the next commit to reference fields in the `HistoricalLiquidityTracker` aside from the two directions.
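In shape, that change is roughly the following; the field and method names are assumptions for illustration, not the PR's exact code.

```rust
/// Illustrative stand-in for the undirected tracker from the previous sketch.
struct HistoricalLiquidityTracker {
	min_liquidity_offset_history: [u16; 32],
	max_liquidity_offset_history: [u16; 32],
}

/// The directed view (formerly `HistoricalMinMaxBuckets`) now borrows the whole
/// tracker and remembers which direction it represents, so later commits can
/// also read tracker-level fields such as a cached point count.
struct DirectedHistoricalLiquidityTracker<'a> {
	tracker: &'a HistoricalLiquidityTracker,
	source_less_than_target: bool,
}

impl<'a> DirectedHistoricalLiquidityTracker<'a> {
	/// Picks the min-liquidity buckets for this direction.
	fn min_liquidity_offset_history(&self) -> &[u16; 32] {
		if self.source_less_than_target {
			&self.tracker.min_liquidity_offset_history
		} else {
			&self.tracker.max_liquidity_offset_history
		}
	}
}

fn main() {
	let tracker = HistoricalLiquidityTracker {
		min_liquidity_offset_history: [0; 32],
		max_liquidity_offset_history: [0; 32],
	};
	let directed = DirectedHistoricalLiquidityTracker {
		tracker: &tracker,
		source_less_than_target: true,
	};
	assert!(directed.min_liquidity_offset_history().iter().all(|b| *b == 0));
}
```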
When we go to score a channel using the historical liquidity data, the first thing we do is step through all the valid bucket combinations, multiply the min and max bucket, and then add them together to calculate the total number of points tracked. This isn't a free operation, and for scorers without much data it represents a large part of the total time spent scoring during routefinding. Thus, here we cache this value, updating it every time the buckets are updated.
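Concretely, the quantity being cached is something like the following; this is a sketch, and the real code lives in the `bucketed_history` module with different names (e.g. `recalculate_valid_point_count` and `total_valid_points_tracked` here are illustrative).

```rust
struct HistoricalLiquidityTracker {
	min_liquidity_offset_history: [u16; 32],
	max_liquidity_offset_history: [u16; 32],
	/// Cached sum over all valid (min, max) bucket pairs of min * max. Reading
	/// this one value is enough to decide whether there is enough data to bother
	/// with historical scoring, and it is the denominator when there is.
	total_valid_points_tracked: u64,
}

impl HistoricalLiquidityTracker {
	/// Called from every method that mutates the buckets, so the cache can never
	/// be observed out of sync with them.
	fn recalculate_valid_point_count(&mut self) {
		let mut total = 0u64;
		for (min_idx, min_bucket) in self.min_liquidity_offset_history.iter().enumerate() {
			// Only combinations where the min and max buckets don't cross are
			// valid, hence the shrinking `take` as the min bucket index grows.
			for max_bucket in self.max_liquidity_offset_history.iter().take(32 - min_idx) {
				total += (*min_bucket as u64) * (*max_bucket as u64);
			}
		}
		self.total_valid_points_tracked = total;
	}
}

fn main() {
	let mut tracker = HistoricalLiquidityTracker {
		min_liquidity_offset_history: [1; 32],
		max_liquidity_offset_history: [1; 32],
		total_valid_points_tracked: 0,
	};
	tracker.recalculate_valid_point_count();
	// 32 + 31 + ... + 1 = 528 valid (min, max) pairs, each contributing 1 * 1.
	assert_eq!(tracker.total_valid_points_tracked, 528);
}
```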
During routing, the majority of our time is spent in the scorer. Given the scorer isn't actually doing all that much computation, this means we're quite sensitive to memory latency. Thus, the cache lines our data sits on are incredibly important. Here, we manually lay out the `ChannelLiquidity` and `HistoricalLiquidityTracker` structs to ensure that we can do the non-historical scoring, and skip historical scoring for channels with insufficient data, by just looking at the same cache line the channel's SCID is on. Sadly, to do the full historical scoring we need to load a second 128-byte cache line pair, but we have some time to get there. We might consider issuing a preload instruction in the future. This improves performance by a few percent.
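To make the layout idea concrete, here is a rough hot/cold sketch with assumed field names and sizes, not the PR's actual layout. Note that pinning a declared field order in Rust generally requires `repr(C)`, since the default representation is free to reorder fields; whether this PR relies on `repr(C)` or another mechanism is not shown in this excerpt.

```rust
use std::time::Duration;

#[repr(C)]
struct HistoricalLiquidityTracker {
	/// Hot: one u64 read decides whether historical scoring is worth doing.
	total_valid_points_tracked: u64,
	/// Cold: 64 bytes per direction, only touched when actually scoring.
	min_liquidity_offset_history: [u16; 32],
	max_liquidity_offset_history: [u16; 32],
}

#[repr(C)]
struct ChannelLiquidity {
	// Hot fields first: everything non-historical scoring needs should sit near
	// the data the router has already pulled in for this channel.
	min_liquidity_offset_msat: u64,
	max_liquidity_offset_msat: u64,
	last_updated: Duration,
	offset_history_last_updated: Duration,
	// The tracker follows, leading with its own hot field.
	liquidity_history: HistoricalLiquidityTracker,
}

// A compile-time guard on the tracker's size (8 + 64 + 64 = 136 bytes here).
const _: () = assert!(core::mem::size_of::<HistoricalLiquidityTracker>() == 136);

fn main() {
	println!("ChannelLiquidity: {} bytes", core::mem::size_of::<ChannelLiquidity>());
}
```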
Force-pushed from 342e955 to 4fd07ec (Compare)
When we go to score a channel using the historical liquidity data, the first thing we do is step through all the valid bucket combinations, multiply the min and max bucket, and then add them together to calculate the total number of points tracked. This isn't a free operation, and for scorers without much data it represents a large part of the total time spent scoring during routefinding.

Thus, here we cache this value, updating it every time the buckets are updated. In order to do so, we first have to clean up `scorer.rs`, which has been needed for some time anyway, improving encapsulation so that caching is more reasonable.