-
Notifications
You must be signed in to change notification settings - Fork 385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use log approximation in ProbabilisticScorer #1347
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1347 +/- ##
==========================================
+ Coverage 90.60% 90.75% +0.14%
==========================================
Files 72 72
Lines 40075 42254 +2179
==========================================
+ Hits 36310 38347 +2037
- Misses 3765 3907 +142
Continue to review full report at Codecov.
|
915dcc9
to
865d1de
Compare
lightning/src/routing/scoring.rs
Outdated
const LOG2_10: f64 = 3.322; | ||
|
||
fn log10_approx(numerator: u64, denominator: u64) -> f64 { | ||
(log2_approx(numerator) - log2_approx(denominator)) / LOG2_10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove the division here by changing the constants in the lookup table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The constants are only for the fractional parts, so the division would still be needed for the integer parts. Suppose I could make up a lookup table for that, too, given there would only be 64 entries. Would you prefer that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I'm confused why we have a lookup table for log2 and then convert to log10 instead of just having a lookup table in log10, but I haven't dug too deep into how this works to begin with. In any case, a 64-entry lookup table seems more than fine too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The shift is used to get the exact integer part of the log2
(i.e., most-significant bit position) and the lookup table is used to approximate the fractional part. I'm not aware of a similarly efficient approximation for log10
though I haven't looked extensively into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, note that by shift I meant finding the most significant bit set. Thought that's how it was done under the hood.
Edit: Looks like it might a machine instruction. Anyhow, earlier should be s/shift/most-significant bit position
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, we'd probably need two integer divisions by 10 to do something similar in log10, I guess? Still probably faster than a float divide, but not by a lot then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL this is apparently not true! Float divides can be a bit faster than int.
lightning/src/routing/scoring.rs
Outdated
#[inline] | ||
fn log2_approx(x: u64) -> f64 { | ||
let leading_zeros = x.leading_zeros(); | ||
let integer_part = (63 - leading_zeros) as f64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment explaining this?
Maybe its not worth it, but I wonder if we should do this all in intops, given its basically in intops * 1 million already cause it only has three digits after the point in the constants. May be worth benchmarking - could be faster cause intops could be slower cause the flops cpu parts were previously entirely unused and are now unlocked, most probably its not even visible. |
I was playing with the If we make those integers as you suggest, we'd simply have and integer subtraction for the numerator and denominator look-ups followed by multiplication by |
Ah, actually not twice the size but the product of the two table sizes. So |
That still seems reasonable - there's not really much harm in having a pretty reasonably sized table here. |
865d1de
to
ac22d43
Compare
Alright, I pushed the change that removed floating-point operations by using a single look-up table of One way to think of it is as a window of a five bits over I added a test that generates the table if it helps understand how the table was generated or if it needs to be generated again using a different size. |
no-std-check/Cargo.toml
Outdated
default = ["lightning/no-std", "lightning-invoice/no-std"] | ||
|
||
[dependencies] | ||
lightning = { version = "0.0.105", path = "../lightning", default-features = false } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that the version is optional when specifying the path. it might be more convenient to not specify, so you don't have to update when releasing a new version. this applies elsewhere too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also wonder why the version is specified in other crates (e.g. lightning-invoice when referring to lightning), maybe worth removing in a followup PR.
@@ -646,20 +646,23 @@ impl<T: Time> ChannelLiquidity<T> { | |||
} | |||
|
|||
impl<L: Deref<Target = u64>, T: Time, U: Deref<Target = T>> DirectedChannelLiquidity<L, T, U> { | |||
/// Returns the success probability of routing the given HTLC `amount_msat` through the channel | |||
/// in this direction. | |||
fn success_probability(&self, amount_msat: u64) -> f64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good, this was the only floating-point use in the repo, and it's nice to be able to be able to target on platforms that don't have FP.
@@ -122,6 +122,10 @@ jobs: | |||
cargo test --verbose --color always --no-default-features --features no-std | |||
# check if there is a conflict between no-std and the default std feature | |||
cargo test --verbose --color always --features no-std | |||
# check no-std compatibility across dependencies |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just curious, did you figure out why compiling in-crate doesn't uncover the issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't unfortunately. In-crate compiling does catch anything using std
but apparently not methods unavailable on primitive types. Maybe there is some configuration that would trigger it? I was using cargo check -p lightning --no-default-features --features=no-std
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like something we should maybe report as a rustc bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly, I see the compilation fail with a minimal example in Rust playground:
#![no_std]
fn main() {
1.0_f64.log10();
}
Compiling playground v0.0.1 (/playground)
error: language item required, but not found: `eh_personality`
|
= note: this can occur when a binary crate with `#![no_std]` is compiled for a target where `eh_personality` is defined in the standard library
= help: you may be able to compile for a target that doesn't need `eh_personality`, specify a target with `--target` or in `.cargo/config`
error: `#[panic_handler]` function required, but not found
error[[E0599]](https://doc.rust-lang.org/stable/error-index.html#E0599): no method named `log10` found for type `f64` in the current scope
[--> src/main.rs:4:13
](https://play.rust-lang.org/#) |
4 | 1.0_f64.log10();
| ^^^^^ method not found in `f64`
For more information about this error, try `rustc --explain E0599`.
error: could not compile `playground` due to 3 previous errors
If I change the empty lib.rs
file in no-std-check
to use a minimal example, I only see the last error.
#![no_std]
pub fn log(x: f64) -> f64 {
x.log10()
}
So I wonder if there is something about our configuration that is causing this to not be caught.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps the issue is with code that is not reachable from main()
, but only from tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ignore my last comment, sounds like you did try it without a call from main
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, so it sounds like the bug is that rustc doesn't detect this as an issue if we're building for test/check instead of building for run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, rustc
does catch it when using cd no-std-check && cargo check
with the modified lib.rs
:
Checking no-std-check v0.1.0 (/Users/jkczyz/src/rust-lightning/no-std-check)
error[E0599]: no method named `log10` found for type `f64` in the current scope
--> src/lib.rs:4:7
|
4 | x.log10()
| ^^^^^ method not found in `f64`
error: aborting due to previous error
For more information about this error, try `rustc --explain E0599`.
error: could not compile `no-std-check`
To learn more, run the command again with --verbose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, so the issue is that some crates in the workspace are not no-std so everything gets built with std?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... moving the other crates to exclude
doesn't seem to make a difference when modifying some code to use log10
and running cargo check --no-default-features --features=no-std
. However, it does catch an added use of println
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks basically good to me, except for the obvious trivial optimization of shifting instead of dividing.
Github seems to think this is substantially slower than upstream, that doesn't feel too surprising given we're basically taking stuff that was running (probably mostly in parallel with everything else) on the FPU and replacing it with some intops and memory loads, I'm hoping compacting the table to get it more in cache and dropping the divide improves things, but either way I think we have to eat the loss. |
6a8f107
to
af99a94
Compare
Comparing this run from another PR: https://github.com/lightningdevkit/rust-lightning/runs/5469854141?check_suite_focus=true
vs a run on this PR from earlier: https://github.com/lightningdevkit/rust-lightning/runs/5467595213?check_suite_focus=true
I see that between the two this PR is slower. But there is a similar slowdown in the other scorers as well. |
In local tests it looks like the current code is maybe a slight improvement over upstream, if anything. Didn't bother trying to run the version with the divide in it. In any case, code LGTM, just needs commits squashed. |
/// Look-up table for `log10(x) * 1024` where row `i` is used for each `x` having `i` as the | ||
/// most significant bit. The next 4 bits of `x`, if applicable, are used for the second index. | ||
const LOG10_TIMES_1024: [[u16; LOWER_BITS_BOUND as usize]; BITS as usize] = [ | ||
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically we could get rid of the first 3 rows, right? Cause the first row is just representing 1
, the second row 2
and 3
, the third row... I don't care if you bother with it, though, up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, though I wasn't sure if it was worth adding branching for those cases. Keeping it as it is also simplifies the indexing.
Since f64::log10 exists in std but not core, unconditionally use log approximation so --feature=no-std will compile.
To ensure no-std is honored across dependencies, add a crate depending on lightning crates supporting no-std. This should ensure any regressions are caught. Otherwise, cargo doesn't seem to catch some incompatibilities (e.g., f64::log10 unavailable in core) and seemingly across other dependencies as describe here: https://blog.dbrgn.ch/2019/12/24/testing-for-no-std-compatibility/
af99a94
to
f041a64
Compare
Since
f64::log10
exists instd
but notcore
, unconditionally use log approximation so--feature=no-std
will compile when the crate is used as a dependency. Add a crate depending onno-std
lightning crates in order to catch any regressions in CI.Fixes #1340.