
Make log optional in ProbabilisticScorer #1372

Closed
TheBlueMatt opened this issue Mar 20, 2022 · 8 comments · Fixed by #1375

@TheBlueMatt (Collaborator)

Conversation from Slack for context is pasted below, but TL;DR: I think we really need to add knobs to ProbabilisticScorer to make it more general. While I'm still relatively confident it'll work reasonably well with active probing, its behavior without probing is pretty bad, in large part due to the use of the log on the success probability to linearize it - it simply doesn't penalize nearly enough unless you set the penalty ratio much, much too high, which doesn't help either as then it'll dominate fees. We could consider a different way of combining the scores, but that also sucks as it fails to allow the "larger" score to dominate, which is exactly what you want if you're looking at a path with very high fees or very high failure rates. Ultimately, the only obvious solution is to (optionally) remove the log, which gives us something more akin to the scorer we have today, which has very good results in practice.

While we're at it, we should add a base penalty per-hop option in ProbabilisticScorer.

Tagging 106, as ultimately the ProbabilisticScorer is a pretty substantial regression from 105 in terms of routing performance on nodes with less data.

Matt Corallo
	I'm playing with the new ProbabilisticScorer...some initial observations:
	* It ends up assigning very low penalties - I even doubled the multiplier knob and still see most penalties below 500msat.
	* I didn't want to more-than-double the multiplier knob, given the worst case there starts to look absurd.
	* It seems to be much more willing to take more hops than the old scorer, in line with the lower penalties. This, naturally, seems to cause much higher retry counts.
	* I assume this would change with aggressive probing or high payment volume, but with me just playing around on my own node that isn't the case. (edited) 
	I think we should think about (a) having a more aggressive "default" failure prob and maybe (b) dropping the log() entirely to increase the penalty even if our failure prob shows a low number.
	another idea would be re-introducing a per-hop static cost, irrespective of the failure prob.
Jeffrey Czyz
	what sort of amounts were you sending?
Matt Corallo
	was trying both "huge" amounts (250k to 2.5m sats) and also one small payment (39k sats)
	not a big N here, mind you, but the old scorer was really aggressive on the limiting hops stuff, and this one just isn't at all, by comparison.
	sadly most of the merchants I regularly pay only take bitpay so I don't have a lot of places I can deposit sats :( (edited)
	that said, sometimes it works out - just paid satoshis.place 8 sats with a 4 msat fee via 7 hops on the second try (edited) 
	so there is a part of me that wonders if it's "doing the right thing" and just has a bad default when you don't have enough data.
Jeffrey Czyz
	hmmm... IIUC, it degrades to what c-lightning does when it doesn't have any data. Also note that the default depends on the amount you are sending. Though I suppose that is also the case for the old scorer as of 104 (i.e., because of overuse_penalty_msat_per_1024th)
Matt Corallo
	sure, but the 104 scorer also had a per-hop static cost, now we may get ~0 costs for a bunch of channels if we have no info, no?
Jeffrey Czyz
	nope, you only get zero penalty if the amount sent is under the lower bound, which starts at zero. So initially there should always be a penalty
Matt Corallo
	sure, but it can be incredibly small, even if not exactly 0
	especially cause the log makes it non-linear, so you can get a very, very, very low penalty for quite some range
Jeffrey Czyz
	Just playing with some numbers to recall how this works. With a 1,000 msat multiplier, a payment of 250k sats through a 0.01 BTC channel (25% of the capacity) would give a 125 msat penalty with no information. 50% gives 301 msats, 75% gives 602 msats, 95% gives 1301 msats, and 99% gives 2000 msats, which is also where we artificially limit. So you really need to be using a significant portion of the capacity initially to get a 500 msat penalty. Less if you doubled the multiplier. So for small portions of the channel capacity, it seems the fees will dominate. Just wondering if it would be that bad to increase the multiplier significantly. (edited)
	-log10((1,000,000,000 + 1 - 990,000,000) / (1,000,000,000 + 1)) * 1,000
Matt Corallo
	the default multiplier is 10k, not 1k msat
Jeffrey Czyz
	ah! right
Matt Corallo
	and I think you mean 250k sat not 250k msat, above.
	also 250k sat is actually kinda high (~$100), common payments are more like 5-20$, afaiu
	so a better analysis would be 25k sats or so
	which is ~100msat in penalty, which will get dwarfed by 1sat base fees (edited) 
Jeffrey Czyz
	yeah, you're right my numbers were off there :sweat_smile:
	or at least my units :slightly_smiling_face:
	anyhow, about 10% of the capacity gives about a 500 msat penalty if I got that right
	with the 10,000 msat multiplier
Matt Corallo
	right, my point is 1m sats is a common channel size, but a common payment amount would be something like 10k sats, which, iiuc, would be a penalty around 50msat
	cause it's around 1% of the channel's capacity
	maybe 25k is also common, but that's still ~100msat in penalty
Jeffrey Czyz
	yup, I'm just wondering if increasing the multiplier would suffice, though. If you try to send an amount that is large percentage of the capacity, I would expect a large penalty would be called for
Matt Corallo
	channels are not uncommonly unbalanced, so I'm pretty dubious the probability is really linear here.
	like sending 95% I'd expect to succeed substantially more than 5% of the time
	so a penalty of like 100 sats seems more than excessive
	which is only a 5x on top of our current penalty multiplier, which seems like it'd be reasonable for the 1-2.5% cases above
	maybe we should add a ton of knobs here and ask cash app to play with them? like option to turn on and off log, mostly.
Jeffrey Czyz
	yeah, that's fair and a simple enough change given it wouldn't affect the serialization
Matt Corallo
	right, the really nice thing about the prob. scorer is we can change the scoring pretty substantially without changing what we're storing.
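For reference, the penalty numbers Jeff quotes in the transcript above can be reproduced with a quick script (a sketch only, assuming the -log10((c + 1 - a) / (c + 1)) * multiplier form with capacity c and amount a in msats, and the 1,000 msat multiplier from that example):

import math

multiplier_msat = 1_000
capacity_msat = 1_000_000_000  # 0.01 BTC channel
for fraction in [0.25, 0.50, 0.75, 0.95, 0.99]:
    amount_msat = fraction * capacity_msat
    # success probability with no channel knowledge
    success_prob = (capacity_msat + 1 - amount_msat) / (capacity_msat + 1)
    print(f"{fraction:.0%} of capacity -> {-math.log10(success_prob) * multiplier_msat:.0f} msat")

which prints roughly 125, 301, 602, 1301, and 2000 msat.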

@TheBlueMatt TheBlueMatt added this to the 0.0.106 milestone Mar 20, 2022
@jkczyz jkczyz self-assigned this Mar 20, 2022
@jkczyz (Contributor) commented Mar 20, 2022

So there's a little more required to make the math work. The success probability ranges from 0.0 to 1.0, though we limit it to 0.01 at the lower bound. Taking the log10 of this gives -2 for a 0.01 success probability and 0 for a 1.0 success probability, or 0 to 2 after multiplying by -1.

So by removing the -log10 we are back to our success probability of 0.01 to 1.0. We probably would want to use the failure probability instead (i.e., 1.0 - success_probability) such that multiplying by liquidity_penalty_multiplier_msat gives a larger penalty for lower success probabilities. Users would also need to increase liquidity_penalty_multiplier_msat as that is now the upper bound.
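To illustrate (a sketch only - these helper names are made up for this comment, not LDK's actual API), the current log-based penalty and the proposed linear one look roughly like:

import math

def log_penalty_msat(amount_msat, capacity_msat, multiplier_msat):
    # Success probability with no channel knowledge, floored at 0.01,
    # so -log10 of it ranges from 0 to 2.
    success_prob = max((capacity_msat + 1 - amount_msat) / (capacity_msat + 1), 0.01)
    return -math.log10(success_prob) * multiplier_msat

def linear_penalty_msat(amount_msat, capacity_msat, multiplier_msat):
    # Failure probability scaled by the multiplier, which therefore
    # becomes the upper bound of the penalty.
    success_prob = (capacity_msat + 1 - amount_msat) / (capacity_msat + 1)
    return (1.0 - success_prob) * multiplier_msat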

@TheBlueMatt (Collaborator, Author)

Yea, so let's do (1 - success_probability) * 2 * liquidity_penalty_multiplier_msat? That way we're back to the original range?

@jkczyz (Contributor) commented Mar 21, 2022

Yep, and multiplying liquidity_penalty_multiplier_msat through the undivided success_probability to avoid floating point division, I believe.
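Concretely (a sketch of the proposed formula only, not the eventual LDK code): since 1 - success_probability = amount / (max_liquidity + 1), the doubled penalty can be computed entirely in integer math:

def doubled_linear_penalty_msat(amount_msat, max_liquidity_msat, multiplier_msat):
    # failure probability = amount / (max_liquidity + 1); multiply through
    # by 2 * multiplier before dividing so no floating point is needed
    return amount_msat * 2 * multiplier_msat // (max_liquidity_msat + 1)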

@renepickhardt commented Mar 23, 2022

@TheBlueMatt
sure, but it can be incredibly small, even if not exactly 0
especially cause the log makes it non-linear, so you can get a very, very, very low penalty for quite some range

this should not be true! If you use the failure probability you will compute a/c, which for a ∈ [0, c] will be a cost between 0 and 1.

On the other side we take the log of a number between 0 and 1, and this grows quite fast. So the negative log costs will actually be much higher than the linearized version a/c, which corresponds to the failure probability as @jkczyz indicated. This is due to the fact that the negative log initially has roughly the same slope as a/c, but since it is convex the slope increases faster with larger amounts.

Note that the highest cost the negative log probabilities can create is if the amount to be sent were c, in which case we would have -log((c+1-c)/(c+1)) = -log(1/(c+1)) = log(c+1), which is exactly the entropy of a channel of capacity c. Given that you know where ppm and entropy lie (you know the largest channel in the network), it should be relatively easy to find a reasonable multiplier to put the cost from the uncertainty into the same order of magnitude as the cost of the routing fees. I would start with something that puts 1/c into the order of the ppm. Might be tricky though to do this globally. One idea that comes to mind is to actually learn such a multiplier on a per-channel level.
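(As a quick sanity check of that identity, in base-2 logs:)

from math import log2

c = 1_000_000  # illustrative channel capacity
# Sending the full capacity gives the maximum cost, which equals the
# channel's entropy log2(c + 1):
assert abs(-log2((c + 1 - c) / (c + 1)) - log2(c + 1)) < 1e-9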

In all of my observations long paths usually only happen when one optimizes for fees. If you make sure that even for small amounts the probabilistic feature is dominant (I recall that I suggested not to take the arithmetic mean where it is difficult, but I am not sure what you actually ended up doing), then the probabilistic model should naturally produce short paths without a static per-hop cost.

All that being said, and without having looked at the code again, I assume the key issue is that you don't weigh the costs from the probabilistic model heavily enough, in particular when it comes to small payments.

@TheBlueMatt (Collaborator, Author) commented Mar 24, 2022

this should not be true! If you use the failure probability you will compute a/c, which for a ∈ [0, c] will be a cost between 0 and 1.

I'm really not sure what you mean here - with log you have to pick: either you have a crazy high maximum, or you have a really low common case. That's the problem. In real-world tests, even a maximum penalty of 40 sats (which far dominates fees even at much lower probabilities, if they start getting high failure rates, e.g. for larger payments) makes the log scorer perform incredibly poorly.

@renepickhardt

While your initial statement about the negative log producing "very very low penalties for quite some range" in comparison to the linearized failure probability is just wrong, you still might end up at some right conclusions from your observations. Let me explain:

In my last comment I tried to lay out, from a mathematical point of view, using the (very much desired) convexity of the negative log probability, why your statement about the log is wrong. I guess code and numbers are more convincing, so here you go:

from math import log2  # base-2 log, matching the output below (costs in bits)

c = 10
print("lin\tneg log")
for a in range(c):
    print(a/c, "\t", -log2((c+1-a)/(c+1)))

lin	neg log
0.0 	 -0.0
0.1 	 0.13750352374993496
0.2 	 0.2895066171949848
0.3 	 0.45943161863729726
0.4 	 0.6520766965796931
0.5 	 0.8744691179161412
0.6 	 1.137503523749935
0.7 	 1.4594316186372973
0.8 	 1.8744691179161412
0.9 	 2.4594316186372973

You can choose a different value of the capacity c and the neg log cost will always be higher than just the linear failure probability. This is even true for very small amounts. Your point is that the neg log grows more quickly, which I also confirmed by saying it goes all the way to log(c+1) and is not normalized between 0 and 1 as the linearized feature would be. Btw, fees are also not bounded between 0 and 1, and even if you tried to normalize them they would still not be as uniformly distributed as the failure probabilities (linearized neg log prob). This is already the first hint why the actual problem at hand is the combination of features.

That being said, your observation that the log cost gets very expensive for large amounts is absolutely correct - that is the convex nature of the log and it should be this way. The main issue here (as discussed in #1227) is how to properly combine both features. This is naturally hard as the distribution of fees is different from the distribution of channel capacities. Thus you end up in a situation where on some channels one feature dominates and on other channels the other feature dominates. In our research we explained why choosing just a linearized uncertainty cost for all channels is a very poor idea, as it tends to saturate channels fully in payment delivery planning. Of course this may not be true if you don't use optimal splitting but just some heuristic that defines a split ahead of the search for candidate paths. (Yes, according to the selected cost function the solution of the mcf problem is the optimal one, despite your wishing me to stop saying so.)

----- side remarks on ----
In any case I offered you to do real world tests together and study the data to do proper feature engineering. It is really hard to discuss without seeing experimental setup & design and data, and to make good decisions on anecdotal observations or just based on theory (though the latter can and will most certainly lead to conducting the interesting experiments more quickly). I am fully aware that I haven't been strong yet on communicating all the experiments that I have done myself either. I guess that is the cultural difference between engineers and researchers (I am not saying one is better than the other. In fact I think both can benefit best if combined, which is why as a researcher I am reaching out to engineers and developers despite getting quite some heat and having trouble finding the right language). We researchers usually publish only very little of what we do, and only after we have reasonably high confidence that our results and observations can be considered "secured knowledge", while it is my impression that engineers are much more pragmatic and work on the cyclic paradigm of try, observe issues, fix, iterate. Anyway, I am working on releasing more code to demonstrate more precisely what is going on also in feature engineering, but my offer to team up for real world evaluations is still standing.
----- side remarks off ----

Back to your point about the growth of the log (which again I'd say is a feature and not a bug!) and the point of our research that one does not want to use a fully linear feature for reliability on the entire domain, and that the problem is much more the feature engineering of combining fee cost and uncertainty cost properly: there is actually middle ground that goes very much in the direction that you are asking for.

In my recent release of the piecewise linearized min cost flow problem I do not use the negative log either, because I linearize piecewise. Thus I use costs like n*a/c for the n-th piece of the domain. This effectively reproduces a convex cost for the uncertainty but has easy-to-interpret linearized terms that operate in a certain predictable range, which also makes them easier to compare to fees.
The exact math and reasoning are all described in the one page of documentation and the 100-line code example over at: https://github.com/renepickhardt/mpp-splitter/blob/master/Minimal%20Linearized%20min%20cost%20flow%20example%20for%20MPP.ipynb
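For intuition, a toy sketch of the piecewise idea (not the notebook's exact formulation): split a channel's capacity into N pieces and charge the n-th piece at a unit cost proportional to n, which yields a convex, piecewise-linear uncertainty cost.

def piecewise_uncertainty_cost(amount, capacity, pieces=4):
    # The n-th slice of size capacity/pieces is charged at unit cost
    # n/capacity, so filling slices in order approximates the convex
    # negative-log cost with easy-to-interpret linear terms.
    slice_cap = capacity / pieces
    remaining = amount
    cost = 0.0
    for n in range(1, pieces + 1):
        used = min(remaining, slice_cap)
        cost += n * used / capacity
        remaining -= used
        if remaining <= 0:
            break
    return cost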

In my still ongoing / preliminary experiments it currently seems that I am ending up finding a practical way of combining both features, though I don't have high enough confidence yet that this is universally true. Thus real world experiments would certainly help to support the current observations, and I will repeat myself one last time: it would be very good to test this properly on real data. I believe this is the plan and suggestion with CashApp anyway, and I can only offer help on building the model and testing the model.

@TheBlueMatt (Collaborator, Author)

the neg log cost will always be higher than just the linear failure probability

See the conversation further up in this issue - I'm perfectly capable of doing basic math, thanks; we are talking about normalizing by the max penalty (in this case multiplying the non-log by 2, given we currently have a 0.01 floor on the success probability, which caps the log penalty).

The main issue here is how to properly combine both features. This is naturally hard as the distribution of fees is different from the distribution of channel capacities.

Yep, it's also why there's no one true solution here, especially in different contexts, and why we're talking about (optionally) removing the log - as different distributions of failure probability scoring/addition may be useful in some contexts.

Thus you end up in a situation where on some channels one feature dominates and on other channels the other feature dominates. In our research we explained why choosing just a linearized uncertainty cost for all channels is a very poor idea, as it tends to saturate channels fully in payment delivery planning.

No, allowing one feature to dominate the other in different parts of the distribution is an intended and totally good behavior for most users! Indeed, it makes it hard to tune things properly, but if failure is very high or fees are very high, you totally want them to dominate the other! This is why we linearize everything to "msats willing to be paid to avoid X".

In my recent release of the piecewise linearized min cost flow problem I do not use the negative log either, because I linearize piecewise. Thus I use costs like n*a/c for the n-th piece of the domain.

Awesome! Looking forward to when we eventually, sadly probably no time soon, have the time to go implement some stuff based on the fancier router you propose.

@TheBlueMatt (Collaborator, Author) commented Mar 25, 2022

To make the whole discussion way more concrete, I threw out some common scoring examples at #1375 (comment). I'm happy to re-examine any parameters you want; you can throw more table rows in the generation code below. Note the 6.25 and 12.5 bumps here are for efficiency (1/power-of-two) of calculation, but there may be clever ways to do other numbers I didn't immediately see.

import math

# Compare expected path failure rates and several candidate per-path
# penalties (linear, log-based with different multipliers, and linear
# plus a per-hop base) for a few hop counts and failure probabilities.
for prob in [0.001, 0.01, 0.05, 0.1, 0.5]:
    if prob == 0.001:
        print("\n1k-over-1m (0.1% failure rate)")
    elif prob == 0.01:
        print("\n10k-over-1m (1% failure rate, or 10k-over-1m)")
    elif prob == 0.05:
        print("\n50k-over-1m (5% failure probability)")
    elif prob == 0.1:
        print("\n10k-over-100k (10% failure probability)")
    else:
        print("\n50k-over-100k (50% failure probability)")
    print("| --- | expected failure % (Rene/+5%/+10%) | linear | log | log w/ 6.25% addn'l (10x) | log w/ 12.5% addn'l (20x) | linear w/ 0.5 per-hop |")
    print("| --- | --- | --- | --- | --- | --- | --- |")
    for hops in [3, 5, 10]:
        hop_ln = "| " + str(hops) + " hops | "
        hop_ln += str(round((1-((1-prob) ** hops))*100, 1)) + "%/"
        hop_ln += str(round((1-((1-(prob+0.05)) ** hops))*100, 1)) + "%/"
        hop_ln += str(round((1-((1-(prob+0.10)) ** hops))*100, 1)) + "% | "
        hop_ln += str(round(prob * 20 * hops, 2)) + " sats | "
        hop_ln += str(round(-math.log(max(1-prob, 0.01), 10) * 10 * hops, 2)) + " sats | "
        hop_ln += str(round(-math.log(max(1-prob - 0.0625, 0.01), 10) * 10 * hops, 2)) + " sats | "
        hop_ln += str(round(-math.log(max(1-prob - 0.125, 0.01), 10) * 20 * hops, 2)) + " sats | "
        hop_ln += str(round((prob * 20 + 0.5) * hops, 2)) + " sats |"
        print(hop_ln)
