Introduce graph sync crate #1155
Conversation
Cool. I assume you're still working on compression for the objects?
Yes, I'm trying to build an encoder that is distinct from gzip, and just RLE, preferably with Huffman on top.
Force-pushed from 5fa9445 to aa71e8c
Force-pushed from f8b6d45 to d7ce662
Codecov Report
@@ Coverage Diff @@
## main #1155 +/- ##
==========================================
+ Coverage 90.93% 90.96% +0.03%
==========================================
Files 77 80 +3
Lines 42394 42978 +584
Branches 42394 42978 +584
==========================================
+ Hits 38552 39096 +544
- Misses 3842 3882 +40
Continue to review full report at Codecov.
Left some initial thoughts. Is the server-side somewhere that you want review on or do you just want high-level stuff here first?
lightning-graph-sync/src/lib.rs
Outdated
{
// make sure there is precisely one leading slash
let canonical_path = format!("/{}", sync_path.trim_start_matches("/"));
println!(
Lets add a logger parameter and re-use our logging stuff?
This doesn't even need a print, I had just used it for debugging, so I removed it instead.
lightning-graph-sync/src/lib.rs
Outdated
sync_host: &str,
sync_port: u16,
sync_path: &str,
chain_source: &Option<CS>,
Lets just drop the chain_source, we dont have signatures so its not like we can validate anything.
Done
let synthetic_update = UnsignedChannelUpdate {
chain_hash: Default::default(),
short_channel_id,
timestamp: current_timestamp,
Hmmmm, I think we need to use the update's real timestamp as otherwise we may get an update from an HTLC failure or otherwise and reject it cause the timestamp is "old".
I will get back to this one in a separate commit.
Isn't the timestamp only tangentially related to the time of the actual update? I thought nodes had some logic to update the timestamp of a particular update when gossiping anyway – or is the timestamp part of the signed data when a signature is present?
Additionally, if the timestamp inaccuracy has implications on HTLC handling, wouldn't that mean that gossip data is slightly more than just a DDoS risk vector?
lightning-graph-sync/README.md
Outdated
Specifically, our custom channel flags break down like this:

| 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
Is there some significance in the spacing difference in the columns for 8 and 4?
no, my IDE just decided to add it. I'll remove.
Force-pushed from c429f9c to 81616c2
}
}
};
self.update_channel_unsigned(&channel_update)
This may need to be atomic with the earlier read. What would happen in the following scenario?

1. Form an UnsignedChannelUpdate from the incremental update using the read-locked NetworkGraph above.
2. Apply a ChannelUpdate from a failed payment.
3. Apply the earlier-formed UnsignedChannelUpdate.

Seems the incremental update would override one we just received from a failed payment.
It shouldn't be a problem, because all the rapid sync updates are backdated a week, so any "true" update from direct gossip should result in this update being discarded.
};

let announcement_result = network_graph
.update_channel_from_unsigned_announcement(&synthetic_announcement, &chain_source);
I'm not really a fan of this approach of generating a synthetic update and then feeding it in, at least not if the update has some fields set to bogus values. It seems like quite the layer violation for the lightning-graph-sync crate to have code that relies on certain very specific behavior in the NetworkGraph (that it ignores certain fields). I'd much prefer to have a method on the NetworkGraph that says "add channel with minimal field set of X" and use that.
> I'd much prefer to have a method on the NetworkGraph that says "add channel with minimal field set of X" and use that.

Isn't that a bit of a layer violation, too? Taking a concept from a higher layer (i.e., the incremental update) and defining it in a lower layer?
Hmm, doesn't feel like one? Its not crazy for NetworkGraph to have a method that just says "manually add a new channel to the network graph, with these parameters". Its not really specific to this kind of sync mechanism, and could be used for several different sync mechanisms.
True. I had been thinking more about gossip as the layer.
Right, we kinda have two layers: the gossip message handler, and the underlying NetworkGraph. I agree it'd be strange to put it in the gossip message handler, but we can put some of these update methods on the NetworkGraph itself.
I'll think about possible approaches to this in the next few commits.
I think I would like to continue using synthetic channel updates considering their fields are either entirely specified, or inferred from the directional info, but point definitely taken wrt the announcements. I tried an approach in the latest commit, let me know what you think.
In whatever release we include this in, we'll need to fix the network graph data timeout logic to somehow not run until after graph sync completes on startup. I'm not sure how to do this, but it has to be in the same release as this (but can be a separate PR). Maybe open an issue to track it?
};

let chain_source: Option<Box<dyn lightning::chain::Access>> = None;
let current_timestamp = SystemTime::now()
We're gonna get this from the update file itself, right?
Yeah, I was actually debating as to whether to include the timestamp in the update data. I'll modify the serialization accordingly.
Will update and reinitialize the server to have received timestamp fields tonight, and then regenerate test vectors.
Has this been addressed?
No.
Hopefully is now, though I wanna refrain from pushing until my fuzz test's working.
.duration_since(UNIX_EPOCH)
.unwrap()
.as_secs() as u32;
let regressed_timestamp = current_timestamp - 7 * 24 * 3600; // back-date updates by one week
Could you make a constant and document why backdating is used?
done
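For illustration, the requested constant and its rationale might look roughly like this. This is a sketch only: the names `BACKDATE_INTERVAL_SECS` and `backdated_timestamp` are placeholders (a later diff in this conversation shows the PR's constant is actually called `TIMESTAMP_RETROGRESSION_DELTA`).

```rust
// Back-date all rapid-sync timestamps by one week so that any "real"
// update received via direct gossip (e.g. surfaced by an HTLC failure)
// carries a newer timestamp and therefore wins over the synthetic one.
const BACKDATE_INTERVAL_SECS: u32 = 7 * 24 * 3600; // one week

fn backdated_timestamp(current_timestamp: u32) -> u32 {
    // Saturating so a nonsensical clock near the epoch can't underflow.
    current_timestamp.saturating_sub(BACKDATE_INTERVAL_SECS)
}

fn main() {
    assert_eq!(BACKDATE_INTERVAL_SECS, 604_800);
    assert_eq!(backdated_timestamp(1_600_000_000), 1_600_000_000 - 604_800);
    assert_eq!(backdated_timestamp(100), 0);
}
```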
Force-pushed from 1416dbf to 4580139
/// `timestamp: 64`: Timestamp the announcement was originally received.
///
/// All other parameters as used in `UnsignedChannelAnnouncement` fields.
pub fn update_channel_from_announcement_fragments(&self, short_channel_id: u64, timestamp: u64, features: ChannelFeatures, node_id_1: PublicKey, node_id_2: PublicKey) -> Result<(), LightningError>{
Lets just call this add_channel or something? This name is much more specific than it need be.
I came up with a different name, but add_channel is definitely not clear given the context of when this function should be called.
let node_1 = NodeId::from_pubkey(&node_id_1);
let node_2 = NodeId::from_pubkey(&node_id_2);
let chan_info = ChannelInfo {
features: features.clone(),
Why is this cloned?
decloned
};

let mut add_channel_to_node = |node_id: NodeId| {
match nodes.entry(node_id) {
Can you DRY this with the below function?
done, I think
let announcement_count: u32 = Readable::read(read_cursor)?;
for _ in 0..announcement_count {
let features = Readable::read(read_cursor)?;
let short_channel_id = Readable::read(read_cursor)?;
Can we sort by SCID and just include a variable-length-int difference from the previous one? That should cut the size down a lot, no?
done, see #1155 (comment)
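The scheme agreed on here — sort the SCIDs and store each one as a difference from its predecessor — can be sketched like this. Note this is an illustration, not the crate's API: `delta_encode`/`delta_decode` are hypothetical helpers, and for simplicity the deltas stay as `u64`s rather than being serialized as BigSize varints as in the PR.

```rust
// Delta-encode a list of short channel IDs (SCIDs). Sorted SCIDs have
// small gaps, so writing the gap instead of the full 8-byte SCID lets a
// varint encoding (like LDK's BigSize) use 1-3 bytes for most entries.
fn delta_encode(mut scids: Vec<u64>) -> Vec<u64> {
    scids.sort_unstable();
    let mut previous = 0u64;
    scids
        .iter()
        .map(|&scid| {
            let delta = scid - previous;
            previous = scid;
            delta
        })
        .collect()
}

// Invert the encoding by accumulating the deltas back into SCIDs.
fn delta_decode(deltas: &[u64]) -> Vec<u64> {
    let mut previous = 0u64;
    deltas
        .iter()
        .map(|&delta| {
            previous += delta;
            previous
        })
        .collect()
}

fn main() {
    let scids = vec![700_000_000_000_000_016, 700_000_000_000_000_001, 700_000_000_000_000_009];
    let deltas = delta_encode(scids.clone());
    // After the first entry, the deltas are tiny and varint-friendly.
    assert_eq!(deltas[1..], [8, 7]);
    let mut sorted = scids;
    sorted.sort_unstable();
    assert_eq!(delta_decode(&deltas), sorted);
}
```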
for _ in 0..announcement_count {
let features = Readable::read(read_cursor)?;
let short_channel_id = Readable::read(read_cursor)?;
let node_id_1 = Readable::read(read_cursor)?;
We should probably do something to de-duplicate node_ids, no? Like maybe just do a list of node_ids to start and then put the index in that list here instead of the full node_id?
done, passing index using BigSize. I also explored sorting the pubkeys, but the differences are too big to be useful.
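The index-based node_id de-duplication described above can be sketched as follows. This is illustrative only: the function name and return types are made up for the example, and the real implementation writes the indices as BigSize varints rather than `usize` values.

```rust
use std::collections::HashMap;

// De-duplicate node IDs: emit each 33-byte pubkey once in a table, then
// reference each channel's endpoints by index into that table. The
// indices can then be written compactly (e.g. as BigSize varints).
fn build_node_table(
    channels: &[([u8; 33], [u8; 33])],
) -> (Vec<[u8; 33]>, Vec<(usize, usize)>) {
    let mut table: Vec<[u8; 33]> = Vec::new();
    let mut index_of: HashMap<[u8; 33], usize> = HashMap::new();
    let mut references = Vec::new();
    for &(node_a, node_b) in channels {
        let idx_a = *index_of.entry(node_a).or_insert_with(|| {
            table.push(node_a);
            table.len() - 1
        });
        let idx_b = *index_of.entry(node_b).or_insert_with(|| {
            table.push(node_b);
            table.len() - 1
        });
        references.push((idx_a, idx_b));
    }
    (table, references)
}

fn main() {
    let a = [1u8; 33];
    let b = [2u8; 33];
    let c = [3u8; 33];
    let channels = [(a, b), (b, c), (a, c)];
    let (table, refs) = build_node_table(&channels);
    // Three unique nodes across six endpoints; each pubkey stored once.
    assert_eq!(table.len(), 3);
    assert_eq!(refs, vec![(0, 1), (1, 2), (0, 2)]);
}
```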
for _current_update_index in 0..update_count {
// println!("update index: {}", _current_update_index);

let short_channel_id: u64 = Readable::read(read_cursor)?;
Same here, maybe sort them and put a difference instead of the full one?
done, using BigSize
let fee_base_msat: u32 = Readable::read(&mut read_cursor)?;
let fee_proportional_millionths: u32 = Readable::read(&mut read_cursor)?;

let has_htlc_maximum_msat: u8 = Readable::read(read_cursor)?;
Maybe set a flag value instead? ie MAX_UINT64 means "no htlc-max", and the server-side can cap it down to 21 million BTC without loss.
I'm not sure this saves space, because if there is no HTLC maximum, sending a u8 takes up less space than a u64.
we could do something interesting with BigSize?
The sizes are equivalent when the number of present HTLCs == 7 * absent HTLCs. If there are more present HTLCs than absent ones, using the u64::MAX-flag saves space, and otherwise, using the u8-prefix saves space. We could add a global u8 flag at the beginning specifying which encoding is used in this transmission, and set it depending on the relation of the set/unset HTLC maxima.
My point was based on the assumption that the vast majority of full updates would have a htlc max, but, more importantly, see #1155 (comment)
Based on testing, it does indeed appear that the 7:1 ratio of set/updated HTLC maxima vs removed HTLC maxima is preserved, so for now we'll just always use u64::MAX
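The sentinel approach settled on here can be sketched as follows. The names are placeholders for illustration, not the crate's actual serialization code; the design point is simply that u64::MAX marks "no maximum" and the server caps real values below it.

```rust
// Encode an optional htlc_maximum_msat as a plain u64, using u64::MAX
// as a sentinel for "no maximum". Because the vast majority of real
// updates do set a maximum (roughly 7:1 per the measurements above),
// this beats a one-byte presence prefix followed by an optional u64.
const NO_HTLC_MAXIMUM: u64 = u64::MAX;

fn encode_htlc_maximum(max_msat: Option<u64>) -> u64 {
    // The server can cap real values below the sentinel without loss;
    // 21 million BTC in msat is far below u64::MAX.
    max_msat
        .map(|m| m.min(NO_HTLC_MAXIMUM - 1))
        .unwrap_or(NO_HTLC_MAXIMUM)
}

fn decode_htlc_maximum(encoded: u64) -> Option<u64> {
    if encoded == NO_HTLC_MAXIMUM { None } else { Some(encoded) }
}

fn main() {
    assert_eq!(decode_htlc_maximum(encode_htlc_maximum(Some(50_000_000))), Some(50_000_000));
    assert_eq!(decode_htlc_maximum(encode_htlc_maximum(None)), None);
}
```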
Force-pushed from 2f1a7f8 to 625a8c5
let channel = read_only_network_graph.channels().get(&short_channel_id);
let channel = if let Some(channel) = channel {
channel
} else {
return Err(LightningError {
err: "Couldn't find channel for update".to_owned(),
action: ErrorAction::IgnoreError,
}
.into());
};
FWIW, wherever you have an `Option` and need to return an error for `None` you can simplify with `ok_or_else` and `?`, e.g.:

let channel = read_only_network_graph
	.channels()
	.get(&short_channel_id)
	.ok_or_else(|| LightningError {
		err: "Couldn't find channel for update".to_owned(),
		action: ErrorAction::IgnoreError,
	})?;
return Ok(());
}
let non_incremental_update_presence_flag: u8 = Readable::read(read_cursor)?;
let has_non_incremental_updates = (non_incremental_update_presence_flag == 1);
Seeing a build warning here from usage of parentheses.
I did fix it, but I'm low key annoyed by this. I find with = and ==, parentheses really improve legibility.
Feel free to squash and rebase on main along with removing commented code, etc.
channel
	.get_directional_info(direction)
	.ok_or(LightningError {
		err: "Couldn't find previous directional data for update".to_owned(),
		action: ErrorAction::IgnoreError,
	})?;
nit: move `channel` to the previous line and reduce the indent on the other lines.
///
/// In the future, we will probably start delaying the purging of unsigned updates to their
/// four-week-marks, or perhaps simply trust a provided server-timestamp that marks
/// data receipt, as opposed to originally signed value. TODO: TBD
Remove or complete TODO?
Oof, yea, that's a tough one, too - in the current design the server has to send every SCID of channels in the current graph in each update or users will just have channels that get more and more out of date and then eventually remove them. I think the only realistic solution is to track where each update came from and only time it out based on that info, sadly, with some ability for users to "convert" a graph from "server-sync" to "p2p-sync"
I'm not sure that needs to be solved here, but we absolutely cannot ship this PR as-is without that followup, so would be nice to see it at least in a draft PR IMO.
Yes, I'll try to have one up with the fuzzing for this one shortly.
I think we can remove this TODO now?
};

let chain_source: Option<Box<dyn lightning::chain::Access>> = None;
let current_timestamp = SystemTime::now()
Has this been addressed?
let retrogressed_timestamp = current_timestamp - TIMESTAMP_RETROGRESSION_DELTA; // back-date updates by one week

// shadow variable for ownership release after length-restricted reads
let mut read_cursor = read_cursor;
Not sure I understand why this is needed.
oh good point, addressed
Was this addressed in an un-pushed commit?
let read_only_network_graph = network_graph.read_only();
let channel = read_only_network_graph
	.channels()
	.get(&short_channel_id)
If we have performance concerns with the BTreeMap, one possibility would be to traverse the (sorted) channel updates in parallel with the tree rather than performing look-ups for each update. That would be quite an involved change though.
Force-pushed from 2a2c0d4 to be1b49d
let directional_info =
	channel
		.get_directional_info(channel_flags)
		.ok_or(LightningError {
			err: "Couldn't find previous directional data for update".to_owned(),
			action: ErrorAction::IgnoreError,
		})?;
Repeating previous lost comment: move `channel` up one line and reduce the indent of the following lines.
rustfmt seems to strongly disagree with that FYI
lol... rustfmt may disagree with itself given it did the opposite a few lines up
rustfmt at it again lol
oh ffs, this is gonna keep happening
Force-pushed from be1b49d to 29e7bfa
Sure, no problem.
Force-pushed from 9ff6fe9 to cf14708
I'm happy with this now. Could you squash?
Certainly!
nit: Please restrict git commit messages to 80-90 chars long. The title should be one line of max 80-90 chars. I generally keep mine even shorter but others let it go a bit longer.
Same. I use 50 characters and vim looks angry if I go over.
Some minor comments here and there on tests, and a few doc comments, but after this I'm good to go here.
lightning-graph-sync/Cargo.toml
Outdated
repository = "https://github.com/lightningdevkit/rust-lightning"
edition = "2018"
description = """
Utility to process gossip routing data from LNSync-like server (protocol name TBD)
Need to settle on the name before we land this. Maybe we just link to the server repo (which we should migrate to the lightningdevkit org IMO).
I expect there are going to be multiple iterations here. My proposal is lightning-rapid-gossip-sync or lightning-rapid-gossip-sync-client, though I think client is somewhat implied given what repository it's in.
lightning-graph-sync/README.md
Outdated
## Mechanism

The (presumed) server sends a compressed gossip response containing gossip data. The gossip data is
We should link to the server repo (which I think we should go ahead and move into the lightningdevkit org)
good point
lightning-graph-sync/README.md
Outdated
sync formats arise in the future
- The fourth byte is the protocol version in case our format gets updated
2. Chain hash (32 bytes)
3. Latest seen timestamp (`u32`)
nit: Lets move to generation time? That way we have a bit more determinism for the timestamps in the requests (ie we could fix them to be round numbers). Not a big deal either way, though, as they're already fixed.
Different data sets generated at the same time can have different latest seen timestamps. I think keeping it as is does not unnecessarily expand the attack surface, so I'd rather keep it.
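The header layout quoted from the README (prefix bytes plus a version byte, a 32-byte chain hash, and a u32 latest-seen timestamp) could be parsed roughly as below. A sketch only: the magic bytes, the big-endian byte order, and `parse_header` itself are assumptions for illustration, not the crate's actual API.

```rust
use std::io::{Cursor, Read};

// Parse the snapshot header sketched in the README: a 3-byte prefix
// plus a 1-byte protocol version, then a 32-byte chain hash, then a
// u32 latest-seen timestamp. The prefix bytes below are placeholders.
fn parse_header(data: &[u8]) -> std::io::Result<([u8; 3], u8, [u8; 32], u32)> {
    let mut cursor = Cursor::new(data);
    let mut prefix = [0u8; 3];
    cursor.read_exact(&mut prefix)?;
    let mut version = [0u8; 1];
    cursor.read_exact(&mut version)?;
    let mut chain_hash = [0u8; 32];
    cursor.read_exact(&mut chain_hash)?;
    let mut timestamp_bytes = [0u8; 4];
    cursor.read_exact(&mut timestamp_bytes)?;
    // Byte order is an assumption here; lightning wire data is big-endian.
    Ok((prefix, version[0], chain_hash, u32::from_be_bytes(timestamp_bytes)))
}

fn main() {
    let mut data = vec![0x4c, 0x44, 0x4b, 1]; // placeholder prefix + version 1
    data.extend_from_slice(&[0u8; 32]); // chain hash
    data.extend_from_slice(&1_000_000u32.to_be_bytes()); // latest-seen timestamp
    let (_prefix, version, _chain_hash, timestamp) = parse_header(&data).unwrap();
    assert_eq!(version, 1);
    assert_eq!(timestamp, 1_000_000);
}
```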
lightning-graph-sync/README.md
Outdated
## Delta Calculation

The way a server is meant to calculate this rapid gossip sync data is by using two data points as a
reference that are meant to be provided by the client:
We only have one reference now, right? In general, does this section need to be rewritten to talk about things from a snapshotting perspective?
lightning-graph-sync/README.md
Outdated
## Performance

Given the primary purpose of this utility is a faster graph sync, we thought it might be helpful to
This section needs a "as of date X with network graph of size Y"
lightning-graph-sync/src/lib.rs
Outdated
assert!(after.contains("channels: [783241506229452801]"));
}

#[ignore]
Lets not wholesale #[ignore] but rather only ignore failures to find the specific file (and then reuse the --cfg=require_route_graph_test configure flag we use to force the test to fail in CI if the file is missing).
lightning-graph-sync/src/lib.rs
Outdated
);
if let Err(crate::error::GraphSyncError::IOError(io_error)) = &sync_result {
let error_string = io_error.to_string();
if error_string.to_lowercase().contains("no such file or directory") {
nit: Why not just always panic with the error message and then include the io error in the message?
network_graph.update_channel_unsigned(&synthetic_update)?;
}

let input_termination_check: Result<u8, DecodeError> = Readable::read(read_cursor);
Do we want to allow for basic extensibility by just not checking for the end and letting there be excess stuff we ignore?
Not in this version, no. I think for now it's safer to make sure the entire response can be parsed correctly end to end.
let network_graph = NetworkGraph::new(block_hash);

let before = network_graph.to_string();
assert_eq!(before.len(), 31);
Lets not assert that the print encoding is a specific format here (and in the other tests in this file). Asserting that it includes certain pubkeys seems fine, but not a specific character length or that pubkeys are printed with a : afterwards.
@@ -1132,6 +1142,81 @@ impl NetworkGraph {
self.update_channel_from_unsigned_announcement_intern(msg, None, chain_access)
}

/// Update channel from partial announcement data received via rapid gossip
"rapid gossip" seems like a fine name but lets make sure we're consistent everywhere in what we call this.
Force-pushed from bbb7e5a to ef712e3
[dependencies]
lightning = { version = "0.0.106", path = "../lightning" }
bitcoin = { version = "0.28.1", default-features = false, features = ["secp-recovery"] }
nit: I dont think we (directly) need secp-recovery?
implicitly extracted from the channel announcements and updates.

The data is then applied to the current network graph, artificially dated to the timestamp of the
latest seen message, be it an announcement or an update, from the server's perspective. The network
minus a week, right?
## Delta Calculation

The way a server is meant to calculate this rapid gossip sync data is by using the latest timestamp
an change, be it either an announcement or an update, was seen. That timestamp is included in each
`an change`?
## Delta Calculation

The way a server is meant to calculate this rapid gossip sync data is by using the latest timestamp
It's really confusing to just say "latest timestamp" here - we mean the time we saw the latest update, not the timestamp in the update, which is what I'd expect this to refer to.
The way a server is meant to calculate this rapid gossip sync data is by using the latest timestamp
an change, be it either an announcement or an update, was seen. That timestamp is included in each
rapid sync message, so all the client needs to do is cache one variable.
no idea what you mean? Clients have to store the full graph, not just one variable?
//! The reason the rapid sync server requires a modicum of trust is that it could provide bogus
//! data, though at worst, all that would result in is a fake network topology, which wouldn't
//! enable the server to steal or siphon off funds. It could, however, reduce the client's privacy
//! by increasing the likelihood payments will are routed via channels the server controls.
s/increasing the likelihood payments will are/forcing all payments to be/
//! privately calculate routes for payments, and do so much faster and earlier than requiring a full
//! peer-to-peer gossip sync to complete.
//!
//! The reason the rapid sync server requires a modicum of trust is that it could provide bogus
s/a modicum of//
//! (not least due to announcements not including any timestamps at all, but only a block height)
//! but rather, it's a timestamp of when the server saw a particular message.
//!
//! If a particular channel update had never occurred before, the full update is sent. If a channel
This paragraph seems to be missing a lot of context to me, and I think we can just drop it and the next one, we don't need that much detail in the crate docs - just add the word "differential" in the second paragraph and I think we're good.
///
/// `sync_path`: Path to the file where the gossip update data is located
///
pub fn sync_network_graph_with_file_path(
Here and in `processing.rs` - we need to return the timestamp number that the user needs to store so they can use that to query for the next version.
//!
//! In practice, snapshots of server data will be calculated at regular intervals, such that deltas
//! are cached and needn't be calculated on the fly. All the client needs to provide to the server
//! is the timestamp included in the previously received (and hence snapshotted) delta.
There needs to be docs for how to use this, too - right now it's just a bag of functions that says "pass in a file or something you get from a server, but you have to query the server with some number that's not exposed to you". We should define a URI scheme (I'd suggest, like, https://graphsync.lightningdevkit.org/v1/$LAST_SYNC_TIMESTAMP.bin) and exact semantics for how to query.

There are docs in the server.

Then the client needs to directly link to the server. Also, if we intend to run a public server that anyone can use, we should have docs in the client that tell people how to use it - just having a "how to run your own server" guide doesn't help clients who shouldn't ever need to look at the server code (if they don't want to) :)

Well, the client does link to the server, which does have comments in the readme showing the request paths. I'm happy to copy the request paths section here, but given that every server instance is gonna use a different API prefix, not to mention domain, I don't know what else to say here.

I think, given it's only about 5 LoC, it'd be nice to have a full ``` comment here describing how you can sync the graph from graphsync.lightningdevkit.org. Given this is actually really simple to use, we should show people reading the docs here just how simple it is to use :).

Once we have the example service up, sure. Should we just say that that's gonna be an example URL in the future?

I'd hope we can get an example service up pretty quick - we should be able to get one up in a matter of minutes, no? Let's just commit to doing it :)

No, I can't commit to responding to all the server comments within minutes; some are a bit painful to address.

lol nah, I meant we could take the server as-is and run it within five minutes to get an example service up, and improving it a bit over time is just gravy :)
Force-pushed 44190ed to efed7d3
Compare
pub fn sync_network_graph_with_file_path(
	network_graph: &network_graph::NetworkGraph,
	sync_path: &str,
) -> Result<(u32), GraphSyncError> {
Why is the return value in a single-element tuple?
@@ -57,7 +75,7 @@ pub mod processing;
 pub fn sync_network_graph_with_file_path(
 	network_graph: &network_graph::NetworkGraph,
 	sync_path: &str,
-) -> Result<(u32), GraphSyncError> {
+) -> Result<u32, GraphSyncError> {
Document return type here, too.
//! An example server request could look as simple as the following:
//!
//! ```shell
//! curl -o rapid_sync.lngossip https://rapidsync.lightningdevkit.org/snapshot/<last_sync_timestamp>
I assume this should be 0 the first time? We should say that.
//! use lightning::routing::network_graph::NetworkGraph;
//!
//! let block_hash = genesis_block(Network::Bitcoin).header.block_hash();
//! let network_graph = NetworkGraph::new(block_hash);
Can we say something about reading from disk?
LGTM! A few rather trivial notes which can just be addressed when squashing IMO.
let read_only_network_graph = network_graph.read_only();
let channel = read_only_network_graph
	.channels()
	.get(&short_channel_id)
Not worth doing it here, but as a followup we should look into some kind of mutator accessor for the network graph, then we'd avoid the double-lookup here.
LGTM. I'm ready for squash, @jkczyz?
Its use is for fast-forwarding through gossip data downloaded from a server.
Currently still in draft status, pending better code organization and tests.