Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse v2 Rapid Gossip Sync #3098

Merged
merged 2 commits into from
Jun 13, 2024
Merged

Conversation

arik-so
Copy link
Contributor

@arik-so arik-so commented Jun 5, 2024

We recently introduced v2 snapshot serialization for the Rapid Gossip Sync Server, which includes socket addresses and node features.

This PR extends the parsing mechanism to handle that 2nd version, as well as handling various instances of supplemental data that is not currently being sent by the server, but would allow the inclusion of new data without necessitating a snapshot version increase.

let node_detail_flag = pubkey_bytes[0];
let has_address_details = (node_detail_flag & (1 << 2)) > 0;
let feature_detail_marker = (node_detail_flag & (0b111 << 3)) >> 3;
let has_additional_data = (node_detail_flag & (1 << 7)) > 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we go ahead and define something for bit 6? Maybe like bits 6 and 7 are: 0b01 -> 1 byte length, 0b10 -> 2 byte length, 0b11 -> 4 byte length?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the number specifying the additional bytes to be read afterwards? Seems like a neat idea.

Another consideration I think is possibly allowing the second bit, which currently has to always be set for the key parity, to not be set to specify an x-only pubkey. However, we don't really have a notion of v1.5 gossip yet, so it may be premature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vecs automatically have the CompactSize prefix which can be either u16 or u64, and is what's currently used in the additional_data read. Do you think a more granular breakdown using the 6th bit is a worthwhile improvement?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another consideration I think is possibly allowing the second bit, which currently has to always be set for the key parity, to not be set to specify an x-only pubkey. However, we don't really have a notion of v1.5 gossip yet, so it may be premature.

Yea, I mean we wouldn't be able to decode the whole thing anyway so it seems like we'd be defining something partially which we wouldn't be able to use anyway?

Vecs automatically have the CompactSize prefix which can be either u16 or u64, and is what's currently used in the additional_data read. Do you think a more granular breakdown using the 6th bit is a worthwhile improvement?

Ah, I missed that. Saving the one byte isn't worth much, indeed, though if we don't have a different use for it we might as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's no better idea for how to use bit 6, I'll convert it then. You were right btw, not a single instance of a complete address removal.

@codecov-commenter
Copy link

codecov-commenter commented Jun 6, 2024

Codecov Report

Attention: Patch coverage is 85.20000% with 37 lines in your changes missing coverage. Please review.

Project coverage is 91.31%. Comparing base (2701bc5) to head (ab0c35f).
Report is 83 commits behind head on main.

Files Patch % Lines
lightning-rapid-gossip-sync/src/processing.rs 86.17% 15 Missing and 19 partials ⚠️
lightning-rapid-gossip-sync/src/lib.rs 25.00% 3 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3098      +/-   ##
==========================================
+ Coverage   89.88%   91.31%   +1.43%     
==========================================
  Files         119      121       +2     
  Lines       97551   108652   +11101     
  Branches    97551   108652   +11101     
==========================================
+ Hits        87681    99215   +11534     
+ Misses       7304     6936     -368     
+ Partials     2566     2501      -65     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@arik-so arik-so force-pushed the arik/gossip-v2-parsing branch from 751434b to 6befdff Compare June 6, 2024 05:35
@arik-so arik-so marked this pull request as ready for review June 10, 2024 10:20
Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically LGTM but we really should have a fuzzer on the RGS applicator.

let additional_data_marker = (node_detail_flag & (0b11 << 6)) >> 6;
let key_parity = node_detail_flag & 0b_0000_0011;
pubkey_bytes[0] = key_parity;
let current_pubkey = PublicKey::from_slice(&pubkey_bytes).unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be unwraping things sent by a server...

if feature_detail_marker > 0 {
if feature_detail_marker < 7 {
let feature_index = (feature_detail_marker - 1) as usize;
synthetic_node_announcement.features = default_node_features[feature_index].clone();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, please be careful about panics

@arik-so
Copy link
Contributor Author

arik-so commented Jun 10, 2024

We do though. It's astounding it hasn't caught it.

@TheBlueMatt
Copy link
Collaborator

Did you give it CPU time on this PR?

@arik-so arik-so force-pushed the arik/gossip-v2-parsing branch from dec1d25 to 8710ef8 Compare June 11, 2024 20:00
Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good mod some comments.

I think it would immensely improve readability and reviewability if more (preferably all) of these magic number and bitmasks could live in adequately named consts (that possibly even could shared between server and client).

return Err(DecodeError::UnknownVersion.into());
}

read_cursor.read_exact(&mut version_prefix)?;
let version = version_prefix[0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know these are currently covered by read_exact erroring, but should we still future-proof theme a bit more by making this:

Suggested change
let version = version_prefix[0];
let version = version_prefix.first().ok_or(DecodeError::ShortRead.into())?;

.. and same below for other uses of []?

@@ -88,17 +97,119 @@ impl<NG: Deref<Target=NetworkGraph<L>>, L: Deref> RapidGossipSync<NG, L> where L
// backdate the applied timestamp by a week
let backdated_timestamp = latest_seen_timestamp.saturating_sub(24 * 3600 * 7);

let mut default_node_features = Vec::new();
if parse_node_details {
let default_feature_count: u8 = Readable::read(read_cursor)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be worth introducing (more) shared types (and serialization) between the server- and client implementations, e.g. for this particular Vec variant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that can be done in a followup, though, in concert with a corresponding server PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, mind opening an issue so it doesn't get lost?

let node_detail_flag = pubkey_bytes[0];
let has_address_details = (node_detail_flag & (1 << 2)) > 0;
let feature_detail_marker = (node_detail_flag & (0b111 << 3)) >> 3;
// println!("feature detail marker: {}", feature_detail_marker);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Remove commented-out code

let mut pubkey_bytes = [0u8; 33];
read_cursor.read_exact(&mut pubkey_bytes)?;
let node_detail_flag = pubkey_bytes[0];
let has_address_details = (node_detail_flag & (1 << 2)) > 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we start introduce constants for these bitmasks, similar to how we do it in features.rs? They would def. improve readability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% sure how to do it for multibit masks with offsets.

@TheBlueMatt
Copy link
Collaborator

Did you run the fuzzer for a while with this change?

@arik-so arik-so force-pushed the arik/gossip-v2-parsing branch from c8b2cfb to 1d2991d Compare June 12, 2024 14:39
@arik-so
Copy link
Contributor Author

arik-so commented Jun 12, 2024

Yes, I've been running the fuzzer for a while and am still running it. No panics so far.

let _additional_data: Vec<u8> = Readable::read(read_cursor)?;
}

if let Some(modification) = node_modifications.remove(&node_id_1) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these only applied if we have an announcement for a channel the node is in? If there's no reason it also can be a vec instead of a hashmap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid LightningError(LightningError { err: "No existing channels for node_announcement", action: IgnoreError })

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't answer the question? We can also swallow that error. This seems like a major bug - we might have a node that gets updated without it announcing a new channel and then we'll not set new address info?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I could do instead is move them to a Vec and apply all elements after processing channel updates, if you prefer? It could increase the iteration count by might be easier.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you disagree that not doing so would be a major bug?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've already converted it to applying all node updates sequentially after processing the channel announcements and added unit test coverage for that. I could add an optimization to check which ones we had channels for a priori to reduce the number of iterations, but I don't think that's worth it.

@TheBlueMatt
Copy link
Collaborator

Feel free to squash the fixups, IMO.

@arik-so arik-so force-pushed the arik/gossip-v2-parsing branch from 33d7367 to a7e8a83 Compare June 12, 2024 19:16
Comment on lines 146 to 164
let mut synthetic_node_announcement = UnsignedNodeAnnouncement {
features: NodeFeatures::empty(),
timestamp: backdated_timestamp,
node_id: current_node_id,
rgb: [0, 0, 0],
alias: NodeAlias([0u8; 32]),
addresses: Vec::new(),
excess_address_data: Vec::new(),
excess_data: Vec::new(),
};

read_only_network_graph.nodes()
.get(&current_node_id)
.and_then(|node| node.announcement_info.as_ref())
.map(|info| {
synthetic_node_announcement.features = info.features().clone();
synthetic_node_announcement.rgb = info.rgb().clone();
synthetic_node_announcement.alias = info.alias().clone();
synthetic_node_announcement.addresses = info.addresses().clone();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we skip the network graph lookup/addresses/features clones/etc if we don't have feature or address details? When we're just writing a node_id for the sake of using it later in channel details this is gonna slow us down a lot. Dunno if this may cause address info to expire tho?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we want to ensure retention, we can either send reminders from the server with identical features, or we could dedicate the 6th bit flag to preserving the node information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any preferences regarding repurposing the 6th bit, @TheBlueMatt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in that case, I'll convert it back to just automatically parsing a Vec given the CompactSize representation we have built in anyway, and switch to using the 6th bit for address- and feature-less reminders. I think that scenario is much more likely to occur.

Comment on lines 173 to 177
let mut address_bytes = Vec::new();
address_reader.read_to_end(&mut address_bytes)?;

let mut address_cursor = io::Cursor::new(&address_bytes);
if let Ok(current_address) = Readable::read(&mut address_cursor) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we copying the address bytes from the stream into a new vec just to turn it back into a stream to read from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're sending the address length to ensure forwards compatibility, i.e. the ability to ignore address types that the client doesn't know yet. That means we must first extract the bytes, and then try to read them. Lmk if there's a better way to do that, I don't love it

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FixedLengthReader::eat_remaining()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome, thanks, done!

@arik-so arik-so force-pushed the arik/gossip-v2-parsing branch from 6c69b9d to 6adcf97 Compare June 12, 2024 20:07
let mut node_modifications: Vec<UnsignedNodeAnnouncement> = Vec::new();

if parse_node_details {
let read_only_network_graph = network_graph.read_only();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that I'm grabbing a lock for the duration of parsing all the node details. Lmk if you'd rather I keep grabbing and dropping it for each applicable node detail, @TheBlueMatt

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's fine. It should be fast, right? What's our current bench show for applying RGS data?

Copy link
Contributor Author

@arik-so arik-so Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the latest v2 snapshot:

[225.14 ms 227.24 ms 229.51 ms]

This one has 8,000+ node modifications.

@arik-so
Copy link
Contributor Author

arik-so commented Jun 12, 2024

lmk if I should resquash, @TheBlueMatt

@TheBlueMatt
Copy link
Collaborator

Feel free, I think @tnull is asleep so might as well :)

@arik-so arik-so force-pushed the arik/gossip-v2-parsing branch 2 times, most recently from 2037df7 to acd92cf Compare June 12, 2024 23:33
Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 183 to 188
log_gossip!(
self.logger,
"Failure to parse address at index {} for node ID {}",
address_index, current_node_id
);
address_reader.eat_remaining()?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your indentation is off.

@arik-so arik-so force-pushed the arik/gossip-v2-parsing branch from acd92cf to ab0c35f Compare June 13, 2024 01:35
Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I'm going ahead merging this as only indentation changes were added since Matt's ACK above.

@@ -88,17 +97,119 @@ impl<NG: Deref<Target=NetworkGraph<L>>, L: Deref> RapidGossipSync<NG, L> where L
// backdate the applied timestamp by a week
let backdated_timestamp = latest_seen_timestamp.saturating_sub(24 * 3600 * 7);

let mut default_node_features = Vec::new();
if parse_node_details {
let default_feature_count: u8 = Readable::read(read_cursor)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, mind opening an issue so it doesn't get lost?

@tnull tnull merged commit cc9fc65 into lightningdevkit:main Jun 13, 2024
13 of 16 checks passed
@TheBlueMatt TheBlueMatt changed the title Parse v2 gossip Parse v2 Rapid Gossip Sync Jun 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants