Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for P2P blocklists #330

Merged
merged 7 commits into from
Feb 25, 2025
Merged

Add support for P2P blocklists #330

merged 7 commits into from
Feb 25, 2025

Conversation

xobust
Copy link
Contributor

@xobust xobust commented Feb 20, 2025

This PR adds basic support for P2P plaintext blacklists .
A list is downloaded at startup and stored in a interval tree. When an incoming connection is made the IP will be checked and connection aborted before handshake.
Not sure how useful these lists are anymore but I wanted to learn some rust and contribute a feature.

Any feedback welcome :)

Todo:

  • Block outgoing connections
  • Mark peer as blocked

Future work:
Convert IP ranges to CIDR notation and use a prefix trie
Save and load blocklists from disk and periodically pull blocklist url

Add an implementation of p2p plaintext (and gz compressed) blocklists.
The list can be read from an url or from a file.
All the IP ranges are then stored in interval trees.
Block incoming peers from blocked ips.
Copy link
Owner

@ikatson ikatson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, left a few comments

How widely used / popular are these blocklists? My only concern is increased number of dependencies and compilation time for a very rarely used feature

@@ -716,6 +730,11 @@ impl Session {
.read_write_timeout
.unwrap_or_else(|| Duration::from_secs(10));

let incoming_ip = addr.ip();
if self.blocklist.is_blocked(&incoming_ip) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only incoming ips are checked, how about outgoing

}
}

pub async fn load_from_url(url: &str) -> Result<Self> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about treating file:// urls specially?

}

if let Some((start_ip, end_ip)) = parse_ip_range(&line) {
let range = start_ip..(increment_ip(end_ip).unwrap());
Copy link
Owner

@ikatson ikatson Feb 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than panicing (which we should never do in librqbit), I think it's just safe to return "end_ip" as the range bound if it couldn't be incremented.
So that you can just write let range = start_ip..increment_ip(end_ip)

For both ipv4 and ipv6 those ips are either broadcast or reserved, so we shouldn't be connecting to them anyway

Ok(blocklist)
}

pub fn is_blocked(&self, ip: &IpAddr) -> bool {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd pass IpAddr by value. Passing references for such primitive structs just adds a bit of syntax clutter without any benefit.

I.e. you need to write at least 3 more symbols (reference in caller(s), reference in signature, dereference below in "query_point") for no reason.

match ip {
IpAddr::V4(ipv4) => {
let num = u32::from_be_bytes(ipv4.octets());
num.checked_add(1).map(|n| IpAddr::V4(Ipv4Addr::from(n)))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given the above comment you might just use saturating_add and remove Option in result


let is_ipv4 = line.matches('.').count() >= 6;
// Find the split point based on whether it's IPv4 or not
let split_point: usize = if is_ipv4 {
Copy link
Owner

@ikatson ikatson Feb 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this different for ipv4 and ipv6?

why not just

if let Some(rule_name, range) = split_once(':') {
    if let Some(start, end) = range.split_once('-') {
        match (start.parse::<IpAddr>, end.parse::<IpAddr>) {
            // if same kinds, handle, otherwise ignore
        }
    }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While testing against some real world block list i found out that some rules contained colon ()':') in the rule name.
This would cause the rule to fail parsing.
I think we could safely ignore those rules though since they are not well formatted.

_thread: JoinHandle<()>,
}

impl TestServer {
Copy link
Owner

@ikatson ikatson Feb 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about removing HTTP server mock and just test parsing streams?
Just to decrease complexity of this test a bit and remove mockito which wasn't necessary so far?

let reader = BufReader::new(reader);
let mut lines = reader.lines();
let mut ip_ranges: Vec<std::ops::Range<IpAddr>> = Vec::new();
while let Some(line) = lines.next_line().await? {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lines() allocates every line. Instead you could use read_line() from AsyncBufReadExt and reuse one allocation

@xobust xobust requested a review from ikatson February 25, 2025 03:25
@xobust xobust marked this pull request as ready for review February 25, 2025 03:25
Copy link
Owner

@ikatson ikatson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the PR! Left a couple minor comments

let path = parsed_url
.to_file_path()
.map_err(|_| anyhow::anyhow!("Failed to convert file URL to path"))?;
return Self::load_from_file(path.to_str().unwrap()).await;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than using .unwrap() here it would be better to pass &Path (or AsRef) into load_from_file

In any case let's strive not use .unwrap() in librqbit unless it's 100% clear it will never fail

let response = reqwest::get(parsed_url)
.await
.context("Failed to send request for blocklist")?;
if response.status() != 200 {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think reqwests has some shortcut like .status().is_success()

@ikatson ikatson merged commit 03c6398 into ikatson:main Feb 25, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants