-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split version ranges into their own crate #262
Conversation
96384b2
to
8fd898b
Compare
Splitting ranges sounds like a good idea to me. Also agree that the |
This is a really good idea. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all looks good, after we bike shed naming.
version-ranges/Cargo.toml
Outdated
[dependencies] | ||
proptest = { version = "1.5.0", optional = true } | ||
serde = { version = "1.0.210", features = ["derive"], optional = true } | ||
smallvec = { version = "1.13.2" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes us to using the smallvec
crate, instead of the one we maintain. This might be reasonable decision, it hasn't had a CVE in a while, but we should call it out and make sure were in agreement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, actually minimizing unsafe
usage is something I valued! It would be unfortunate to regress on that aspect. I think I’d even prefer that we published our smallvec impl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer removing our custom implementation and depending on smallvec, one of the most used packages on crates.io. I expect most pubgrub users to already depend on smallvec, uv and cargo do through multiple packages.
While I see the point about unsafe and publishing a safe competitor would be nice, smallvec gives us flexibility and definitive performance in a well vetted library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s not like it’s a huge thing. It’s purpose built 200 lines. And as I recall last time, I think it was even better performing that the smallvec crate for our very specific use case. It’s not about publishing a safe competitor. I only suggested publishing it to be able to reuse it in both the range crate and in the pubgrub crate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my benchmark, smallvec is about 50ms faster than our custom solution. In this intentionally pubgrub-heavy benchmark, I also tried smallvec with size 4 for comparison (uv-branch-4
).
$ taskset -c 0-1 hyperfine --warmup 1 --runs 40 "./uv-branch pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline" "./uv-main pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline" "./uv-branch-4 pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline"
Benchmark 1: ./uv-branch pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline
Time (mean ± σ): 3.842 s ± 0.006 s [User: 4.016 s, System: 0.369 s]
Range (min … max): 3.833 s … 3.854 s 40 runs
Benchmark 2: ./uv-main pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline
Time (mean ± σ): 3.900 s ± 0.007 s [User: 4.079 s, System: 0.365 s]
Range (min … max): 3.889 s … 3.916 s 40 runs
Benchmark 3: ./uv-branch-4 pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline
Time (mean ± σ): 3.868 s ± 0.010 s [User: 4.046 s, System: 0.372 s]
Range (min … max): 3.852 s … 3.900 s 40 runs
Summary
./uv-branch pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline ran
1.01 ± 0.00 times faster than ./uv-branch-4 pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline
1.02 ± 0.00 times faster than ./uv-main pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline
Reducing the smallvec to a single free segment with SmallVec<[Interval<V>; 1]>
improves performance even further.
$ taskset -c 0-1 hyperfine --warmup 1 --runs 3 "./uv-branch-1 pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline" "./uv-main pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline"
Benchmark 1: ./uv-branch-1 pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline
Time (mean ± σ): 3.774 s ± 0.007 s [User: 3.970 s, System: 0.337 s]
Range (min … max): 3.769 s … 3.782 s 3 runs
Benchmark 2: ./uv-main pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline
Time (mean ± σ): 3.823 s ± 0.004 s [User: 4.008 s, System: 0.345 s]
Range (min … max): 3.821 s … 3.828 s 3 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
./uv-branch-1 pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline ran
1.01 ± 0.00 times faster than ./uv-main pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline
Smaller benchmarks are too noisy unfortunately.
$ taskset -c 0-1 hyperfine --warmup 1 --runs 200 "./uv-branch-1 pip compile scripts/requirements/jupyter.in --offline" "./uv-main pip compile scripts/requirements/jupyter.in --offline"
Benchmark 1: ./uv-branch-1 pip compile scripts/requirements/jupyter.in --offline
Time (mean ± σ): 21.3 ms ± 0.5 ms [User: 14.9 ms, System: 15.9 ms]
Range (min … max): 20.2 ms … 23.8 ms 200 runs
Benchmark 2: ./uv-main pip compile scripts/requirements/jupyter.in --offline
Time (mean ± σ): 21.2 ms ± 0.4 ms [User: 14.7 ms, System: 16.0 ms]
Range (min … max): 20.1 ms … 22.5 ms 200 runs
Summary
./uv-main pip compile scripts/requirements/jupyter.in --offline ran
1.00 ± 0.03 times faster than ./uv-branch-1 pip compile scripts/requirements/jupyter.in --offline
$ taskset -c 0-1 hyperfine --warmup 1 --runs 30 "./uv-branch-1 pip compile scripts/requirements/airflow.in --offline" "./uv-main pip compile scripts/requirements/airflow.in --offline"
Benchmark 1: ./uv-branch-1 pip compile scripts/requirements/airflow.in --offline
Time (mean ± σ): 263.1 ms ± 4.9 ms [User: 269.6 ms, System: 130.9 ms]
Range (min … max): 256.9 ms … 273.3 ms 30 runs
Benchmark 2: ./uv-main pip compile scripts/requirements/airflow.in --offline
Time (mean ± σ): 266.2 ms ± 6.2 ms [User: 272.5 ms, System: 132.3 ms]
Range (min … max): 256.7 ms … 285.6 ms 30 runs
Summary
./uv-branch-1 pip compile scripts/requirements/airflow.in --offline ran
1.01 ± 0.03 times faster than ./uv-main pip compile scripts/requirements/airflow.in --offline
Treating smallvec as a worse vec with SmallVec<[Interval<V>; 0]>
regresses:
$ taskset -c 0-1 hyperfine --warmup 1 --runs 3 "./uv-branch-0 pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline" "./uv-main pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline"
Benchmark 1: ./uv-branch-0 pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline
Time (mean ± σ): 3.870 s ± 0.001 s [User: 4.085 s, System: 0.393 s]
Range (min … max): 3.868 s … 3.871 s 3 runs
Benchmark 2: ./uv-main pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline
Time (mean ± σ): 3.826 s ± 0.010 s [User: 4.006 s, System: 0.349 s]
Range (min … max): 3.819 s … 3.837 s 3 runs
Summary
./uv-main pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline ran
1.01 ± 0.00 times faster than ./uv-branch-0 pip compile scripts/requirements/bio_embeddings.in --exclude-newer 2024-03-01 --offline
The problem seems to be that the ranges type is too large:
Base types:
- uv's
Version
: 8 bytes (Arc
) Bound<Version>
: 16 bytesInterval<Version>
: 32 bytes
Ranges:
Range<Version>
main: 64 bytesRange<Version>
smallvec 2 entries: 72 bytesRange<Version>
smallvec 1 entry: 40 bytesRange<Version>
smallvec 0 entries: 24 bytes (the size of a regular Vec)
For comparison smallvec::SmallVec<[u32; 2]>>
is 24 bytes (the size of a regular Vec).
Version
is an Arc<VersionInner>
, so it has only one niche (see e.g. https://stackoverflow.com/questions/76429517/how-does-niche-optimization-for-an-enum-work-in-rust and https://www.0xatticus.com/posts/understanding_rust_niche/), while bound needs a niche for Unbounded
, but also a bit for included vs. excluded. Speculating a bit, if we use 1 bit for unbounded, 1 bit for included vs. excluded, 64 bits per Arc, then an Interval<Version>
would need 132 bits in 17 bytes or two intervals would need 264 bits in 33 bytes, each plus padding (afaik usually to the next 8 bytes).
Practically, we can win 8 bytes by manually inlining (Bound<T>, Bound<T>)
:
pub enum BoundBound<T> {
IncludedIncluded(T, T),
IncludedExcluded(T, T),
IncludedUnbounded(T),
ExcludedIncluded(T, T),
ExcludedExcluded(T, T),
ExcludedUnbounded(T),
UnboundedIncluded(T),
UnboundedExcluded(T),
UnboundedUnbounded,
}
- main
SmallVec<Interval<Version>>
: 64 bytes - main
SmallVec<BoundBound<Version>>
: 48 bytes - smallvec
SmallVec<[Interval<V>; 2]>
: 72 bytes - smallvec
SmallVec<[BoundBound<V>; 2]>
: 56 bytes
I propose switching to smallvec::SmallVec<[Interval<V>; 1]>
for a performance improvement. There are other potential performance wins by optimizing Interval<T>
or changing the smallvec size (or switching to a vec) in other places, namely unit_propagation_buffer
, merged_dependencies
and dated_derivations
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for this very thoughtful and complete analysis! My personal position is that I prefer not having unsafe
for less than 4% or 5% improvement (1.3% here according to your analysis). But if that is the preference of the majority, so be it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me, there are different tiers to unsafe:
- FFI unsafe: We have some unsafe in uv to interact with system APIs written in C, or in pyo3 to speak to the Python C API.
- Unsafe in high level projects
- Unsafe in std and foundational libraries
There is std and a number of projects such as serde, bytemuck and smallvec that provide foundational abstractions that can only be implemented in a generic, performant and ergonomic way using unsafe, such as Vec
or SmallVec
. Most of these would be part of std in a language such as Python or implemented as a C extension (such as numpy), but Rust allows them to live outside std.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The few percent in performance difference does not justify using the unsafe one. But, The only way to have Ranges
as its own crate is for it to depend on some smallvec crate. Given that I'd rather depend on someone else's then publish and maintain our own.
> This feature requires Rust 1.49. > When the union feature is enabled smallvec will track its state (inline or spilled) without the use of an enum tag, reducing the size of the smallvec by one machine word. This means that there is potentially no space overhead compared to Vec. Note that smallvec can still be larger than Vec if the inline buffer is larger than two machine words.
@@ -0,0 +1,7 @@ | |||
# This file is automatically @generated by Cargo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this files actually needed. I think was accidentally created before version-ranges was added to the workspace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch
/// Profiling in <https://github.com/pubgrub-rs/pubgrub/pull/262#discussion_r1804276278> showed | ||
/// that a single stack entry is the most efficient. This is most likely due to `Interval<V>` | ||
/// being large. | ||
segments: SmallVec<[Interval<V>; 1]>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this 1
ends up in practice to depend on the size of V
, then we could make this a const generic on Ranges
. Definitely premature optimization at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
smallvec 2 uses const generics 👀
Looks like we have a plan! Thanks for the work here! |
I've checked that this change doesn't break uv |
See pubgrub-rs/pubgrub#262 for prior discussion. In most parts of uv, we don't want to depend on pubgrub, but we want to use its powerful version ranges arithmetic. pubgrub-rs/pubgrub#262 split out the ranges into a separate, small crate (that can be published separately to crates.io). With this first in a series of changes, we only depend on pubgrub in the uv-resolver crate. Note that the name is now `Ranges`, since the type formerly known as `Range` was a misnormer, it contains multiple ranges.
See pubgrub-rs/pubgrub#262 for prior discussion. In most parts of uv, we don't want to depend on pubgrub, but we want to use its powerful version ranges arithmetic. pubgrub-rs/pubgrub#262 split out the ranges into a separate, small crate (that can be published separately to crates.io). With this first in a series of changes, we only depend on pubgrub in the uv-resolver crate. Note that the name is now `Ranges`, since the type formerly known as `Range` was a misnormer, it contains multiple ranges.
Enabled by #8667 and pubgrub-rs/pubgrub#262, we can remove the uv-pubgrub crate and move the conversion between pep440 specifiers and version ranges directly into pep440. In a next step, we can remove the `VersionRangesSpecifier` intermediary and perform the conversion directly.
I've released version-ranges to crates.io and gave to the team publish permissions. The crates is now used in the PEP 440 and PEP 508 crates (https://crates.io/crates/version-ranges/reverse_dependencies) and did a neat trim in uv. I'll write an announcement post later. |
In our implementation of PEP 440 and PEP 508 (Python standards for version selectors and requirements and their markers, which again use versions), we want to use PubGrub's advanced
Range
type for its fast operations, but we don't want to depend on the whole pubgrub crate in crates that don't do resolution. Hence, I'm splitting out the version ranges into their own crate.My plan is to merge this PR, publish the crate as 0.1.0, adopt it in our use case (pep440-rs, pep508-rs, uv, downstream users of those crates), and if no blockers show up, publish a version-range[s] 1.0.0 soon.
Range
is renamed toRanges
for clarity. To avoid breaking users instantly, we reportRanges
asRange
, too.This PR also switches to
SmallVec<[Interval<V>; 1]>
for better performance over the custom smallvec or other size.The proptest feature grew out of necessity for the pubgrub tests, but i think it's a nice feature actually.
Another wishlist feature is to have a different
Display
for uv, but that didn't fit in here yet.