Add benchmarks for u32/u64 functions. #582

Merged: newpavlov merged 1 commit into master from bench_u32_u64 on Jan 25, 2025

josephlr
Member

This allows us to see the effect of any specialized implementations for `getrandom::u32` or `getrandom::u64`. As expected, on Linux (which just uses the default implementation in `utils.rs`) there is no change:

test bench_u32                 ... bench:         196.50 ns/iter (+/- 4.85) = 20 MB/s
test bench_u32_via_fill        ... bench:         198.25 ns/iter (+/- 1.78) = 20 MB/s
test bench_u64                 ... bench:         196.95 ns/iter (+/- 2.99) = 40 MB/s
test bench_u64_via_fill        ... bench:         197.62 ns/iter (+/- 2.24) = 40 MB/s

but when using the `rdrand` backend (which is specialized), there is a measurable difference:

test bench_u32                 ... bench:          16.84 ns/iter (+/- 0.09) = 250 MB/s
test bench_u32_via_fill        ... bench:          18.40 ns/iter (+/- 0.28) = 222 MB/s
test bench_u64                 ... bench:          16.62 ns/iter (+/- 0.06) = 500 MB/s
test bench_u64_via_fill        ... bench:          17.70 ns/iter (+/- 0.08) = 470 MB/s
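
For readers wondering how the MB/s column relates to the ns/iter column: the figures above are consistent with dividing bytes per iteration (4 for `u32`, 8 for `u64`) times 1000 by the per-iteration time truncated to whole nanoseconds. The sketch below reproduces that arithmetic. The helper name `mb_per_s` and the truncation rule are assumptions inferred from the numbers, not something stated in this PR.

```rust
/// Throughput in MB/s from bytes per iteration and nanoseconds per iteration.
/// Assumption: the time is truncated to whole nanoseconds (minimum 1) and
/// the division is integer division, which matches the tables above.
fn mb_per_s(bytes_per_iter: u64, ns_per_iter: f64) -> u64 {
    let whole_ns = (ns_per_iter as u64).max(1);
    bytes_per_iter * 1000 / whole_ns
}

fn main() {
    // (label, bytes per iteration, reported ns/iter) taken from the tables above.
    let rows = [
        ("default bench_u32", 4, 196.50),
        ("default bench_u32_via_fill", 4, 198.25),
        ("default bench_u64", 8, 196.95),
        ("default bench_u64_via_fill", 8, 197.62),
        ("rdrand  bench_u32", 4, 16.84),
        ("rdrand  bench_u32_via_fill", 4, 18.40),
        ("rdrand  bench_u64", 8, 16.62),
        ("rdrand  bench_u64_via_fill", 8, 17.70),
    ];
    for (label, bytes, ns) in rows {
        println!("{:<28} {:>4} MB/s", label, mb_per_s(bytes, ns));
    }
}
```

Because of the truncation, small timing differences at these scales (16.84 ns vs 18.40 ns) show up as large jumps in MB/s, so the ns/iter column is the more reliable one to compare.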

Signed-off-by: Joe Richey <[email protected]>
@josephlr josephlr requested a review from newpavlov January 10, 2025 04:24
@josephlr josephlr added this to the Post 0.3 Release milestone Jan 10, 2025
Member

@newpavlov newpavlov left a comment

I don't think we need the bench_*_via_fill benchmarks.

@josephlr
Member Author

> I don't think we need the bench_*_via_fill benchmarks.

The main reason I added those was to test the effectiveness of the getrandom::u32() and getrandom::u64() specializations we have for some backends. I wanted to know if having different implementations for u32() (say for the rdrand backend) actually made things faster.

Does it make sense to keep them for this reason? I can add some comments explaining why if you think it's necessary.
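
To make the comparison concrete, a `_via_fill` variant amounts to filling a 4- or 8-byte buffer and reinterpreting the bytes as an integer, whereas a specialized backend can produce the integer directly. The sketch below shows only the "via fill" shape. It is not the benchmark code from this PR: `u32_via_fill`, `u64_via_fill`, and the deterministic `counting_fill` stand-in are hypothetical names, and a real benchmark would pass the crate's byte-filling function instead.

```rust
use std::convert::Infallible;

/// Build a u32 by filling a 4-byte buffer and reinterpreting it.
/// `fill` is any byte-filling function; its error is passed through unchanged.
fn u32_via_fill<E>(fill: impl FnOnce(&mut [u8]) -> Result<(), E>) -> Result<u32, E> {
    let mut buf = [0u8; 4];
    fill(&mut buf[..])?;
    Ok(u32::from_ne_bytes(buf))
}

/// Same idea for u64, using an 8-byte buffer.
fn u64_via_fill<E>(fill: impl FnOnce(&mut [u8]) -> Result<(), E>) -> Result<u64, E> {
    let mut buf = [0u8; 8];
    fill(&mut buf[..])?;
    Ok(u64::from_ne_bytes(buf))
}

/// Hypothetical stand-in for a random source: writes 1, 2, 3, ... so the
/// result is predictable. It is NOT random and exists only for illustration.
fn counting_fill(dest: &mut [u8]) -> Result<(), Infallible> {
    for (i, byte) in dest.iter_mut().enumerate() {
        *byte = (i as u8).wrapping_add(1);
    }
    Ok(())
}

fn main() {
    let a = u32_via_fill(counting_fill).unwrap();
    let b = u64_via_fill(counting_fill).unwrap();
    println!("u32 via fill: {:#010x}", a);
    println!("u64 via fill: {:#018x}", b);
}
```

The extra work relative to a direct integer source is the buffer write and the byte-to-integer conversion, which is the overhead the `_via_fill` benchmarks are meant to expose.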

@newpavlov
Member

Performance was not a motivation for adding the u32/u64 functions in the first place. The motivation was user ergonomics and conceptual "zero-costness" on some platforms. In the worst-case scenario we will get some stack spilling and unnecessary copies, which will cost no more than several cycles.

@newpavlov newpavlov merged commit aa96363 into master Jan 25, 2025
59 checks passed
@newpavlov newpavlov deleted the bench_u32_u64 branch January 25, 2025 11:28