Benchmarks and performance #25
Added some simple benchmarks to catch regressions and measure optimisations on reading.
I was playing with the magic attribute and benchmarks and I realized it is very slow, so I thought I would share:

```rust
#![feature(test)]

use deku::prelude::*;

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(endian = "big", magic = b"\x00\x00\x00\x05")]
struct WithMagicV5 {
    value: u32,
}

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(endian = "big")]
struct WithoutMagicV5 {
    // Users ideally should not be able to set this, as it is always 5.
    #[deku(assert = "*version == 5")]
    version: u32,
    value: u32,
}

#[cfg_attr(test, macro_use)]
extern crate hex_literal;

#[cfg(test)]
mod magic_benches {
    use super::*;
    extern crate test;
    use test::Bencher;

    #[bench]
    fn bench_with_magic(b: &mut Bencher) {
        let data = hex!("00 00 00 05 00 00 00 01");
        b.iter(|| {
            let (_, _) = WithMagicV5::from_bytes((&data, 0)).unwrap();
        });
    }

    #[bench]
    fn bench_without_magic(b: &mut Bencher) {
        let data = hex!("00 00 00 05 00 00 00 01");
        b.iter(|| {
            let (_, _) = WithoutMagicV5::from_bytes((&data, 0)).unwrap();
        });
    }
}
```

And the results:
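For context, the work a magic attribute conceptually needs to do is small: compare a fixed byte prefix, then continue parsing. The sketch below is a hypothetical, std-only illustration of a byte-aligned magic check followed by a big-endian `u32` read; it is not deku's implementation, and `read_with_magic` is a made-up name.

```rust
// Hypothetical, std-only illustration of a byte-aligned "magic" check:
// compare a fixed prefix, then read a big-endian u32. This is not deku's
// implementation, only the work the attribute conceptually has to do.
fn read_with_magic(input: &[u8], magic: &[u8]) -> Option<u32> {
    // Fail fast if the magic prefix differs.
    let rest = input.strip_prefix(magic)?;
    // Byte-aligned fast path: a single conversion, no per-bit work.
    let bytes: [u8; 4] = rest.get(..4)?.try_into().ok()?;
    Some(u32::from_be_bytes(bytes))
}

fn main() {
    let data = [0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x01];
    assert_eq!(read_with_magic(&data, &[0, 0, 0, 5]), Some(1));
    assert_eq!(read_with_magic(&data, &[0xFF]), None);
    println!("ok");
}
```

Given how cheap this is at the byte level, the slowdown measured above presumably comes from the surrounding bit-level machinery rather than the comparison itself.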
Nice one! There are some optimizations which could be done, and this is a great example to add flexibility around.
Oh yeah, actually that was the first problem I faced. I'd be happy to give this a shot if you could give me a rough idea of where to look? Seems relatively easy.
@OliverGavin Yes! That would be awesome. Happy to show you where to start. I'd take a look at how the magic attribute is handled. It is declared here: https://github.com/sharksforarms/deku/blob/master/deku-derive/src/lib.rs#L121 and, as you can see, it only accepts a byte-string literal. I'll create an issue for this! Edit: #204
Remove Size in favor of a BitSize and ByteSize. This allows the Deku{Read/Write}(Endian, BitSize) and Deku{Read/Write}(Endian, ByteSize) impls to be created, and allows the ByteSize impls to be faster. Most of the assumption I made (and the perf that is gained) comes from the removal of this branch for bytes:

```rust
let value = if pad == 0
    && bit_slice.len() == max_type_bits
    && bit_slice.as_raw_slice().len() * 8 == max_type_bits
{
```

See sharksforarms#44. I added some benchmarks in order to get a good idea of what perf this allows. The benchmarks look pretty good, but I'm on my old thinkpad. These results are vs the benchmarks running from master:

```
deku_read_vec           time:   [744.39 ns 751.13 ns 759.58 ns]
                        change: [-23.681% -21.358% -19.142%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe
```

See sharksforarms#25
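The gist of the ByteSize fast path can be illustrated without deku at all. The sketch below is hypothetical (neither function is the crate's code): when the input is byte-aligned, the value can be lifted in one conversion with `from_be_bytes`, whereas the general bit-level path has to accumulate bit by bit, which is the branch the commit above removes for the byte case.

```rust
// Hypothetical illustration of why a byte-aligned (ByteSize) path is cheaper
// than the general bit-level path. Neither function is deku's code.

// General path: walk individual bits and accumulate. Shown here for a whole
// number of bytes for simplicity; the real bit path also handles offsets.
fn read_u32_bitwise(bytes: &[u8]) -> u32 {
    let mut value = 0u32;
    for byte in &bytes[..4] {
        for shift in (0..8).rev() {
            value = (value << 1) | ((byte >> shift) & 1) as u32;
        }
    }
    value
}

// Byte-aligned fast path: a single conversion, no per-bit loop.
fn read_u32_bytewise(bytes: &[u8]) -> u32 {
    u32::from_be_bytes(bytes[..4].try_into().unwrap())
}

fn main() {
    let data = [0x00, 0x00, 0x00, 0x05];
    assert_eq!(read_u32_bitwise(&data), read_u32_bytewise(&data));
    assert_eq!(read_u32_bytewise(&data), 5);
    println!("ok");
}
```

Both functions return the same value; the difference is purely in how much work is done per field, which is consistent with the ~20% improvement reported in the benchmark output.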
Using criterion. +bonus if compared to other crates like nom for reading.
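criterion is the right tool for this, since it handles warm-up, outlier detection, and statistical comparison against a baseline. As a dependency-free sketch of the idea it automates, the snippet below times many iterations of a parse with `std::time::Instant` and reports a mean; the `parse_u32` workload is a stand-in, not deku's reader.

```rust
use std::time::Instant;

// Stand-in workload; a real benchmark would call the deku reader instead.
fn parse_u32(bytes: &[u8]) -> u32 {
    u32::from_be_bytes(bytes[..4].try_into().unwrap())
}

// Dependency-free sketch of the measurement criterion automates: run the
// closure many times and report the mean time per iteration in nanoseconds.
fn bench<F: FnMut() -> u32>(iters: u32, mut f: F) -> f64 {
    let start = Instant::now();
    let mut sink = 0u32; // keep results live so the work isn't optimized out
    for _ in 0..iters {
        sink = sink.wrapping_add(f());
    }
    let elapsed = start.elapsed().as_nanos() as f64;
    std::hint::black_box(sink);
    elapsed / iters as f64
}

fn main() {
    let data = [0x00, 0x00, 0x00, 0x05];
    let ns = bench(1_000_000, || parse_u32(&data));
    println!("mean: {ns:.2} ns/iter");
}
```

Unlike criterion, this gives no confidence intervals or change detection, which is exactly why the issue asks for criterion rather than hand-rolled timing.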