
Benchmarks and performance #25

Open
sharksforarms opened this issue May 28, 2020 · 5 comments
Labels
optimization Optimization

Comments

@sharksforarms
Owner

Using criterion

+bonus if compared to other crates like nom for reading
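Criterion would be the tool for the real benchmarks; as a rough std-only illustration of the kind of read path being measured, here is a hand-rolled big-endian u32 parse timed with `std::time::Instant` (the function name and the parse itself are illustrative, not deku's API — the actual benchmarks would call `from_bytes` via criterion):

```rust
use std::time::Instant;

// Stand-in for the "read" being benchmarked: parse a big-endian u32
// from the front of a byte slice and return the rest.
fn read_u32_be(input: &[u8]) -> Option<(u32, &[u8])> {
    let prefix = input.get(..4)?;
    let value = u32::from_be_bytes(prefix.try_into().ok()?);
    Some((value, &input[4..]))
}

fn main() {
    let data = [0x00, 0x00, 0x00, 0x05];
    // Crude wall-clock loop; criterion adds warm-up, statistical
    // analysis, and outlier detection on top of this basic idea.
    let start = Instant::now();
    let mut acc = 0u64;
    for _ in 0..1_000_000 {
        let (v, _) = read_u32_be(&data).unwrap();
        acc = acc.wrapping_add(v as u64);
    }
    println!("1M reads in {:?} (acc = {})", start.elapsed(), acc);
}
```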

@sharksforarms
Owner Author

sharksforarms commented Jun 1, 2020

Added some simple benchmarks to catch regressions and measure optimisations on reading

@sharksforarms sharksforarms changed the title Add benchmarks Benchmarks and performance Jun 15, 2020
@OliverGavin

I was playing with the magic attribute and benchmarks, and I realized it is very slow, so I thought I would share:

#![feature(test)]

use deku::prelude::*;

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(endian = "big", magic = b"\x00\x00\x00\x05")]
struct WithMagicV5 {
    value: u32
}

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(endian = "big")]
struct WithoutMagicV5 {
    #[deku(assert = "*version == 5")]
    version: u32,  // users ideally should not be able to set this as it is always 5
    value: u32
}

#[cfg_attr(test, macro_use)]
extern crate hex_literal;

#[cfg(test)]
mod magic_benches {
    use super::*;

    extern crate test;
    use test::Bencher;
    
    #[bench]
    fn bench_with_magic(b: &mut Bencher) {
        let data = hex!("00 00 00 05 00 00 00 01");
        b.iter(|| {
            let (_, _) = WithMagicV5::from_bytes((&data, 0)).unwrap();
        });
    }
    
    #[bench]
    fn bench_without_magic(b: &mut Bencher) {
        let data = hex!("00 00 00 05 00 00 00 01");
        b.iter(|| {
            let (_, _) = WithoutMagicV5::from_bytes((&data, 0)).unwrap();
        });
    }
}

And the results:

running 2 tests
test magic_benches::bench_with_magic        ... bench:          48 ns/iter (+/- 4)
test magic_benches::bench_without_magic     ... bench:          21 ns/iter (+/- 6)

@sharksforarms
Owner Author

sharksforarms commented Apr 17, 2021

> I was playing with the magic attribute and benchmarks and I realized it is very slow so I thought I would share:

Nice one!

There are some optimizations which could be done. This is a great example.

For example, magic attribute handling could convert the byte string to 0x00000005u32 (at compile time?) and do a single u32 read and equality check. Currently, it would do 4 u8 reads and 4 equality checks (iterating the byte string).

To add flexibility, magic could also accept a number or a bytestring!
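The single-word comparison suggested above can be sketched in plain std-only Rust (function names and the check itself are illustrative, not deku's actual generated code):

```rust
// The 4-byte magic prefix from the example above.
const MAGIC: [u8; 4] = [0x00, 0x00, 0x00, 0x05];

// Byte-by-byte: four u8 reads and four equality checks, as the
// current magic handling effectively does.
fn magic_matches_bytewise(input: &[u8]) -> bool {
    input.len() >= 4 && input[..4].iter().zip(MAGIC.iter()).all(|(a, b)| a == b)
}

// Single-word: one u32 read and one equality check. The constant
// 0x0000_0005 can be derived from the byte string at compile time.
fn magic_matches_u32(input: &[u8]) -> bool {
    match input.get(..4) {
        Some(prefix) => u32::from_be_bytes(prefix.try_into().unwrap()) == 0x0000_0005,
        None => false,
    }
}

fn main() {
    let data = [0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x01];
    assert!(magic_matches_bytewise(&data));
    assert!(magic_matches_u32(&data));
    assert!(!magic_matches_u32(&[0xff; 8]));
    println!("both checks agree");
}
```

Both checks accept the same inputs; the second just trades four comparisons for one word-sized one.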

@OliverGavin

Oh yeah, actually the first problem I faced with magic was that I couldn't pass a number (and it wasn't clear from the docs, which only showed a byte-string example).

I'd be happy to give this a shot if you could give me a rough idea where to look? Seems relatively easy.

@sharksforarms
Owner Author

sharksforarms commented Apr 20, 2021

@OliverGavin Yes! That would be awesome. Happy to show you where to start.

I'd take a look at how the id attribute works, this attribute accepts both b"bytes" and tokens: https://github.com/sharksforarms/deku/blob/master/deku-derive/src/lib.rs#L17

The magic attribute is declared here: https://github.com/sharksforarms/deku/blob/master/deku-derive/src/lib.rs#L121 and, as you can see, it only accepts a LitByteStr. This would involve changing the type to an enum that accepts either b"" or other desired variants (such as LitInt).

I'll create an issue for this!

Edit: #204
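The shape of that enum change can be sketched in std-only Rust (the real derive would use syn's LitByteStr and LitInt; the type and method names here are hypothetical and just illustrate the idea of normalizing both spellings to the same byte sequence):

```rust
// Hypothetical magic value accepting either a byte string or an
// integer literal, standing in for a syn-based enum in the derive.
enum MagicValue {
    Bytes(Vec<u8>),
    Int(u32),
}

impl MagicValue {
    // Normalize both variants to the big-endian bytes that get
    // matched against the input during parsing.
    fn to_bytes(&self) -> Vec<u8> {
        match self {
            MagicValue::Bytes(b) => b.clone(),
            MagicValue::Int(n) => n.to_be_bytes().to_vec(),
        }
    }
}

fn main() {
    let from_bytes = MagicValue::Bytes(b"\x00\x00\x00\x05".to_vec());
    let from_int = MagicValue::Int(5);
    assert_eq!(from_bytes.to_bytes(), from_int.to_bytes());
    println!("both spellings normalize to the same magic");
}
```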

wcampbell0x2a added a commit to wcampbell0x2a/deku that referenced this issue Sep 28, 2022
Remove Size in favor of a BitSize and ByteSize. This allows the
Deku{Read/Write}(Endian, BitSize) and Deku{Read/Write}(Endian, ByteSize)
impls to be created and allows the ByteSize impls to be faster. Most of
the assumptions I made (and the perf that is gained) come from the removal
of this branch for bytes:

let value = if pad == 0
                      && bit_slice.len() == max_type_bits
                      && bit_slice.as_raw_slice().len() * 8 == max_type_bits
                  {

See sharksforarms#44

I added some benchmarks in order to get a good idea of what perf this
allows. The benchmarks look pretty good, but I'm on my old thinkpad.
These results are vs the benchmarks run from master.

deku_read_vec           time:   [744.39 ns 751.13 ns 759.58 ns]
                        change: [-23.681% -21.358% -19.142%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe

See sharksforarms#25
sharksforarms pushed a commit to wcampbell0x2a/deku that referenced this issue Oct 6, 2022
sharksforarms pushed a commit that referenced this issue Oct 6, 2022