Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: API breaking performance improvement #344

Closed
wcampbell0x2a opened this issue Jun 13, 2023 · 0 comments
Closed

RFC: API breaking performance improvement #344

wcampbell0x2a opened this issue Jun 13, 2023 · 0 comments
Labels
enhancement New feature or request optimization Optimization

Comments

@wcampbell0x2a
Copy link
Collaborator

Some thoughts on the improvement of performance within deku. read has been changed to just return the amount of bytes read. So far I only implemented the new read for u8 for some initial testing. I haven't messed around with benchmarking or implementing for other types, just wanted to see if anyone had objections before I put some time into that (when I get more time... someday ;) )

I'm assuming this could be done for write primitives also, but haven't tried that.

/// "Reader" trait: read bits and construct type
pub trait NewDekuRead<'a, Ctx = ()> {
    /// Read bits and construct type
    /// * **input** - Input as bits
    /// * **ctx** - A context required by context-sensitive reading. A unit type `()` means no context
    /// needed.
    ///
    /// Returns the remaining bits after parsing in addition to Self.
    fn read_new(
        input: &'a BitSlice<u8, Msb0>,
        ctx: Ctx,
    ) -> Result<(usize, Self), DekuError>
    where
        Self: Sized;
}
/// "Reader" trait: implemented on DekuRead struct and enum containers. A `container` is a type which
/// doesn't need any context information.
pub trait NewDekuContainerRead<'a>: NewDekuRead<'a, ()> {
    /// Read bytes and construct type
    /// * **input** - Input given as data and bit offset
    ///
    /// Returns the remaining bytes and bit offset after parsing in addition to Self.
    fn from_bytes(input: (&'a [u8], usize)) -> Result<((usize, usize), Self), DekuError>
    where
        Self: Sized;
}

// specialize u8 for ByteSize
impl NewDekuRead<'_, ()> for u8 {
    fn read_new(
        input: &BitSlice<u8, Msb0>,
        _: (),
    ) -> Result<(usize, Self), DekuError> {
        const MAX_TYPE_BITS: usize = BitSize::of::<u8>().0;
        if input.len() < MAX_TYPE_BITS {
            return Err(DekuError::Incomplete(NeedSize::new(
                MAX_TYPE_BITS,
            )));
        }

        // PANIC: We already check that input.len() < bit_size above, so no panic will happen
        let value = input[..MAX_TYPE_BITS].load::<u8>();
        Ok((MAX_TYPE_BITS, value))
    }
}

The generated code:

#[derive(Debug, DekuRead)]
pub struct Test {
    a: u8,
    b: u8,
}

impl NewDekuRead<'_, ()> for Test {
    fn read_new(
        __deku_input_bits: &'_ ::deku::bitvec::BitSlice<u8, ::deku::bitvec::Msb0>,
        _: (),
    ) -> Result<(usize, Self), ::deku::DekuError>
    {
        use core::convert::TryFrom;
        let mut __deku_read = 0;
        let mut __deku_rest = __deku_input_bits;
        let __deku_a = {
            //assert!(__deku_read < __deku_input_bits.len());
            let (__deku_read, __deku_value) =
                <u8 as NewDekuRead<'_, _>>::read_new(&__deku_rest, ())?;
            let __deku_value: u8 = Result::<_, ::deku::DekuError>::Ok(__deku_value)?;
            __deku_rest = &__deku_rest[__deku_read..];
            __deku_value
        };
        let __deku_b = {
            //assert!(__deku_read < __deku_input_bits.len());
            let (__deku_read_new, __deku_value) =
                <u8 as NewDekuRead<'_, _>>::read_new(&__deku_rest, ())?;
            let __deku_value: u8 = Result::<_, ::deku::DekuError>::Ok(__deku_value)?;
            __deku_rest = &__deku_rest[__deku_read..];
            __deku_value
        };
        let __deku_value = Self {
            a: __deku_a,
            b: __deku_b,
        };
        Ok((__deku_read, __deku_value))
    }
}

The benchmarks look pretty good, atleast for u8:

deku read new           time:   [10.862 ns 10.994 ns 11.150 ns]
Found 13 outliers among 100 measurements (13.00%)
  5 (5.00%) high mild
  8 (8.00%) high severe

deku                    time:   [23.186 ns 23.337 ns 23.510 ns]
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe

manual                  time:   [4.5398 ns 4.5656 ns 4.6002 ns]
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) low mild
  4 (4.00%) high mild
  1 (1.00%) high severe

I pushed the benchmark here: https://github.com/wcampbell0x2a/deku-perf

@wcampbell0x2a wcampbell0x2a added enhancement New feature or request optimization Optimization labels Jun 13, 2023
@wcampbell0x2a wcampbell0x2a mentioned this issue Aug 3, 2023
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request optimization Optimization
Projects
None yet
Development

No branches or pull requests

1 participant