
Heap-backed fixed-width integers #58

Closed
tarcieri opened this issue Jan 16, 2022 · 11 comments · Fixed by #221

@tarcieri
Member

tarcieri commented Jan 16, 2022

Somewhat in the spirit of #29, I have been wondering if it would make sense to support a heap-backed integer type (cc @elichi)

The idea would be that integers are still fixed-width and would otherwise work the same as the const generic counterparts, but the width could be chosen at runtime/initialization time, rather than at compile time. So they're not really "arbitrary precision" in that they don't grow/shrink, they're still fixed-width, just fixed at a width determined at runtime.
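A minimal sketch of the idea (hypothetical `BoxedUint` type and `Limb` alias for illustration only, not the crate's actual API): the width is chosen once at construction and never changes afterward.

```rust
type Limb = u64; // hypothetical stand-in for the crate's Limb type

/// Fixed-width integer whose width is chosen at runtime, at construction
/// time, and is then fixed for the lifetime of the value.
struct BoxedUint {
    /// Least-significant limb first; the length never changes after creation.
    limbs: Box<[Limb]>,
}

impl BoxedUint {
    /// Create a zero value `limbs` limbs wide.
    fn zero_with_limbs(limbs: usize) -> Self {
        Self {
            limbs: vec![0; limbs].into_boxed_slice(),
        }
    }

    /// Total width in bits, fixed at construction.
    fn bits(&self) -> usize {
        self.limbs.len() * Limb::BITS as usize
    }
}
```

For example, `BoxedUint::zero_with_limbs(32)` would be a 2048-bit zero on a 64-bit target: fixed-width, just not fixed at compile time.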

One place where this would be applicable is the rsa crate. While it'd be nice for microcontroller users to be able to instantiate RSA with e.g. 2048-bit or 4096-bit keys using only the stack, other use cases like OpenPGP implementations may need to support arbitrary key sizes running upwards of 16384-bit. For these use cases where key sizes can potentially vary wildly, it'd be nice to choose the integer size at runtime. The Integer and other traits in this crate would permit accepting either. (cc @dignifiedquire)

Internally this would involve implementing core algorithms that operate over slices instead of fixed-size arrays. This could have other benefits: it could reduce the monomorphization penalty for different const generic sizes by having them pass a slice to a function which can operate over many sizes.
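A rough sketch of the slice-based layering (hypothetical function names; the crate's real internals differ): one carrying-addition core operates over slices, and a const-generic wrapper delegates to it, so both stack- and heap-backed types share the algorithm.

```rust
type Limb = u64; // hypothetical stand-in for the crate's Limb type

/// Core limb-wise addition over slices: computes `a + b`, writing the sum
/// into `out` and returning the final carry. Works for any width, so both
/// stack- and heap-backed integer types can call it.
fn adc_slice(a: &[Limb], b: &[Limb], out: &mut [Limb]) -> Limb {
    let mut carry = 0u64;
    for i in 0..a.len() {
        let (sum, c1) = a[i].overflowing_add(b[i]);
        let (sum, c2) = sum.overflowing_add(carry);
        out[i] = sum;
        carry = (c1 as u64) + (c2 as u64);
    }
    carry
}

/// Const-generic wrapper: monomorphized per size, but the actual work
/// happens in the shared slice routine above, limiting code duplication.
fn adc_array<const LIMBS: usize>(
    a: &[Limb; LIMBS],
    b: &[Limb; LIMBS],
) -> ([Limb; LIMBS], Limb) {
    let mut out = [0; LIMBS];
    let carry = adc_slice(a, b, &mut out);
    (out, carry)
}
```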

There are some potential drawbacks: we don't presently unroll loops, but in some cases doing so would generate better code, and slice-based implementations could negatively impact that. I think the solution there would be to duplicate some of that loop-unrolled code between the stack- and heap-backed implementations, so the stack-allocated version always unrolls its loops ahead-of-time, and the heap-backed version loops over a slice.

@tarcieri
Member Author

Another downside I forgot to mention:

The current approach of defining all operations on a newtype for [Limb; LIMBS] allows the compiler to optimize away bounds checks in certain cases because it knows a priori that accesses on literal indexes or indexes whose range can be deduced at compile time will always be in-range.

Implementing operations in terms of &[Limb] could preclude these optimizations, although using #[inline(always)] may be sufficient to ensure they're always applied if possible.

@dignifiedquire
Member

One trick I have seen is delegating from the dynamic versions to the const versions as soon as the size is known.
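That trick could look roughly like this (a hypothetical sketch, not the crate's code): the dynamic entry point matches on the runtime length once, then calls the monomorphized const-generic routine for common widths, keeping a generic slice loop only as a fallback.

```rust
type Limb = u64; // hypothetical stand-in for the crate's Limb type

/// Const-generic core: fully monomorphized, so loops are unrollable.
fn negate<const LIMBS: usize>(a: &mut [Limb; LIMBS]) {
    for limb in a.iter_mut() {
        *limb = !*limb;
    }
}

/// Dynamic entry point: dispatch on the runtime size once, then delegate
/// to the const-generic version for the common widths.
fn negate_dyn(a: &mut [Limb]) {
    match a.len() {
        32 => negate::<32>(a.try_into().expect("length checked")),
        64 => negate::<64>(a.try_into().expect("length checked")),
        _ => {
            // Fallback for uncommon widths: plain slice loop.
            for limb in a.iter_mut() {
                *limb = !*limb;
            }
        }
    }
}
```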

@newpavlov
Member

I don't think we should allow selection of width at runtime. I think rsa would benefit from being generic over U16384 and Box<U16384>, but the width should be explicit in type signatures. Applications which want to support arbitrary key sizes would instantiate several algorithms (e.g. Rsa<U2048>, Rsa<U4096>, Rsa<Box<U8192>>, Rsa<Box<U16384>>) and would switch between them depending on provided key width.
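A sketch of that suggestion (the `Rsa` type here is a hypothetical stand-in for the real `rsa` crate's type, which would be parameterized by crypto-bigint integers rather than a bare const): the application instantiates each monomorphization explicitly and switches on the provided key width via an enum.

```rust
/// Hypothetical stand-in for Rsa<U2048> etc. from the `rsa` crate.
struct Rsa<const BITS: usize>;

impl<const BITS: usize> Rsa<BITS> {
    fn key_bits(&self) -> usize {
        BITS
    }
}

/// An application supporting several key sizes instantiates each
/// monomorphization it needs, boxing the large ones.
enum AnyRsa {
    Rsa2048(Rsa<2048>),
    Rsa4096(Rsa<4096>),
    Rsa8192(Box<Rsa<8192>>),
}

impl AnyRsa {
    /// Select the instantiation matching the provided key width; widths
    /// are explicit in the type signatures, never chosen at runtime.
    fn for_key_bits(bits: usize) -> Option<Self> {
        match bits {
            2048 => Some(AnyRsa::Rsa2048(Rsa)),
            4096 => Some(AnyRsa::Rsa4096(Rsa)),
            8192 => Some(AnyRsa::Rsa8192(Box::new(Rsa))),
            _ => None,
        }
    }

    fn key_bits(&self) -> usize {
        match self {
            AnyRsa::Rsa2048(r) => r.key_bits(),
            AnyRsa::Rsa4096(r) => r.key_bits(),
            AnyRsa::Rsa8192(r) => r.key_bits(),
        }
    }
}
```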

@tarcieri
Member Author

@newpavlov U16384 would make things unacceptably slow for typical RSA key sizes.

Something like Rsa<U2048>, Rsa<U4096>, Rsa<Box<U8192>>, Rsa<Box<U16384>> would monomorphize the code 4 times.

@newpavlov
Member

newpavlov commented Oct 13, 2022

U16384 would make things unacceptably slow for typical RSA key sizes.

For typical sizes application would use Rsa<U2048> or Rsa<U4096>.

would monomorphize the code 4 times

Yes, as intended. The compiler would be able to perform various optimizations that depend on knowing the width at compile time, such as unrolling and using SIMD instructions. Binary size would suffer somewhat, but I don't think that's the main concern for applications which want to support RSA keys up to 16384 bits. Application authors could trade binary size against potential runtime overhead by selecting the number of instantiated algorithms.

@burdges

burdges commented Oct 13, 2022

I'd expect the compiler optimizes away bounds checks if you use, say, T: Borrow<[Limb]> with T = [Limb; LIMBS], and somehow use [Limb]: ToOwned, but then &[Limb] works too, so you should try replacing the LIMBS parameter with a T parameter.

As a rule, a Borrow-like type parameter should generally be preferred over type level sizes. It's maybe problematic that ToOwned::Owned: Borrow<Self> but not BorrowMut<Self>, not quite sure.

It's possible RSA2048 winds up faster as [Limb] under heavy CPU cache loads anyway, but hopefully LLVM knows when to treat [Limb; LIMBS] as [Limb].
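To illustrate the suggestion (hypothetical function, standard-library `Borrow` only): one generic function bounded on `T: Borrow<[Limb]>` accepts both fixed arrays and boxed slices, and when `T = [Limb; LIMBS]` the length is a compile-time constant after monomorphization.

```rust
use std::borrow::Borrow;

type Limb = u64; // hypothetical stand-in for the crate's Limb type

/// One implementation serving both backing stores. When `T` is
/// `[Limb; LIMBS]`, the slice length is statically known after
/// monomorphization (arrays implement Borrow<[T]>); when `T` is
/// `Box<[Limb]>`, the length is known only at runtime.
fn count_nonzero<T: Borrow<[Limb]>>(value: &T) -> usize {
    value.borrow().iter().filter(|&&limb| limb != 0).count()
}
```

Usage: `count_nonzero(&[1u64, 0, 2])` and `count_nonzero(&boxed_slice)` both compile against the same definition.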

@newpavlov
Member

newpavlov commented Oct 13, 2022

As a rule, a Borrow-like type parameter should generally be preferred over type level sizes.

Your reasoning being? The only advantage I see is smaller binary size in cases with arbitrarily sized keys, but performance-wise your rule gives the compiler fewer opportunities to optimize. The most basic one: knowing the size at compile time allows the compiler to eliminate code for processing tails (because it can see that the size is a multiple of 8/16/32/etc.) and, for some functions, even to completely unroll loops, thus eliminating any branching. LLVM is mostly smart enough to decide whether it's worth unrolling loops or processing data in chunks.

UPD: I misread your post; I thought you wanted us to allow RSA to be generic over &[Limb]. Having T be equal to [Limb; LIMBS] provides the compiler all the necessary information for the described optimizations.

@tarcieri
Member Author

@burdges using this as e.g. a type-level parameter of impl blocks requires a blanket impl, e.g.

impl<U: Borrow<[Limb]>, const LIMBS: usize> Mul<U> for UInt<LIMBS> { ... }

...which means any trait that accepts it as a type-level parameter can only impl a given trait once, which is rather restrictive.
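A self-contained sketch of that restriction (hypothetical `UInt` with an elided multiply body, just to show the impl's shape): once the blanket impl exists, coherence rejects any further `Mul` impl for the same type, because the compiler cannot prove the second right-hand-side type will never implement `Borrow<[Limb]>`.

```rust
use std::borrow::Borrow;
use std::ops::Mul;

type Limb = u64; // hypothetical stand-in for the crate's Limb type

struct UInt<const LIMBS: usize>([Limb; LIMBS]);

// Blanket impl: covers [Limb; N], &[Limb], Box<[Limb]>, ... in one go.
impl<U: Borrow<[Limb]>, const LIMBS: usize> Mul<U> for UInt<LIMBS> {
    type Output = UInt<LIMBS>;
    fn mul(self, _rhs: U) -> UInt<LIMBS> {
        // (actual multiplication elided; only the impl's shape matters here)
        self
    }
}

// Coherence now rejects any further, potentially overlapping `Mul` impl:
//
//     impl<const LIMBS: usize> Mul<&UInt<LIMBS>> for UInt<LIMBS> { ... }
//
// fails with E0119, since the compiler cannot rule out `&UInt<LIMBS>`
// implementing `Borrow<[Limb]>` -- hence "only one impl per trait".
```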

@burdges

burdges commented Oct 13, 2022

Is there not a similar issue if U is generic over LIMBS?

@tarcieri
Member Author

In that case, it's a concrete type (UInt) that's generic over LIMBS, not a generic type parameter as part of a blanket impl.

@tarcieri
Member Author

I pushed up an initial take of this in #221
