
Vectorize QuantizeLinear, second stage of DynamicQuantizeLinear #538

Merged: 3 commits into main from quantize-linear-simd, Jan 16, 2025

Conversation

robertknight (Owner)

Use SIMD operations for QuantizeLinear in the f32 -> u8 case, which also forms the second stage of DynamicQuantizeLinear. This makes DQL roughly 3.5x faster under AVX2.
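For reference, this is the computation being vectorized, per the ONNX QuantizeLinear definition y = saturate(round(x / scale) + zero_point). A scalar sketch (the function name and signature are illustrative, not this crate's API):

```rust
/// Scalar model of QuantizeLinear for the f32 -> u8 case. The SIMD
/// kernel in this PR computes the same thing several lanes at a time.
fn quantize_linear_u8(input: &[f32], scale: f32, zero_point: u8, output: &mut [u8]) {
    for (y, &x) in output.iter_mut().zip(input) {
        // ONNX specifies round-half-to-even, which `round_ties_even`
        // implements (stable since Rust 1.77).
        let q = (x / scale).round_ties_even() + zero_point as f32;
        // Saturate: clamp to the u8 range before the narrowing cast.
        *y = q.clamp(0.0, 255.0) as u8;
    }
}
```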

robertknight force-pushed the quantize-linear-simd branch 2 times, most recently from 1f39c34 to e7ee69e (January 16, 2025 08:52)
Add SIMD operations that will be useful for implementing vectorized quantization.

To enable SIMD operations to return SIMD vectors of narrower data types, as needed by `saturating_cast_u8`, a "dummy" impl of `Simd` was added for arrays. This is initially limited to `[u8; N]`.
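A minimal sketch of that shape, with simplified stand-ins for the crate's actual `Simd` and `SimdInt` traits (method and associated-type names here are assumptions):

```rust
// Simplified stand-in for the `Simd` trait; the real trait has many more
// operations. The point is that a plain array can implement it, giving
// `saturating_cast_u8` a narrower vector type to return.
trait Simd: Copy {
    type Elem;
    const LEN: usize;
}

// The "dummy" impl: `[u8; N]` acts as a SIMD vector of N u8 lanes.
impl<const N: usize> Simd for [u8; N] {
    type Elem = u8;
    const LEN: usize = N;
}

// An integer vector type can then narrow into an array-backed vector.
// Hypothetical signature; the crate's `SimdInt` trait differs.
trait SimdInt: Simd<Elem = i32> {
    type NarrowU8: Simd<Elem = u8>;
    fn saturating_cast_u8(self) -> Self::NarrowU8;
}
```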
robertknight marked this pull request as ready for review (January 16, 2025 09:01)
robertknight merged commit 44a3a99 into main on January 16, 2025 (2 checks passed).
robertknight deleted the quantize-linear-simd branch (January 16, 2025 09:04).
robertknight (Owner, Author)

The `SimdInt::saturating_cast_u8` impls for wasm32 and Arm are simple versions that punt the work to the compiler. These might end up needing specialized implementations like the one that exists for AVX2.
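Roughly, the punt-to-the-compiler version looks like a plain lane-wise loop that the optimizer is free to autovectorize (the function name, fixed lane count, and array-based signature are hypothetical):

```rust
/// Portable fallback for a saturating i32 -> u8 narrow: clamp each lane
/// to [0, 255], then cast. A specialized Arm impl would instead use the
/// NEON saturating-narrow intrinsics (the vqmovn/vqmovun family).
fn saturating_cast_u8_fallback(lanes: [i32; 4]) -> [u8; 4] {
    let mut out = [0u8; 4];
    for (o, &x) in out.iter_mut().zip(lanes.iter()) {
        *o = x.clamp(0, 255) as u8;
    }
    out
}
```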
