[wasm] Add WidenUpper and WidenLower SIMD intrins #80117

Merged: 2 commits into dotnet:main, Jan 3, 2023


radekdoulik

This improves performance in string-related areas. Example code:

    > wa-info -d -f Ascii.*WidenFourAsciiBytesToUtf16AndWriteToBuffer dotnet.wasm
    (func corlib_System_Text_Ascii_WidenFourAsciiBytesToUtf16AndWriteToBuffer_char__uint(param $0 i32, $1 i32, $2 i32))
     local.get $0
     i32.eqz
     if
      call mini_llvmonly_throw_nullref_exception
      unreachable

     local.get $0
     local.get $1
     i32x4.splat    [SIMD]
     i16x8.extend.low.i8x16.u    [SIMD]
     v128.store64.lane 0    [SIMD]
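
To make the three SIMD instructions in the disassembly easier to follow, here is a minimal scalar model in C. It is only a sketch of what the instructions compute, not the runtime's implementation: the helper names (`splat_i32x4`, `extend_low_u8_to_u16`, `widen_four_ascii_bytes`) are hypothetical, and it assumes WebAssembly's little-endian lane layout.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of i32x4.splat: copy the 32-bit value into all four
   32-bit lanes of a 16-byte vector, least significant byte first. */
static void splat_i32x4(uint32_t value, uint8_t out[16]) {
    for (int lane = 0; lane < 4; lane++) {
        for (int b = 0; b < 4; b++) {
            out[lane * 4 + b] = (uint8_t)(value >> (8 * b));
        }
    }
}

/* Scalar model of the unsigned "extend low" step: zero-extend the low
   eight 8-bit lanes to eight 16-bit lanes. */
static void extend_low_u8_to_u16(const uint8_t in[16], uint16_t out[8]) {
    for (int i = 0; i < 8; i++) {
        out[i] = (uint16_t)in[i];
    }
}

/* Hypothetical helper mirroring the disassembly: widen the four ASCII
   bytes packed in `value` to four UTF-16 code units at `dest`. */
static void widen_four_ascii_bytes(uint16_t *dest, uint32_t value) {
    uint8_t bytes[16];
    uint16_t wide[8];

    splat_i32x4(value, bytes);
    extend_low_u8_to_u16(bytes, wide);

    /* Model of the 64-bit store of lane 0: only the low four
       16-bit lanes (8 bytes) are written. */
    memcpy(dest, wide, 4 * sizeof(uint16_t));
}
```

For example, calling `widen_four_ascii_bytes(out, 0x64636261u)` writes the code units for `a`, `b`, `c`, `d` to `out`, because the least significant byte of the input becomes the first character.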

It is also visible in the bench sample Json task, where it improves
serialization times:

| measurement | before | after |
|-:|-:|-:|
|         Json, non-ASCII text serialize |     0.2939ms |     0.2174ms |
|                  Json, small serialize |     0.0274ms |     0.0262ms |
|                  Json, large serialize |     7.5466ms |     7.0948ms |

Zero/sign extensions would be enough in place of the zero shifts. I left
the shifts in place, though, as I am not sure whether there is a reason
for them on arm64.
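
The equivalence mentioned here can be illustrated with a small scalar sketch in C. It assumes that "zero shifts" means a widening left shift with a shift amount of zero (an interpretation, not something the description spells out), and both helper names are hypothetical.

```c
#include <stdint.h>

/* Scalar model of a widening left shift on the low eight byte lanes:
   each 8-bit lane is widened to 16 bits, then shifted left by `shift`. */
static void shift_left_widening_lower(const uint8_t in[16], unsigned shift,
                                      uint16_t out[8]) {
    for (int i = 0; i < 8; i++) {
        out[i] = (uint16_t)((uint16_t)in[i] << shift);
    }
}

/* Scalar model of a plain zero extension of the low eight byte lanes. */
static void zero_extend_lower(const uint8_t in[16], uint16_t out[8]) {
    for (int i = 0; i < 8; i++) {
        out[i] = (uint16_t)in[i];
    }
}
```

With `shift == 0` the two helpers produce identical lanes for every input, which is why a plain extension would suffice for widening.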
@radekdoulik radekdoulik added this to the 8.0.0 milestone Jan 3, 2023
@radekdoulik radekdoulik merged commit ae2a442 into dotnet:main Jan 3, 2023
@build-analysis build-analysis bot mentioned this pull request Jan 3, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Feb 3, 2023
3 participants