[JitDiff X64] [xtqqczze] Use Unsafe.BitCast to avoid taking address #865

Open
MihuBot opened this issue Jan 4, 2025 · 3 comments

MihuBot commented Jan 4, 2025

Job completed in 14 minutes 49 seconds.
dotnet/runtime#111091

Diffs

Found 262 files with textual diffs.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 40225646
Total bytes of diff: 40225507
Total bytes of delta: -139 (-0.00 % of base)
Total relative delta: -0.77
    diff is an improvement.
    relative diff is an improvement.


Top file improvements (bytes):
        -139 : System.Private.CoreLib.dasm (-0.00 % of base)

1 total files with Code Size differences (1 improved, 0 regressed), 258 unchanged.

Top method improvements (bytes):
         -67 (-27.46 % of base) : System.Private.CoreLib.dasm - System.SpanHelpers:Fill[System.Nullable`1[int]](byref,ulong,System.Nullable`1[int]) (FullOpts)
         -58 (-42.65 % of base) : System.Private.CoreLib.dasm - System.SpanHelpers:Fill[int](byref,ulong,int) (FullOpts)
          -8 (-3.45 % of base) : System.Private.CoreLib.dasm - System.Array:LastIndexOf[System.Numerics.Vector`1[float]](System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int (FullOpts)
          -4 (-2.13 % of base) : System.Private.CoreLib.dasm - System.Array:IndexOf[System.Numerics.Vector`1[float]](System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int (FullOpts)
          -1 (-0.54 % of base) : System.Private.CoreLib.dasm - System.Array:IndexOf[double](double[],double,int,int):int (FullOpts)
          -1 (-0.44 % of base) : System.Private.CoreLib.dasm - System.Array:LastIndexOf[double](double[],double,int,int):int (FullOpts)

Top method improvements (percentages):
         -58 (-42.65 % of base) : System.Private.CoreLib.dasm - System.SpanHelpers:Fill[int](byref,ulong,int) (FullOpts)
         -67 (-27.46 % of base) : System.Private.CoreLib.dasm - System.SpanHelpers:Fill[System.Nullable`1[int]](byref,ulong,System.Nullable`1[int]) (FullOpts)
          -8 (-3.45 % of base) : System.Private.CoreLib.dasm - System.Array:LastIndexOf[System.Numerics.Vector`1[float]](System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int (FullOpts)
          -4 (-2.13 % of base) : System.Private.CoreLib.dasm - System.Array:IndexOf[System.Numerics.Vector`1[float]](System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int (FullOpts)
          -1 (-0.54 % of base) : System.Private.CoreLib.dasm - System.Array:IndexOf[double](double[],double,int,int):int (FullOpts)
          -1 (-0.44 % of base) : System.Private.CoreLib.dasm - System.Array:LastIndexOf[double](double[],double,int,int):int (FullOpts)

6 total methods with Code Size differences (6 improved, 0 regressed), 233022 unchanged.
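
As a quick sanity check on the summary above, the reported figures can be recomputed from the numbers in this report. The sketch below is only an illustration: the per-method base sizes are taken from the "Total bytes of code" lines in the listings that follow, the short method labels are made up for readability, and reading "Total relative delta" as the sum of the per-method relative deltas is an inference from the numbers, not something the report states.

```python
# Figures copied from this report; names below are illustrative only.
base_total = 40225646
diff_total = 40225507

# (delta bytes, base bytes) for the four methods whose base size
# appears in the disassembly listings.
methods = {
    "Fill[Nullable<int>]": (-67, 244),
    "Fill[int]": (-58, 136),
    "LastIndexOf[Vector<float>]": (-8, 232),
    "IndexOf[Vector<float>]": (-4, 188),
}


def percent_of_base(delta, base):
    """Relative size change of one method, as a percentage of its base size."""
    return round(100.0 * delta / base, 2)


total_delta = diff_total - base_total
per_method = {name: percent_of_base(d, b) for name, (d, b) in methods.items()}

# Percentages as reported for all six changed methods.
reported_percent = [-27.46, -42.65, -3.45, -2.13, -0.54, -0.44]
# Assumption: the total relative delta is the sum of per-method relative deltas.
total_relative = round(sum(reported_percent) / 100.0, 2)

# Byte deltas as reported for all six changed methods.
reported_bytes = [-67, -58, -8, -4, -1, -1]

print(total_delta, sum(reported_bytes), per_method, total_relative)
```

Under these assumptions the recomputed values agree with the report: the six per-method byte deltas add up to the total delta, and the summed percentages round to the reported total relative delta.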

--------------------------------------------------------------------------------

MihuBot commented Jan 4, 2025

Top method improvements

-67 (-27.46 % of base) - System.SpanHelpers:Fill[System.Nullable`1[int]](byref,ulong,System.Nullable`1[int])
 ; Assembly listing for method System.SpanHelpers:Fill[System.Nullable`1[int]](byref,ulong,System.Nullable`1[int]) (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX512 - Unix
 ; FullOpts code
 ; optimized code
 ; rbp based frame
 ; fully interruptible
 ; No PGO data
-; 0 inlinees with PGO data; 2 single block inlinees; 1 inlinees without PGO data
+; 0 inlinees with PGO data; 2 single block inlinees; 2 inlinees without PGO data
 ; Final local variable assignments
 ;
-;  V00 arg0         [V00,T00] ( 18, 38   )   byref  ->  rdi         single-def
-;  V01 arg1         [V01,T07] ( 10,  6.50)    long  ->  rsi         single-def
-;  V02 arg2         [V02,T01] ( 18, 38   )  struct ( 8) rdx         single-def <System.Nullable`1[int]>
-;  V03 loc0         [V03,T04] ( 12, 20   )    long  ->  rax        
-;  V04 loc1         [V04    ] (  2,  1   )  struct ( 8) [rbp-0x08]  do-not-enreg[SF] ld-addr-op <System.Nullable`1[int]>
-;  V05 loc2         [V05,T15] (  5,  9.50)  simd32  ->  mm0         ld-addr-op <System.Numerics.Vector`1[ubyte]>
-;  V06 loc3         [V06,T06] (  5,  9.50)   byref  ->  rdi         single-def
-;  V07 loc4         [V07,T11] (  4,  2   )    long  ->  rax        
-;  V08 loc5         [V08,T08] (  2,  4.50)    long  ->  rcx        
-;  V09 loc6         [V09,T03] (  7, 21   )    long  ->  rdx        
-;* V10 loc7         [V10    ] (  0,  0   )  simd16  ->  zero-ref    <System.Runtime.Intrinsics.Vector128`1[ubyte]>
-;  V11 loc8         [V11,T09] (  2,  4.50)    long  ->  rcx        
-;# V12 OutArgs      [V12    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
-;  V13 tmp1         [V13,T05] (  2, 16   )    long  ->  rax         "dup spill"
-;* V14 tmp2         [V14    ] (  0,  0   )  simd32  ->  zero-ref    ld-addr-op "NewObj constructor temp" <System.Numerics.Vector`1[ulong]>
-;* V15 tmp3         [V15    ] (  0,  0   )  simd32  ->  zero-ref   
-;* V16 tmp4         [V16    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
-;* V17 tmp5         [V17    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
-;* V18 tmp6         [V18    ] (  0,  0   )  simd32  ->  zero-ref    "Inlining Arg" <System.Numerics.Vector`1[ulong]>
-;  V19 tmp7         [V19,T13] (  2,  1   )   ubyte  ->  [rbp-0x08]  do-not-enreg[] "field V04.hasValue (fldOffset=0x0)" P-DEP
-;  V20 tmp8         [V20,T14] (  2,  1   )     int  ->  [rbp-0x04]  do-not-enreg[] "field V04.value (fldOffset=0x4)" P-DEP
-;  V21 cse0         [V21,T02] (  9, 36   )    long  ->   r8         "CSE #01: aggressive"
-;  V22 cse1         [V22,T10] (  5,  2.50)    long  ->  rcx         "CSE #02: moderate"
-;  V23 cse2         [V23,T12] (  3,  1.50)    long  ->  rcx         "CSE #03: moderate"
+;  V00 arg0         [V00,T00] ( 17, 37.50)   byref  ->  rdi         single-def
+;  V01 arg1         [V01,T05] (  8,  7.50)    long  ->  rsi         single-def
+;  V02 arg2         [V02,T01] ( 17, 37.50)  struct ( 8) rdx         single-def <System.Nullable`1[int]>
+;  V03 loc0         [V03,T03] ( 12, 20.50)    long  ->  rax        
+;* V04 loc1         [V04    ] (  0,  0   )  simd32  ->  zero-ref    ld-addr-op <System.Numerics.Vector`1[ubyte]>
+;* V05 loc2         [V05    ] (  0,  0   )   byref  ->  zero-ref   
+;* V06 loc3         [V06    ] (  0,  0   )    long  ->  zero-ref   
+;* V07 loc4         [V07    ] (  0,  0   )    long  ->  zero-ref   
+;* V08 loc5         [V08    ] (  0,  0   )    long  ->  zero-ref   
+;* V09 loc6         [V09    ] (  0,  0   )  simd16  ->  zero-ref    <System.Runtime.Intrinsics.Vector128`1[ubyte]>
+;  V10 loc7         [V10,T06] (  2,  4.50)    long  ->  rcx        
+;# V11 OutArgs      [V11    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;  V12 tmp1         [V12,T04] (  2, 16   )    long  ->  rax         "dup spill"
+;* V13 tmp2         [V13    ] (  0,  0   )  simd32  ->  zero-ref    ld-addr-op "NewObj constructor temp" <System.Numerics.Vector`1[ulong]>
+;* V14 tmp3         [V14    ] (  0,  0   )  simd32  ->  zero-ref   
+;* V15 tmp4         [V15    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
+;* V16 tmp5         [V16    ] (  0,  0   )  struct ( 8) zero-ref    ld-addr-op "Inline ldloca(s) first use temp" <System.Nullable`1[int]>
+;* V17 tmp6         [V17    ] (  0,  0   )  struct ( 8) zero-ref    do-not-enreg[SF] ld-addr-op "Inlining Arg" <System.Nullable`1[int]>
+;* V18 tmp7         [V18    ] (  0,  0   )    long  ->  zero-ref    ld-addr-op "Inline ldloca(s) first use temp"
+;* V19 tmp8         [V19    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;* V20 tmp9         [V20    ] (  0,  0   )  simd32  ->  zero-ref    "Inlining Arg" <System.Numerics.Vector`1[ulong]>
+;* V21 tmp10        [V21    ] (  0,  0   )   ubyte  ->  zero-ref    "field V16.hasValue (fldOffset=0x0)" P-INDEP
+;* V22 tmp11        [V22    ] (  0,  0   )     int  ->  zero-ref    "field V16.value (fldOffset=0x4)" P-INDEP
+;* V23 tmp12        [V23    ] (  0,  0   )   ubyte  ->  zero-ref    do-not-enreg[] "field V17.hasValue (fldOffset=0x0)" P-DEP
+;* V24 tmp13        [V24    ] (  0,  0   )     int  ->  zero-ref    do-not-enreg[] "field V17.value (fldOffset=0x4)" P-DEP
+;  V25 cse0         [V25,T02] (  9, 36   )    long  ->   r8         "CSE #01: aggressive"
+;  V26 cse1         [V26,T07] (  5,  2.50)    long  ->  rcx         "CSE #02: aggressive"
+;  V27 cse2         [V27,T08] (  3,  1.50)    long  ->  rcx         "CSE #03: moderate"
 ;
-; Lcl frame size = 16
+; Lcl frame size = 0
 
 G_M56207_IG01:
        push     rbp
-       sub      rsp, 16
-       lea      rbp, [rsp+0x10]
-						;; size=10 bbWeight=1 PerfScore 1.75
+       mov      rbp, rsp
+						;; size=4 bbWeight=1 PerfScore 1.25
 G_M56207_IG02:
        cmp      rsi, 4
-       jae      G_M56207_IG08
-						;; size=10 bbWeight=1 PerfScore 1.25
-G_M56207_IG03:
+       jae      G_M56207_IG12
        xor      eax, eax
        cmp      rsi, 8
        jb       SHORT G_M56207_IG05
+						;; size=18 bbWeight=1 PerfScore 2.75
+G_M56207_IG03:
        mov      rcx, rsi
        and      rcx, -8
-       align    [0 bytes for IG04]
-						;; size=15 bbWeight=0.50 PerfScore 1.00
+       align    [3 bytes for IG04]
+						;; size=10 bbWeight=0.50 PerfScore 0.38
 G_M56207_IG04:
        lea      r8, [8*rax]
        mov      qword ptr [rdi+r8], rdx
        mov      qword ptr [rdi+r8+0x08], rdx
        mov      qword ptr [rdi+r8+0x10], rdx
        mov      qword ptr [rdi+r8+0x18], rdx
        mov      qword ptr [rdi+r8+0x20], rdx
        mov      qword ptr [rdi+r8+0x28], rdx
        mov      qword ptr [rdi+r8+0x30], rdx
        mov      qword ptr [rdi+r8+0x38], rdx
        add      rax, 8
        cmp      rax, rcx
        jb       SHORT G_M56207_IG04
 						;; size=56 bbWeight=4 PerfScore 40.00
 G_M56207_IG05:
        test     sil, 4
-       je       SHORT G_M56207_IG06
+       je       SHORT G_M56207_IG07
+						;; size=6 bbWeight=1 PerfScore 1.25
+G_M56207_IG06:
        lea      rcx, [8*rax]
        mov      qword ptr [rdi+rcx], rdx
        mov      qword ptr [rdi+rcx+0x08], rdx
        mov      qword ptr [rdi+rcx+0x10], rdx
        mov      qword ptr [rdi+rcx+0x18], rdx
        add      rax, 4
-						;; size=37 bbWeight=0.50 PerfScore 3.00
-G_M56207_IG06:
+						;; size=31 bbWeight=0.50 PerfScore 2.38
+G_M56207_IG07:
        test     sil, 2
-       je       SHORT G_M56207_IG07
+       je       SHORT G_M56207_IG09
+						;; size=6 bbWeight=1 PerfScore 1.25
+G_M56207_IG08:
        lea      rcx, [8*rax]
        mov      qword ptr [rdi+rcx], rdx
        mov      qword ptr [rdi+rcx+0x08], rdx
        add      rax, 2
-						;; size=27 bbWeight=0.50 PerfScore 2.00
-G_M56207_IG07:
-       test     sil, 1
-       je       SHORT G_M56207_IG12
-       mov      qword ptr [rdi+8*rax], rdx
-       jmp      SHORT G_M56207_IG12
-       align    [0 bytes for IG09]
-						;; size=12 bbWeight=0.50 PerfScore 2.12
-G_M56207_IG08:
-       mov      qword ptr [rbp-0x08], rdx
-       vpbroadcastq ymm0, qword ptr [rbp-0x08]
-       lea      rax, [8*rsi]
-       mov      rcx, rax
-       and      rcx, -64
-       xor      edx, edx
-       cmp      rsi, 8
-       jb       SHORT G_M56207_IG10
-						;; size=33 bbWeight=0.50 PerfScore 3.75
+						;; size=21 bbWeight=0.50 PerfScore 1.38
 G_M56207_IG09:
-       vmovups  ymmword ptr [rdi+rdx], ymm0
-       vmovups  ymmword ptr [rdi+rdx+0x20], ymm0
-       add      rdx, 64
-       cmp      rdx, rcx
-       jb       SHORT G_M56207_IG09
-						;; size=20 bbWeight=4 PerfScore 22.00
-G_M56207_IG10:
-       test     al, 32
+       test     sil, 1
        je       SHORT G_M56207_IG11
-       vmovups  ymmword ptr [rdi+rdx], ymm0
-						;; size=9 bbWeight=0.50 PerfScore 1.62
+						;; size=6 bbWeight=1 PerfScore 1.25
+G_M56207_IG10:
+       mov      qword ptr [rdi+8*rax], rdx
+						;; size=4 bbWeight=0.50 PerfScore 0.50
 G_M56207_IG11:
-       vmovups  ymmword ptr [rdi+rax-0x20], ymm0
-						;; size=6 bbWeight=0.50 PerfScore 1.00
-G_M56207_IG12:
-       vzeroupper 
-       add      rsp, 16
        pop      rbp
        ret      
-						;; size=9 bbWeight=1 PerfScore 2.75
+						;; size=2 bbWeight=1 PerfScore 1.50
+G_M56207_IG12:
+       mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowNotSupportedException()
+       call     [rax]System.ThrowHelper:ThrowNotSupportedException()
+       int3     
+						;; size=13 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 244, prolog size 10, PerfScore 82.25, instruction count 63, allocated bytes for code 244 (MethodHash=b4fe2470) for method System.SpanHelpers:Fill[System.Nullable`1[int]](byref,ulong,System.Nullable`1[int]) (FullOpts)
+; Total bytes of code 177, prolog size 4, PerfScore 53.88, instruction count 44, allocated bytes for code 177 (MethodHash=b4fe2470) for method System.SpanHelpers:Fill[System.Nullable`1[int]](byref,ulong,System.Nullable`1[int]) (FullOpts)
 ; ============================================================
-58 (-42.65 % of base) - System.SpanHelpers:Fill[int](byref,ulong,int)
 ; Assembly listing for method System.SpanHelpers:Fill[int](byref,ulong,int) (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX512 - Unix
 ; FullOpts code
 ; optimized code
 ; rbp based frame
 ; fully interruptible
 ; No PGO data
-; 0 inlinees with PGO data; 2 single block inlinees; 1 inlinees without PGO data
+; 0 inlinees with PGO data; 2 single block inlinees; 2 inlinees without PGO data
 ; Final local variable assignments
 ;
-;  V00 arg0         [V00,T02] ( 10,  6   )   byref  ->  rdi         single-def
-;  V01 arg1         [V01,T04] (  8,  5.50)    long  ->  rsi         single-def
-;  V02 arg2         [V02,T03] ( 10,  6   )     int  ->  rdx         single-def
-;  V03 loc0         [V03,T05] ( 12,  6   )    long  ->  rax        
-;* V04 loc1         [V04    ] (  0,  0   )     int  ->  zero-ref    ld-addr-op
-;  V05 loc2         [V05,T08] (  5,  9.50)  simd32  ->  mm0         ld-addr-op <System.Numerics.Vector`1[ubyte]>
-;  V06 loc3         [V06,T01] (  5,  9.50)   byref  ->  rdi         single-def
-;  V07 loc4         [V07,T07] (  4,  2   )    long  ->  rax        
-;  V08 loc5         [V08,T06] (  2,  4.50)    long  ->  rcx        
-;  V09 loc6         [V09,T00] (  7, 21   )    long  ->  rdx        
-;* V10 loc7         [V10    ] (  0,  0   )  simd16  ->  zero-ref    <System.Runtime.Intrinsics.Vector128`1[ubyte]>
-;* V11 loc8         [V11    ] (  0,  0   )    long  ->  zero-ref   
-;# V12 OutArgs      [V12    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
-;* V13 tmp1         [V13    ] (  0,  0   )    long  ->  zero-ref    "dup spill"
-;* V14 tmp2         [V14    ] (  0,  0   )  simd32  ->  zero-ref    ld-addr-op "NewObj constructor temp" <System.Numerics.Vector`1[uint]>
-;* V15 tmp3         [V15    ] (  0,  0   )  simd32  ->  zero-ref   
-;* V16 tmp4         [V16    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
-;* V17 tmp5         [V17    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
-;* V18 tmp6         [V18    ] (  0,  0   )  simd32  ->  zero-ref    "Inlining Arg" <System.Numerics.Vector`1[uint]>
+;  V00 arg0         [V00,T01] (  9,  5.50)   byref  ->  rdi         single-def
+;  V01 arg1         [V01,T00] (  6,  6   )    long  ->  rsi         single-def
+;  V02 arg2         [V02,T02] (  9,  5.50)     int  ->  rdx         single-def
+;  V03 loc0         [V03,T03] ( 12,  6.50)    long  ->  rax        
+;* V04 loc1         [V04    ] (  0,  0   )  simd32  ->  zero-ref    ld-addr-op <System.Numerics.Vector`1[ubyte]>
+;* V05 loc2         [V05    ] (  0,  0   )   byref  ->  zero-ref   
+;* V06 loc3         [V06    ] (  0,  0   )    long  ->  zero-ref   
+;* V07 loc4         [V07    ] (  0,  0   )    long  ->  zero-ref   
+;* V08 loc5         [V08    ] (  0,  0   )    long  ->  zero-ref   
+;* V09 loc6         [V09    ] (  0,  0   )  simd16  ->  zero-ref    <System.Runtime.Intrinsics.Vector128`1[ubyte]>
+;* V10 loc7         [V10    ] (  0,  0   )    long  ->  zero-ref   
+;# V11 OutArgs      [V11    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;* V12 tmp1         [V12    ] (  0,  0   )    long  ->  zero-ref    "dup spill"
+;* V13 tmp2         [V13    ] (  0,  0   )  simd32  ->  zero-ref    ld-addr-op "NewObj constructor temp" <System.Numerics.Vector`1[uint]>
+;* V14 tmp3         [V14    ] (  0,  0   )  simd32  ->  zero-ref   
+;* V15 tmp4         [V15    ] (  0,  0   )   ubyte  ->  zero-ref    "Inline return value spill temp"
+;* V16 tmp5         [V16    ] (  0,  0   )     int  ->  zero-ref    ld-addr-op "Inlining Arg"
+;* V17 tmp6         [V17    ] (  0,  0   )     int  ->  zero-ref    "Inlining Arg"
+;* V18 tmp7         [V18    ] (  0,  0   )  simd32  ->  zero-ref    "Inlining Arg" <System.Numerics.Vector`1[uint]>
 ;
 ; Lcl frame size = 0
 
 G_M11887_IG01:
        push     rbp
        mov      rbp, rsp
 						;; size=4 bbWeight=1 PerfScore 1.25
 G_M11887_IG02:
        cmp      rsi, 8
-       jae      SHORT G_M11887_IG06
-						;; size=6 bbWeight=1 PerfScore 1.25
-G_M11887_IG03:
+       jae      SHORT G_M11887_IG09
        xor      eax, eax
        test     sil, 4
        je       SHORT G_M11887_IG04
+						;; size=14 bbWeight=1 PerfScore 2.75
+G_M11887_IG03:
        mov      dword ptr [rdi+4*rax], edx
        mov      dword ptr [rdi+4*rax+0x04], edx
        mov      dword ptr [rdi+4*rax+0x08], edx
        mov      dword ptr [rdi+4*rax+0x0C], edx
        add      rax, 4
-						;; size=27 bbWeight=0.50 PerfScore 2.88
+						;; size=19 bbWeight=0.50 PerfScore 2.12
 G_M11887_IG04:
        test     sil, 2
-       je       SHORT G_M11887_IG05
+       je       SHORT G_M11887_IG06
+						;; size=6 bbWeight=1 PerfScore 1.25
+G_M11887_IG05:
        mov      dword ptr [rdi+4*rax], edx
        mov      dword ptr [rdi+4*rax+0x04], edx
        add      rax, 2
-						;; size=17 bbWeight=0.50 PerfScore 1.75
-G_M11887_IG05:
-       test     sil, 1
-       je       SHORT G_M11887_IG10
-       mov      dword ptr [rdi+4*rax], edx
-       jmp      SHORT G_M11887_IG10
-       align    [2 bytes for IG07]
-						;; size=13 bbWeight=0.50 PerfScore 2.12
+						;; size=11 bbWeight=0.50 PerfScore 1.12
 G_M11887_IG06:
-       vpbroadcastd ymm0, edx
-       lea      rax, [4*rsi]
-       mov      rcx, rax
-       and      rcx, -64
-       xor      edx, edx
-       cmp      rsi, 16
-       jb       SHORT G_M11887_IG08
-						;; size=29 bbWeight=0.50 PerfScore 2.25
+       test     sil, 1
+       je       SHORT G_M11887_IG08
+						;; size=6 bbWeight=1 PerfScore 1.25
 G_M11887_IG07:
-       vmovups  ymmword ptr [rdi+rdx], ymm0
-       vmovups  ymmword ptr [rdi+rdx+0x20], ymm0
-       add      rdx, 64
-       cmp      rdx, rcx
-       jb       SHORT G_M11887_IG07
-						;; size=20 bbWeight=4 PerfScore 22.00
+       mov      dword ptr [rdi+4*rax], edx
+						;; size=3 bbWeight=0.50 PerfScore 0.50
 G_M11887_IG08:
-       test     al, 32
-       je       SHORT G_M11887_IG09
-       vmovups  ymmword ptr [rdi+rdx], ymm0
-						;; size=9 bbWeight=0.50 PerfScore 1.62
-G_M11887_IG09:
-       vmovups  ymmword ptr [rdi+rax-0x20], ymm0
-						;; size=6 bbWeight=0.50 PerfScore 1.00
-G_M11887_IG10:
-       vzeroupper 
        pop      rbp
        ret      
-						;; size=5 bbWeight=1 PerfScore 2.50
+						;; size=2 bbWeight=1 PerfScore 1.50
+G_M11887_IG09:
+       mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowNotSupportedException()
+       call     [rax]System.ThrowHelper:ThrowNotSupportedException()
+       int3     
+						;; size=13 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 136, prolog size 4, PerfScore 38.62, instruction count 41, allocated bytes for code 136 (MethodHash=2dc3d190) for method System.SpanHelpers:Fill[int](byref,ulong,int) (FullOpts)
+; Total bytes of code 78, prolog size 4, PerfScore 11.75, instruction count 25, allocated bytes for code 78 (MethodHash=2dc3d190) for method System.SpanHelpers:Fill[int](byref,ulong,int) (FullOpts)
 ; ============================================================
-8 (-3.45 % of base) - System.Array:LastIndexOf[System.Numerics.Vector`1[float]](System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int
 ; Assembly listing for method System.Array:LastIndexOf[System.Numerics.Vector`1[float]](System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX512 - Unix
 ; FullOpts code
 ; optimized code
 ; rbp based frame
-; partially interruptible
+; fully interruptible
 ; No PGO data
 ; 0 inlinees with PGO data; 2 single block inlinees; 0 inlinees without PGO data
 ; Final local variable assignments
 ;
 ;  V00 arg0         [V00,T01] (  5,  4.50)     ref  ->  r15         class-hnd single-def <System.Numerics.Vector`1[float][]>
-;  V01 arg1         [V01,T05] (  1,  0.50)  simd32  ->  [rbp+0x10]  ld-addr-op single-def <System.Numerics.Vector`1[float]>
+;  V01 arg1         [V01,T05] (  1,  0.50)  simd32  ->  [rbp+0x10]  single-def <System.Numerics.Vector`1[float]>
 ;  V02 arg2         [V02,T00] (  7,  4.50)     int  ->  rbx         single-def
 ;  V03 arg3         [V03,T02] (  6,  4   )     int  ->  r14         single-def
 ;* V04 loc0         [V04    ] (  0,  0   )     int  ->  zero-ref   
 ;* V05 loc1         [V05    ] (  0,  0   )     int  ->  zero-ref   
 ;* V06 loc2         [V06    ] (  0,  0   )     int  ->  zero-ref   
 ;* V07 loc3         [V07    ] (  0,  0   )     int  ->  zero-ref   
 ;* V08 loc4         [V08    ] (  0,  0   )     int  ->  zero-ref   
 ;* V09 loc5         [V09    ] (  0,  0   )     int  ->  zero-ref   
 ;* V10 loc6         [V10    ] (  0,  0   )     int  ->  zero-ref   
 ;* V11 loc7         [V11    ] (  0,  0   )     int  ->  zero-ref   
-;  V12 OutArgs      [V12    ] (  1,  1   )  struct (32) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;# V12 OutArgs      [V12    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
 ;  V13 tmp1         [V13,T04] (  2,  2   )     ref  ->  rdi         single-def "argument with side effect"
 ;  V14 cse0         [V14,T03] (  3,  2.50)     int  ->  rdi         "CSE #01: aggressive"
 ;
-; Lcl frame size = 40
+; Lcl frame size = 8
 
 G_M26696_IG01:
        push     rbp
        push     r15
        push     r14
        push     rbx
-       sub      rsp, 40
-       lea      rbp, [rsp+0x40]
+       push     rax
+       lea      rbp, [rsp+0x20]
        mov      r15, rdi
        mov      ebx, esi
        mov      r14d, edx
-						;; size=23 bbWeight=1 PerfScore 5.50
+						;; size=20 bbWeight=1 PerfScore 6.25
 G_M26696_IG02:
        test     r15, r15
-       je       G_M26696_IG12
+       je       G_M26696_IG13
        mov      edi, dword ptr [r15+0x08]
        test     edi, edi
-       je       SHORT G_M26696_IG06
+       je       SHORT G_M26696_IG07
 						;; size=17 bbWeight=1 PerfScore 4.50
 G_M26696_IG03:
        cmp      edi, ebx
-       jbe      G_M26696_IG11
+       jbe      G_M26696_IG12
        mov      edi, ebx
        sub      edi, r14d
        inc      edi
        or       edi, r14d
-       jl       G_M26696_IG10
+       jl       SHORT G_M26696_IG11
        mov      rdi, 0xD1FFAB1E      ; global ptr
        test     byte  ptr [rdi], 1
-       je       SHORT G_M26696_IG09
-						;; size=39 bbWeight=0.50 PerfScore 3.75
+       je       SHORT G_M26696_IG10
+						;; size=35 bbWeight=0.50 PerfScore 3.75
 G_M26696_IG04:
        mov      rdi, 0xD1FFAB1E      ; data for System.Collections.Generic.EqualityComparer`1[System.Numerics.Vector`1[float]]:<Default>k__BackingField
        mov      rdi, gword ptr [rdi]
+						;; size=13 bbWeight=0.50 PerfScore 1.12
+G_M26696_IG05:
        vmovups  ymm0, ymmword ptr [rbp+0x10]
-       vmovups  ymmword ptr [rsp], ymm0
+       vmovups  ymmword ptr [rbp+0x10], ymm0
        mov      rsi, r15
        mov      edx, ebx
        mov      ecx, r14d
        mov      rax, 0xD1FFAB1E      ; code for System.Collections.Generic.GenericEqualityComparer`1[System.Numerics.Vector`1[float]]:LastIndexOf(System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int:this
-       call     [rax]System.Collections.Generic.GenericEqualityComparer`1[System.Numerics.Vector`1[float]]:LastIndexOf(System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int:this
-       nop      
-						;; size=44 bbWeight=0.50 PerfScore 5.75
-G_M26696_IG05:
-       add      rsp, 40
+						;; size=28 bbWeight=0.50 PerfScore 3.00
+G_M26696_IG06:
+       add      rsp, 8
        pop      rbx
        pop      r14
        pop      r15
        pop      rbp
-       ret      
-						;; size=11 bbWeight=0.50 PerfScore 1.62
-G_M26696_IG06:
-       cmp      ebx, -1
-       je       SHORT G_M26696_IG07
-       test     ebx, ebx
-       jne      SHORT G_M26696_IG11
-						;; size=9 bbWeight=0.50 PerfScore 1.25
+       tail.jmp [rax]System.Collections.Generic.GenericEqualityComparer`1[System.Numerics.Vector`1[float]]:LastIndexOf(System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int:this
+						;; size=13 bbWeight=0.50 PerfScore 2.12
 G_M26696_IG07:
+       cmp      ebx, -1
+       je       SHORT G_M26696_IG08
+       test     ebx, ebx
+       jne      SHORT G_M26696_IG12
+						;; size=9 bbWeight=0.50 PerfScore 1.25
+G_M26696_IG08:
        test     r14d, r14d
-       jne      SHORT G_M26696_IG10
+       jne      SHORT G_M26696_IG11
        mov      eax, -1
 						;; size=10 bbWeight=0.50 PerfScore 0.75
-G_M26696_IG08:
-       add      rsp, 40
+G_M26696_IG09:
+       add      rsp, 8
        pop      rbx
        pop      r14
        pop      r15
        pop      rbp
        ret      
 						;; size=11 bbWeight=0.50 PerfScore 1.62
-G_M26696_IG09:
+G_M26696_IG10:
        mov      rdi, 0xD1FFAB1E      ; System.Collections.Generic.EqualityComparer`1[System.Numerics.Vector`1[float]]
        mov      rax, 0xD1FFAB1E      ; code for CORINFO_HELP_GET_GCSTATIC_BASE
        call     [rax]CORINFO_HELP_GET_GCSTATIC_BASE
        jmp      SHORT G_M26696_IG04
 						;; size=24 bbWeight=0 PerfScore 0.00
-G_M26696_IG10:
+G_M26696_IG11:
        mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowCountArgumentOutOfRange_ArgumentOutOfRange_Count()
        call     [rax]System.ThrowHelper:ThrowCountArgumentOutOfRange_ArgumentOutOfRange_Count()
        int3     
 						;; size=13 bbWeight=0 PerfScore 0.00
-G_M26696_IG11:
+G_M26696_IG12:
        mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowStartIndexArgumentOutOfRange_ArgumentOutOfRange_IndexMustBeLess()
        call     [rax]System.ThrowHelper:ThrowStartIndexArgumentOutOfRange_ArgumentOutOfRange_IndexMustBeLess()
        int3     
 						;; size=13 bbWeight=0 PerfScore 0.00
-G_M26696_IG12:
+G_M26696_IG13:
        mov      edi, 2
        mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowArgumentNullException(int)
        call     [rax]System.ThrowHelper:ThrowArgumentNullException(int)
        int3     
 						;; size=18 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 232, prolog size 15, PerfScore 24.75, instruction count 67, allocated bytes for code 232 (MethodHash=a95397b7) for method System.Array:LastIndexOf[System.Numerics.Vector`1[float]](System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int (FullOpts)
+; Total bytes of code 224, prolog size 20, PerfScore 24.38, instruction count 65, allocated bytes for code 224 (MethodHash=a95397b7) for method System.Array:LastIndexOf[System.Numerics.Vector`1[float]](System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int (FullOpts)
 ; ============================================================
-4 (-2.13 % of base) - System.Array:IndexOf[System.Numerics.Vector`1[float]](System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int
 ; Assembly listing for method System.Array:IndexOf[System.Numerics.Vector`1[float]](System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX512 - Unix
 ; FullOpts code
 ; optimized code
 ; rbp based frame
-; partially interruptible
+; fully interruptible
 ; No PGO data
 ; 0 inlinees with PGO data; 2 single block inlinees; 0 inlinees without PGO data
 ; Final local variable assignments
 ;
 ;  V00 arg0         [V00,T00] (  5,  5   )     ref  ->  rbx         class-hnd single-def <System.Numerics.Vector`1[float][]>
-;  V01 arg1         [V01,T05] (  1,  1   )  simd32  ->  [rbp+0x10]  ld-addr-op single-def <System.Numerics.Vector`1[float]>
+;  V01 arg1         [V01,T05] (  1,  1   )  simd32  ->  [rbp+0x10]  single-def <System.Numerics.Vector`1[float]>
 ;  V02 arg2         [V02,T01] (  5,  5   )     int  ->  r15         single-def
 ;  V03 arg3         [V03,T02] (  4,  4   )     int  ->  r14         single-def
 ;* V04 loc0         [V04    ] (  0,  0   )     int  ->  zero-ref   
 ;* V05 loc1         [V05    ] (  0,  0   )     int  ->  zero-ref   
 ;* V06 loc2         [V06    ] (  0,  0   )     int  ->  zero-ref   
 ;* V07 loc3         [V07    ] (  0,  0   )     int  ->  zero-ref   
-;  V08 OutArgs      [V08    ] (  1,  1   )  struct (32) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
+;# V08 OutArgs      [V08    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
 ;  V09 tmp1         [V09,T03] (  2,  4   )     ref  ->  rdi         single-def "argument with side effect"
 ;  V10 cse0         [V10,T04] (  3,  3   )     int  ->  rdi         "CSE #01: aggressive"
 ;
-; Lcl frame size = 40
+; Lcl frame size = 8
 
 G_M52482_IG01:
        push     rbp
        push     r15
        push     r14
        push     rbx
-       sub      rsp, 40
-       lea      rbp, [rsp+0x40]
+       push     rax
+       lea      rbp, [rsp+0x20]
        mov      rbx, rdi
        mov      r15d, esi
        mov      r14d, edx
-						;; size=24 bbWeight=1 PerfScore 5.50
+						;; size=21 bbWeight=1 PerfScore 6.25
 G_M52482_IG02:
        test     rbx, rbx
-       je       G_M52482_IG08
+       je       G_M52482_IG09
        mov      edi, dword ptr [rbx+0x08]
        cmp      edi, r15d
-       jb       SHORT G_M52482_IG07
+       jb       SHORT G_M52482_IG08
        sub      edi, r15d
        cmp      edi, r14d
-       jb       SHORT G_M52482_IG06
+       jb       SHORT G_M52482_IG07
        mov      rdi, 0xD1FFAB1E      ; global ptr
        test     byte  ptr [rdi], 1
-       je       SHORT G_M52482_IG05
+       je       SHORT G_M52482_IG06
 						;; size=40 bbWeight=1 PerfScore 10.25
 G_M52482_IG03:
        mov      rdi, 0xD1FFAB1E      ; data for System.Collections.Generic.EqualityComparer`1[System.Numerics.Vector`1[float]]:<Default>k__BackingField
        mov      rdi, gword ptr [rdi]
+						;; size=13 bbWeight=1 PerfScore 2.25
+G_M52482_IG04:
        vmovups  ymm0, ymmword ptr [rbp+0x10]
-       vmovups  ymmword ptr [rsp], ymm0
+       vmovups  ymmword ptr [rbp+0x10], ymm0
        mov      rsi, rbx
        mov      edx, r15d
        mov      ecx, r14d
        mov      rax, 0xD1FFAB1E      ; code for System.Collections.Generic.GenericEqualityComparer`1[System.Numerics.Vector`1[float]]:IndexOf(System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int:this
-       call     [rax]System.Collections.Generic.GenericEqualityComparer`1[System.Numerics.Vector`1[float]]:IndexOf(System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int:this
-       nop      
-						;; size=45 bbWeight=1 PerfScore 11.50
-G_M52482_IG04:
-       add      rsp, 40
+						;; size=29 bbWeight=1 PerfScore 6.00
+G_M52482_IG05:
+       add      rsp, 8
        pop      rbx
        pop      r14
        pop      r15
        pop      rbp
-       ret      
-						;; size=11 bbWeight=1 PerfScore 3.25
-G_M52482_IG05:
+       tail.jmp [rax]System.Collections.Generic.GenericEqualityComparer`1[System.Numerics.Vector`1[float]]:IndexOf(System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int:this
+						;; size=13 bbWeight=1 PerfScore 4.25
+G_M52482_IG06:
        mov      rdi, 0xD1FFAB1E      ; System.Collections.Generic.EqualityComparer`1[System.Numerics.Vector`1[float]]
        mov      rax, 0xD1FFAB1E      ; code for CORINFO_HELP_GET_GCSTATIC_BASE
        call     [rax]CORINFO_HELP_GET_GCSTATIC_BASE
        jmp      SHORT G_M52482_IG03
 						;; size=24 bbWeight=0 PerfScore 0.00
-G_M52482_IG06:
+G_M52482_IG07:
        mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowCountArgumentOutOfRange_ArgumentOutOfRange_Count()
        call     [rax]System.ThrowHelper:ThrowCountArgumentOutOfRange_ArgumentOutOfRange_Count()
        int3     
 						;; size=13 bbWeight=0 PerfScore 0.00
-G_M52482_IG07:
+G_M52482_IG08:
        mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowStartIndexArgumentOutOfRange_ArgumentOutOfRange_IndexMustBeLessOrEqual()
        call     [rax]System.ThrowHelper:ThrowStartIndexArgumentOutOfRange_ArgumentOutOfRange_IndexMustBeLessOrEqual()
        int3     
 						;; size=13 bbWeight=0 PerfScore 0.00
-G_M52482_IG08:
+G_M52482_IG09:
        mov      edi, 2
        mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowArgumentNullException(int)
        call     [rax]System.ThrowHelper:ThrowArgumentNullException(int)
        int3     
 						;; size=18 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 188, prolog size 15, PerfScore 30.50, instruction count 50, allocated bytes for code 188 (MethodHash=973a32fd) for method System.Array:IndexOf[System.Numerics.Vector`1[float]](System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int (FullOpts)
+; Total bytes of code 184, prolog size 21, PerfScore 29.00, instruction count 48, allocated bytes for code 184 (MethodHash=973a32fd) for method System.Array:IndexOf[System.Numerics.Vector`1[float]](System.Numerics.Vector`1[float][],System.Numerics.Vector`1[float],int,int):int (FullOpts)
 ; ============================================================
-1 (-0.54 % of base) - System.Array:IndexOf[double](double[],double,int,int):int
 ; Assembly listing for method System.Array:IndexOf[double](double[],double,int,int):int (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX512 - Unix
 ; FullOpts code
 ; optimized code
 ; rbp based frame
-; partially interruptible
+; fully interruptible
 ; No PGO data
 ; 0 inlinees with PGO data; 4 single block inlinees; 1 inlinees without PGO data
 ; Final local variable assignments
 ;
 ;  V00 arg0         [V00,T00] (  5,  5   )     ref  ->  rbx         class-hnd single-def <double[]>
-;  V01 arg1         [V01,T05] (  3,  3   )  double  ->  [rbp-0x20]  ld-addr-op single-def
+;  V01 arg1         [V01,T05] (  3,  3   )  double  ->  [rbp-0x20]  single-def
 ;  V02 arg2         [V02,T01] (  5,  5   )     int  ->  r15         single-def
 ;  V03 arg3         [V03,T02] (  4,  4   )     int  ->  r14         single-def
 ;* V04 loc0         [V04    ] (  0,  0   )     int  ->  zero-ref   
 ;* V05 loc1         [V05    ] (  0,  0   )     int  ->  zero-ref   
 ;* V06 loc2         [V06    ] (  0,  0   )     int  ->  zero-ref   
 ;* V07 loc3         [V07    ] (  0,  0   )     int  ->  zero-ref   
 ;# V08 OutArgs      [V08    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
 ;* V09 tmp1         [V09    ] (  0,  0   )     int  ->  zero-ref   
 ;* V10 tmp2         [V10    ] (  0,  0   )   byref  ->  zero-ref    "Inlining Arg"
 ;* V11 tmp3         [V11    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
 ;* V12 tmp4         [V12    ] (  0,  0   )     int  ->  zero-ref    "Inline return value spill temp"
-;  V13 tmp5         [V13,T03] (  2,  4   )     ref  ->  rdi         single-def "argument with side effect"
-;  V14 cse0         [V14,T04] (  3,  3   )     int  ->  rdi         "CSE #01: aggressive"
+;* V13 tmp5         [V13    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
+;  V14 tmp6         [V14,T03] (  2,  4   )     ref  ->  rdi         single-def "argument with side effect"
+;  V15 cse0         [V15,T04] (  3,  3   )     int  ->  rdi         "CSE #01: aggressive"
 ;
 ; Lcl frame size = 8
 
 G_M9300_IG01:
        push     rbp
        push     r15
        push     r14
        push     rbx
        push     rax
        lea      rbp, [rsp+0x20]
        vmovsd   qword ptr [rbp-0x20], xmm0
        mov      rbx, rdi
        mov      r15d, esi
        mov      r14d, edx
 						;; size=26 bbWeight=1 PerfScore 7.25
 G_M9300_IG02:
        test     rbx, rbx
        je       G_M9300_IG08
        mov      edi, dword ptr [rbx+0x08]
        cmp      edi, r15d
        jb       SHORT G_M9300_IG07
        sub      edi, r15d
        cmp      edi, r14d
        jb       SHORT G_M9300_IG06
        mov      rdi, 0xD1FFAB1E      ; global ptr
        test     byte  ptr [rdi], 1
        je       SHORT G_M9300_IG05
 						;; size=40 bbWeight=1 PerfScore 10.25
 G_M9300_IG03:
        mov      rdi, 0xD1FFAB1E      ; data for System.Collections.Generic.EqualityComparer`1[double]:<Default>k__BackingField
        mov      rdi, gword ptr [rdi]
        mov      rsi, rbx
        vmovsd   xmm0, qword ptr [rbp-0x20]
        mov      edx, r15d
        mov      ecx, r14d
        mov      rax, 0xD1FFAB1E      ; code for System.Collections.Generic.GenericEqualityComparer`1[double]:IndexOf(double[],double,int,int):int:this
-       call     [rax]System.Collections.Generic.GenericEqualityComparer`1[double]:IndexOf(double[],double,int,int):int:this
-       nop      
-						;; size=40 bbWeight=1 PerfScore 9.50
+						;; size=37 bbWeight=1 PerfScore 6.25
 G_M9300_IG04:
        add      rsp, 8
        pop      rbx
        pop      r14
        pop      r15
        pop      rbp
-       ret      
-						;; size=11 bbWeight=1 PerfScore 3.25
+       tail.jmp [rax]System.Collections.Generic.GenericEqualityComparer`1[double]:IndexOf(double[],double,int,int):int:this
+						;; size=13 bbWeight=1 PerfScore 4.25
 G_M9300_IG05:
        mov      rdi, 0xD1FFAB1E      ; System.Collections.Generic.EqualityComparer`1[double]
        mov      rax, 0xD1FFAB1E      ; code for CORINFO_HELP_GET_GCSTATIC_BASE
        call     [rax]CORINFO_HELP_GET_GCSTATIC_BASE
        jmp      SHORT G_M9300_IG03
 						;; size=24 bbWeight=0 PerfScore 0.00
 G_M9300_IG06:
        mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowCountArgumentOutOfRange_ArgumentOutOfRange_Count()
        call     [rax]System.ThrowHelper:ThrowCountArgumentOutOfRange_ArgumentOutOfRange_Count()
        int3     
 						;; size=13 bbWeight=0 PerfScore 0.00
 G_M9300_IG07:
        mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowStartIndexArgumentOutOfRange_ArgumentOutOfRange_IndexMustBeLessOrEqual()
        call     [rax]System.ThrowHelper:ThrowStartIndexArgumentOutOfRange_ArgumentOutOfRange_IndexMustBeLessOrEqual()
        int3     
 						;; size=13 bbWeight=0 PerfScore 0.00
 G_M9300_IG08:
        mov      edi, 2
        mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowArgumentNullException(int)
        call     [rax]System.ThrowHelper:ThrowArgumentNullException(int)
        int3     
 						;; size=18 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 185, prolog size 12, PerfScore 30.25, instruction count 50, allocated bytes for code 185 (MethodHash=8343dbab) for method System.Array:IndexOf[double](double[],double,int,int):int (FullOpts)
+; Total bytes of code 184, prolog size 26, PerfScore 28.00, instruction count 48, allocated bytes for code 184 (MethodHash=8343dbab) for method System.Array:IndexOf[double](double[],double,int,int):int (FullOpts)
 ; ============================================================
-1 (-0.44 % of base) - System.Array:LastIndexOf[double](double[],double,int,int):int
 ; Assembly listing for method System.Array:LastIndexOf[double](double[],double,int,int):int (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX512 - Unix
 ; FullOpts code
 ; optimized code
 ; rbp based frame
-; partially interruptible
+; fully interruptible
 ; No PGO data
 ; 0 inlinees with PGO data; 3 single block inlinees; 0 inlinees without PGO data
 ; Final local variable assignments
 ;
 ;  V00 arg0         [V00,T01] (  5,  4.50)     ref  ->  r15         class-hnd single-def <double[]>
-;  V01 arg1         [V01,T05] (  3,  2.50)  double  ->  [rbp-0x20]  ld-addr-op single-def
+;  V01 arg1         [V01,T05] (  3,  2.50)  double  ->  [rbp-0x20]  single-def
 ;  V02 arg2         [V02,T00] (  7,  4.50)     int  ->  rbx         single-def
 ;  V03 arg3         [V03,T02] (  6,  4   )     int  ->  r14         single-def
 ;* V04 loc0         [V04    ] (  0,  0   )     int  ->  zero-ref   
 ;* V05 loc1         [V05    ] (  0,  0   )     int  ->  zero-ref   
 ;* V06 loc2         [V06    ] (  0,  0   )     int  ->  zero-ref   
 ;* V07 loc3         [V07    ] (  0,  0   )     int  ->  zero-ref   
 ;* V08 loc4         [V08    ] (  0,  0   )     int  ->  zero-ref   
 ;* V09 loc5         [V09    ] (  0,  0   )     int  ->  zero-ref   
 ;* V10 loc6         [V10    ] (  0,  0   )     int  ->  zero-ref   
 ;* V11 loc7         [V11    ] (  0,  0   )     int  ->  zero-ref   
 ;# V12 OutArgs      [V12    ] (  1,  1   )  struct ( 0) [rsp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
 ;* V13 tmp1         [V13    ] (  0,  0   )     int  ->  zero-ref   
 ;* V14 tmp2         [V14    ] (  0,  0   )   byref  ->  zero-ref    "Inlining Arg"
 ;* V15 tmp3         [V15    ] (  0,  0   )    long  ->  zero-ref    "Inlining Arg"
 ;  V16 tmp4         [V16,T04] (  2,  2   )     ref  ->  rdi         single-def "argument with side effect"
 ;  V17 cse0         [V17,T03] (  3,  2.50)     int  ->  rdi         "CSE #01: aggressive"
 ;
 ; Lcl frame size = 8
 
 G_M61022_IG01:
        push     rbp
        push     r15
        push     r14
        push     rbx
        push     rax
        lea      rbp, [rsp+0x20]
        vmovsd   qword ptr [rbp-0x20], xmm0
        mov      r15, rdi
        mov      ebx, esi
        mov      r14d, edx
 						;; size=25 bbWeight=1 PerfScore 7.25
 G_M61022_IG02:
        test     r15, r15
        je       G_M61022_IG12
        mov      edi, dword ptr [r15+0x08]
        test     edi, edi
        je       SHORT G_M61022_IG06
 						;; size=17 bbWeight=1 PerfScore 4.50
 G_M61022_IG03:
        cmp      edi, ebx
        jbe      G_M61022_IG11
        mov      edi, ebx
        sub      edi, r14d
        inc      edi
        or       edi, r14d
        jl       SHORT G_M61022_IG10
        mov      rdi, 0xD1FFAB1E      ; global ptr
        test     byte  ptr [rdi], 1
        je       SHORT G_M61022_IG09
 						;; size=35 bbWeight=0.50 PerfScore 3.75
 G_M61022_IG04:
        mov      rdi, 0xD1FFAB1E      ; data for System.Collections.Generic.EqualityComparer`1[double]:<Default>k__BackingField
        mov      rdi, gword ptr [rdi]
        mov      rsi, r15
        vmovsd   xmm0, qword ptr [rbp-0x20]
        mov      edx, ebx
        mov      ecx, r14d
        mov      rax, 0xD1FFAB1E      ; code for System.Collections.Generic.GenericEqualityComparer`1[double]:LastIndexOf(double[],double,int,int):int:this
-       call     [rax]System.Collections.Generic.GenericEqualityComparer`1[double]:LastIndexOf(double[],double,int,int):int:this
-       nop      
-						;; size=39 bbWeight=0.50 PerfScore 4.75
+						;; size=36 bbWeight=0.50 PerfScore 3.12
 G_M61022_IG05:
        add      rsp, 8
        pop      rbx
        pop      r14
        pop      r15
        pop      rbp
-       ret      
-						;; size=11 bbWeight=0.50 PerfScore 1.62
+       tail.jmp [rax]System.Collections.Generic.GenericEqualityComparer`1[double]:LastIndexOf(double[],double,int,int):int:this
+						;; size=13 bbWeight=0.50 PerfScore 2.12
 G_M61022_IG06:
        cmp      ebx, -1
        je       SHORT G_M61022_IG07
        test     ebx, ebx
        jne      SHORT G_M61022_IG11
 						;; size=9 bbWeight=0.50 PerfScore 1.25
 G_M61022_IG07:
        test     r14d, r14d
        jne      SHORT G_M61022_IG10
        mov      eax, -1
 						;; size=10 bbWeight=0.50 PerfScore 0.75
 G_M61022_IG08:
        add      rsp, 8
        pop      rbx
        pop      r14
        pop      r15
        pop      rbp
        ret      
 						;; size=11 bbWeight=0.50 PerfScore 1.62
 G_M61022_IG09:
        mov      rdi, 0xD1FFAB1E      ; System.Collections.Generic.EqualityComparer`1[double]
        mov      rax, 0xD1FFAB1E      ; code for CORINFO_HELP_GET_GCSTATIC_BASE
        call     [rax]CORINFO_HELP_GET_GCSTATIC_BASE
        jmp      SHORT G_M61022_IG04
 						;; size=24 bbWeight=0 PerfScore 0.00
 G_M61022_IG10:
        mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowCountArgumentOutOfRange_ArgumentOutOfRange_Count()
        call     [rax]System.ThrowHelper:ThrowCountArgumentOutOfRange_ArgumentOutOfRange_Count()
        int3     
 						;; size=13 bbWeight=0 PerfScore 0.00
 G_M61022_IG11:
        mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowStartIndexArgumentOutOfRange_ArgumentOutOfRange_IndexMustBeLess()
        call     [rax]System.ThrowHelper:ThrowStartIndexArgumentOutOfRange_ArgumentOutOfRange_IndexMustBeLess()
        int3     
 						;; size=13 bbWeight=0 PerfScore 0.00
 G_M61022_IG12:
        mov      edi, 2
        mov      rax, 0xD1FFAB1E      ; code for System.ThrowHelper:ThrowArgumentNullException(int)
        call     [rax]System.ThrowHelper:ThrowArgumentNullException(int)
        int3     
 						;; size=18 bbWeight=0 PerfScore 0.00
 
-; Total bytes of code 225, prolog size 12, PerfScore 25.50, instruction count 67, allocated bytes for code 225 (MethodHash=b45111a1) for method System.Array:LastIndexOf[double](double[],double,int,int):int (FullOpts)
+; Total bytes of code 224, prolog size 25, PerfScore 24.38, instruction count 65, allocated bytes for code 224 (MethodHash=b45111a1) for method System.Array:LastIndexOf[double](double[],double,int,int):int (FullOpts)
 ; ============================================================

@MihuBot
Owner Author

MihuBot commented Jan 4, 2025

@xtqqczze

@xtqqczze

xtqqczze commented Jan 5, 2025

#close
