Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT assert if a retyped simd32 is CSE'd and needs to have the upper half saved/restored #35620

Closed
tannergooding opened this issue Apr 29, 2020 · 18 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:3 Work that is nice to have
Milestone

Comments

@tannergooding
Copy link
Member

tannergooding commented Apr 29, 2020

As per #35421 (comment), it is possible for the genSIMDIntrinsic to assert for any GT_SIMD or GT_HWINTRINSIC which has been retyped if the target type hasn't been actually encountered (and therefore is missing from the SIMD handle cache).

At least at initial glance, it could likely be resolved by setting lvBaseType based on the tree->gtSimdBaseType from the GT_SIMD or GT_HWINTRINSIC (whether tree is one directly or indirectly via GT_IND, GT_OBJ, or GT_ADDR).

category:correctness
theme:hardware-intrinsics
skill-level:intermediate
cost:medium

@tannergooding
Copy link
Member Author

CC. @briansull, @CarolEidt, @echesakovMSFT

@jeffschwMSFT jeffschwMSFT added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels May 15, 2020
@BruceForstall
Copy link
Member

@tannergooding wrote in #35899:

I think [35620] should probably be kept open so that a better fix can be put in place. Ideally we would always track the relevant class handle so that we don't have to lie about the type.

@BruceForstall BruceForstall removed the untriaged New issue has not been triaged by the area owner label May 18, 2020
@BruceForstall BruceForstall added this to the Future milestone May 18, 2020
@CarolEidt
Copy link
Contributor

We now have a ClassLayout* on GenTreeJitIntrinsic (parent of both GenTreeSIMD and GenTreeHWIntrinsic. I added it, but haven't made the changes necessary to ensure that it gets set appropriately.
One wrinkle is that the JIT will sometimes create a new intrinsic node that produces a different struct type (e.g. when expanding a "helper" intrinsic), and it's not clear that we can come up with the right handle in all cases.

@tannergooding
Copy link
Member Author

@CarolEidt, is there anything preventing us from querying the VM for a proper handle? It would be a one time cost (given we cache the handle) and would only occur for scenarios where we've reinterpret casted the bits.

@CarolEidt CarolEidt modified the milestones: Future, 6.0.0 Oct 13, 2020
@JulieLeeMSFT JulieLeeMSFT added the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Mar 23, 2021
@JulieLeeMSFT JulieLeeMSFT removed the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Jun 7, 2021
@echesakov
Copy link
Contributor

@tannergooding Do you know if #50832 has fixed the issue?

@tannergooding
Copy link
Member Author

I don't believe #50832 addressed this.

@JulieLeeMSFT JulieLeeMSFT added the Priority:3 Work that is nice to have label Jul 26, 2021
@JulieLeeMSFT
Copy link
Member

@briansull PTAL.

@JulieLeeMSFT JulieLeeMSFT assigned briansull and unassigned echesakov Aug 2, 2021
@briansull
Copy link
Contributor

briansull commented Aug 2, 2021

Relevant Comment from #35421

This looks to be exposing an existing bug in CSE for GT_SIMD and GT_HWINTRINSIC. Namely, in optcse we can't always propagate the class handle: https://github.com/dotnet/runtime/blob/master/src/coreclr/src/jit/optcse.cpp#L2649 and so when we eventually get to LinearScan::insertUpperVectorSave, we can have a TYP_UNDEF for lvBaseType: https://github.com/dotnet/runtime/blob/master/src/coreclr/src/jit/lsra.cpp#L6756 which causes genSIMD to then assert.

This particular "bug" isn't one in practice because saving/restoring handles the TYP_SIMD32 completely opaquely and ignores the underlying type: https://github.com/dotnet/runtime/blob/master/src/coreclr/src/jit/simdcodegenxarch.cpp#L641

However, if you retype a vector (from ushort to short for example) then this could surface in any number of ways ultimately resulting in a similar issue for anything using lvBaseType

In the case of GT_SIMD it was producing:

               [000369] -A----------              *  ASG       simd32 (copy)
               [000367] D------N----              +--*  LCL_VAR   simd32<System.Numerics.Vector`1[Int32]> V22 tmp11        
               [000366] -A----------              \--*  SIMD      simd32 ushort Cast
               [000365] -A----------                 \--*  SIMD      simd32 short |
               [000363] -A----------                    +--*  SIMD      simd32 short eq
               [000358] -A----------                    |  +--*  COMMA     simd32
               [000356] -A----------                    |  |  +--*  ASG       simd32 (copy)
               [000354] D------N----                    |  |  |  +--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V21 tmp10        
               [000352] -A-----N----                    |  |  |  \--*  SIMD      simd32 ushort -
               [000343] ------------                    |  |  |     +--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V19 tmp8         
               [000350] -A----------                    |  |  |     \--*  COMMA     simd32
               [000348] -A----------                    |  |  |        +--*  ASG       simd32 (copy)
               [000346] D------N----                    |  |  |        |  +--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V20 tmp9         
               [000345] -------N----                    |  |  |        |  \--*  SIMD      simd32 int init
               [000344] ------------                    |  |  |        |     \--*  CNS_INT   int    0x80008000
               [000349] ------------                    |  |  |        \--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V20 tmp9         
               [000357] ------------                    |  |  \--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V21 tmp10        
               [000353] ------------                    |  \--*  SIMD      simd32 ushort -
               [000189] ------------                    |     +--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V12 tmp1         
               [000351] ------------                    |     \--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V20 tmp9         
               [000364] ------------                    \--*  SIMD      simd32 short gt
               [000359] ------------                       +--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V21 tmp10        
               [000360] ------------                       \--*  SIMD      simd32 ushort -
               [000361] ------------                          +--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V12 tmp1         
               [000362] ------------                          \--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V20 tmp9 

And you can see that the |, eq, init, and gt all use base types different from the original input (ushort)

While for the new code:

               [000361] -A----------              *  ASG       simd32 (copy)
               [000359] D------N----              +--*  LCL_VAR   simd32<System.Numerics.Vector`1[Int32]> V21 tmp10        
               [000358] ------------              \--*  SIMD      simd32 ushort Cast
               [000357] ------------                 \--*  HWINTRINSIC simd32 ushort Or
               [000355] ------------                    +--*  HWINTRINSIC simd32 short CompareLessThan
               [000353] ------------                    |  +--*  HWINTRINSIC simd32 short Subtract
               [000189] ------------                    |  |  +--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V12 tmp1         
               [000352] ------------                    |  |  \--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V20 tmp9         
               [000354] ------------                    |  \--*  HWINTRINSIC simd32 short Subtract
               [000343] ------------                    |     +--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V19 tmp8         
               [000351] ------------                    |     \--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V20 tmp9         
               [000356] ------------                    \--*  HWINTRINSIC simd32 ushort CompareEqual
               [000344] ------------                       +--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V12 tmp1         
               [000345] ------------                       \--*  LCL_VAR   simd32<System.Numerics.Vector`1[UInt16]> V19 tmp8 

We are doing it for CompareLessThan and Subtract.

We ultimately have to be able to retype as certain operations only exist for certain types on some platforms (e.g. CompareGreaterThan on integrals for x86 is only for signed, so we have to adjust the inputs for unsigned).
I can work around the issue just for this PR by keeping Subtract as ushort, but it won't fix the more general issue that exists.

@briansull
Copy link
Contributor

briansull commented Aug 2, 2021

This other comment in #35899 is also relevant:

tannergooding commented on May 6, 2020
This also fixes #35620, but I think it should probably be kept open so that a better fix can be put in place. Ideally we would always track the relevant class handle so that we don't have to lie about the type.

I mentioned a possible alternative here, but I believe they are all more involved changes.

@briansull
Copy link
Contributor

It doesn't seem to me to be a good time to introduce a change like this, without an active bug that is being fixed by this code change.

As the original issue that this issue dealt with was fixed by #35899

We should either close this issue or move it to 7.0

@tannergooding What do you think?

@tannergooding
Copy link
Member Author

I think it should be moved to 7 as I agree that it doesn't represent a known issue today. However, I don't think it should be closed as we will ideally fix the handling of these to avoid any downstream issues or performance gaps caused by retyping.

@JulieLeeMSFT
Copy link
Member

Moving to .NET 7.

@JulieLeeMSFT JulieLeeMSFT modified the milestones: 6.0.0, 7.0.0 Aug 14, 2021
@jakobbotsch
Copy link
Member

@tannergooding Might this one have been fixed by #38484?

@jakobbotsch
Copy link
Member

ping @tannergooding

@tannergooding
Copy link
Member Author

Just reading the issue vs PR description I wouldn't expect it. This issue is around a retyped TYP_SIMD32 and the assert fix was over TYP_SIMD12.

I'd expect this issue to be caused by a TYP_SIMD32 float being retyped as a TYP_SIMD32 int32 instead, for example. If this was fixed by anything I would expect it may have been fixed by some of the fixes @SingleAccretion has been doing around type handles.

@jakobbotsch
Copy link
Member

I was basing that on your original description which is:

Namely, in optcse we can't always propagate the class handle: https://github.com/dotnet/runtime/blob/master/src/coreclr/src/jit/optcse.cpp#L2649 and so when we eventually get to LinearScan::insertUpperVectorSave, we can have a TYP_UNDEF for lvBaseType: https://github.com/dotnet/runtime/blob/master/src/coreclr/src/jit/lsra.cpp#L6756 which causes genSIMD to then assert.

Those links are broken but pointed to the relevant code that @sandreenko changed in #38484, where he made CSE always propagate the class handle.

@tannergooding
Copy link
Member Author

Ah, it might've fixed it then if that PR causes the handle to always be available.

I think more generally we probably want LSRA to not be dependent on the class handle. The underlying type of a TYP_SIMD32 doesn't matter for save/restore upper. We just care that the raw bits get preserved.

@jakobbotsch
Copy link
Member

Given the change in #38484 and the code added in #35899 I will consider this fixed.

@ghost ghost locked as resolved and limited conversation to collaborators Jul 21, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:3 Work that is nice to have
Projects
None yet
Development

No branches or pull requests

8 participants