Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RyuJIT's loop cloning optimization has questionable CQ and a bug #4922

Closed
mikedn opened this issue Jan 10, 2016 · 6 comments
Closed

RyuJIT's loop cloning optimization has questionable CQ and a bug #4922

mikedn opened this issue Jan 10, 2016 · 6 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI bug optimization
Milestone

Comments

@mikedn
Copy link
Contributor

mikedn commented Jan 10, 2016

The following method

    static void Copy(int[] src, int[] dst, int length)
    {
        for (int i = 0; i < length; i++)
            dst[i] = src[i];
    }

generates

G_M64735_IG01:
       4883EC28             sub      rsp, 40
G_M64735_IG02:
       33C0                 xor      eax, eax

; check if length is > 0, ok
       4585C0               test     r8d, r8d
       0F8E82000000         jle      G_M64735_IG06

; check if src and dst are non-null
; not needed because we already know that at least one loop iteration will be executed
       4885C9               test     rcx, rcx
       410F95C1             setne    r9b
       450FB6C9             movzx    r9, r9b
       4885D2               test     rdx, rdx
       410F95C2             setne    r10b
       450FB6D2             movzx    r10, r10b
       4523CA               and      r9d, r10d
       4585C9               test     r9d, r9d
       7445                 je       SHORT G_M64735_IG05

G_M64735_IG03:
; hoisted bound checks, ok
       44394208             cmp      dword ptr [rdx+8], r8d
       410F9DC1             setge    r9b
       450FB6C9             movzx    r9, r9b
       44394108             cmp      dword ptr [rcx+8], r8d
       410F9DC2             setge    r10b
       450FB6D2             movzx    r10, r10b

; check if length is >= 0, dubious
; this doesn't seem right, unlike the rest of the code this checks r8 instead of r8d
; and the length was already checked at the start of the method
       4D85C0               test     r8, r8
       410F9DC3             setge    r11b
       450FB6DB             movzx    r11, r11b
       4523D3               and      r10d, r11d

       4523CA               and      r9d, r10d
       4585C9               test     r9d, r9d
       7417                 je       SHORT G_M64735_IG05

G_M64735_IG04:
       4C63C8               movsxd   r9d, eax
       468B548910           mov      r10d, dword ptr [rcx+4*r9+16]
       4689548A10           mov      dword ptr [rdx+4*r9+16], r10d
       FFC0                 inc      eax
       413BC0               cmp      eax, r8d
       7CEC                 jl       SHORT G_M64735_IG04
; it's unfortunate that the loop version that is normally executed 
; is the one requiring a jmp, IG04 and IG05 should be reversed
       EB20                 jmp      SHORT G_M64735_IG06

G_M64735_IG05:
       3B4108               cmp      eax, dword ptr [rcx+8]
       731F                 jae      SHORT G_M64735_IG07
       4C63C8               movsxd   r9d, eax
       468B548910           mov      r10d, dword ptr [rcx+4*r9+16]
       3B4208               cmp      eax, dword ptr [rdx+8]
       7312                 jae      SHORT G_M64735_IG07
       4689548A10           mov      dword ptr [rdx+4*r9+16], r10d
       FFC0                 inc      eax
       413BC0               cmp      eax, r8d
       7CE2                 jl       SHORT G_M64735_IG05

G_M64735_IG06:
       4883C428             add      rsp, 40
       C3                   ret

G_M64735_IG07:
       E8376AD45E           call     CORINFO_HELP_RNGCHKFAIL
       CC                   int3

@JanielS speculated in #4921 that the null checks may be there so the NullReferenceException will appears as if it was thrown from inside the loop instead of being thrown form hoisted code. But if you pass a null array to this method the debugger will point to the i < length loop condition when the exception is thrown, not to dst[i] = src[i].

@mikedn
Copy link
Contributor Author

mikedn commented Jan 10, 2016

I took a quick look at the code and my initial impression is that this happens because the optimizer works on do-while loops and as such it doesn't "see" the i < length check at the start of the function.

But then the CQ is a minor issue, the real problem is

[MethodImpl(MethodImplOptions.NoInlining)]
static void Copy(int[] src, int[] dst, int length)
{
    int i = 50000;
    do
    {
        dst[i] = src[i];
        i++;
    }
    while (i < length);
}

static void Main()
{
    Copy(new int[2], new int[2], 1);
}

which promptly throws an AccessViolationException

Unhandled Exception: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at Program.Copy(Int32[] src, Int32[] dst, Int32 length)
   at Program.Main()

@mikedn mikedn changed the title RyuJIT's loop cloning optimization has questionable CQ RyuJIT's loop cloning optimization has questionable CQ and a bug Jan 10, 2016
@jakobbotsch
Copy link
Member

I cannot repro that in .NET 4.6.1, I get an IndexOutOfRangeException as expected. Code generated is:

00007ffc`a8fc0570 56              push    rsi
00007ffc`a8fc0571 4883ec20        sub     rsp,20h
00007ffc`a8fc0575 b850c30000      mov     eax,0C350h
00007ffc`a8fc057a 448b4908        mov     r9d,dword ptr [rcx+8]
00007ffc`a8fc057e 448b5208        mov     r10d,dword ptr [rdx+8]

00007ffc`a8fc0582 413bc1          cmp     eax,r9d
00007ffc`a8fc0585 7322            jae     00007ffc`a8fc05a9

00007ffc`a8fc0587 4c63d8          movsxd  r11,eax
00007ffc`a8fc058a 468b5c9910      mov     r11d,dword ptr [rcx+r11*4+10h]
00007ffc`a8fc058f 413bc2          cmp     eax,r10d
00007ffc`a8fc0592 7315            jae     00007ffc`a8fc05a9

00007ffc`a8fc0594 4863f0          movsxd  rsi,eax
00007ffc`a8fc0597 44895cb210      mov     dword ptr [rdx+rsi*4+10h],r11d
00007ffc`a8fc059c ffc0            inc     eax
00007ffc`a8fc059e 413bc0          cmp     eax,r8d
00007ffc`a8fc05a1 7cdf            jl      00007ffc`a8fc0582

00007ffc`a8fc05a3 4883c420        add     rsp,20h
00007ffc`a8fc05a7 5e              pop     rsi
00007ffc`a8fc05a8 c3              ret

00007ffc`a8fc05a9 e85a14a85f      call    clr!JIT_RngChkFail (00007ffd`08a41a08)
00007ffc`a8fc05ae cc              int     3

It does not look like RyuJIT in .NET 4.6.1 does loop cloning for this case? Seems weird since history of the Git repo seems to indicate that loop cloning received no changes after .NET 4.6.1...

EDIT: The for (int i = 0; i < length; i++) version is loop cloned correctly in .NET 4.6.1.

@mikedn
Copy link
Contributor Author

mikedn commented Jan 11, 2016

It does not look like RyuJIT in .NET 4.6.1 does loop cloning for this case? Seems weird since history of the Git repo seems to indicate that loop cloning received no changes after .NET 4.6.1...

Yes, for some reason .NET 4.6.1 doesn't do loop cloning for such do-while loops. It doesn't do it even if you change the do-while version to be equivalent to the for version:

int i = 0;
if (i < length)
{
    do
    {
        dst[i] = src[i];
        i++;
    }
    while (i < length);
}

The C# compiler generates slightly different IL for such a loop and .NET 4.6.1, unlike CoreCLR, doesn't like it. It's possible that a fix was already made in .NET 4.6.1 but it is yet to make it to this repository.

@CppStars
Copy link

@mikedn but it is a bug that the code throws an AV, right? Is this CoreCLR or .NET 4.6.1 also?

@mikedn
Copy link
Contributor Author

mikedn commented Jan 11, 2016

@CppStars Yes, it's a bug and as far as I can tell it's present only in CoreCLR. The AV is just for the "show", I intentionally picked a large initial index to trigger an observable AV. For smaller values you'll end up with silent memory corruption.

As far as I can tell this bug is triggered only by a do-while loop with a constant initial index. Those are rare circumstances I'd say.

@RussKeldorph RussKeldorph self-assigned this Jan 11, 2016
@RussKeldorph
Copy link
Contributor

@mikedn dotnet/coreclr#2627 should have resolved the correctness issue. dotnet/coreclr#2634 opened to track the CQ issue.

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 30, 2020
@msftgits msftgits added this to the 1.0.0-rc2 milestone Jan 30, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Jan 3, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI bug optimization
Projects
None yet
Development

No branches or pull requests

5 participants