Loop unrolling support in RyuJIT #4248
The x86 JIT on which RyuJIT is based did some unrolling. As far as I can tell the code is still there but it doesn't run - see
for (int i = 0; i < 3; i++)
sum += i;
generates
Good loop unrolling isn't trivial and I doubt that the existing unrolling code can be significantly improved.
Forgive the dumb question (I'm trying to learn about the JIT), but what would you expect it to generate? Something like this (or whatever the correct assembly is for adding 3):
Or is that too much to expect?
Here are the asm listings of the method
[MethodImpl(MethodImplOptions.NoInlining)]
public int Run()
{
int sum = 0;
for (int i = 0; i < 3; i++)
sum += i;
return sum;
}
for different JIT versions:
LegacyJIT-x86:
00F33562 in al,dx
00F33563 xor eax,eax
00F33565 inc eax
00F33566 inc eax
00F33567 inc eax
00F33568 pop ebp
00F33569 ret
LegacyJIT-x64:
00007FF914114470 mov eax,3
00007FF914114475 ret
RyuJIT-x64 RC:
00007FF9140F4230 xor eax,eax
00007FF9140F4232 xor edx,edx
00007FF9140F4234 add eax,edx
00007FF9140F4236 inc edx
00007FF9140F4238 cmp edx,3
00007FF9140F423B jl 00007FF9140F4234
00007FF9140F423D ret
@mattwarren Yes. And in case you wonder how 3 increment instructions were produced: the loop got unrolled as:
@AndreyAkinshin Your LegacyJIT-x86 listing starts with the wrong instruction.
@mikedn, Thanks, it explains a lot!
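For readers following the listings above, here is a minimal source-level sketch of what full unrolling followed by constant folding means for the three-iteration loop. The class and method names are hypothetical, and the code only illustrates the idea in C#; it is not what any of the JITs literally produce internally.

```csharp
// Illustrative only: source-level equivalents of the transformations
// discussed in this thread. All names here are hypothetical.
public static class FullUnrollSketch
{
    // The loop as written in the thread.
    public static int Rolled()
    {
        int sum = 0;
        for (int i = 0; i < 3; i++)
            sum += i;
        return sum;
    }

    // Fully unrolled: the trip count (3) is a compile-time constant,
    // so the body can be repeated once per iteration with i substituted.
    public static int Unrolled()
    {
        int sum = 0;
        sum += 0;
        sum += 1;
        sum += 2;
        return sum;
    }

    // After constant folding only the result remains
    // (compare the LegacyJIT-x64 listing: mov eax,3).
    public static int Folded()
    {
        return 3;
    }
}
```

All three methods return the same value; the difference is only in how much work is left for run time.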
/cc @briansull @schellap
An interesting sample:
public int Run()
{
int sum = 0;
for (int i = 0; i < 8; i++)
sum = i;
return sum;
}
LegacyJIT:
L0000: push ebp
L0001: mov ebp, esp
L0003: mov eax, 0x7
L0008: pop ebp
L0009: ret
RyuJIT:
L0000: push ebp
L0001: mov ebp, esp
L0003: xor eax, eax
L0005: lea edx, [eax+1]
L0008: cmp edx, 8
L000b: jl short L000f
L000d: pop ebp
L000e: ret
L000f: mov eax, edx
L0011: jmp short L0005
Anyway, I hope that loop unrolling and auto-vectorization can be implemented in RyuJIT ASAP :)
Only a few tests in the tree cause loop unrolling to kick in, since the current heuristic requires a constant loop over a SIMD vector length:
with
(and
LegacyJIT-x64 can unroll some loops and transform something like
to something like
Also LegacyJIT-x64 can transform small loops like
to
I like this feature because it can increase performance in some cases.
Is it possible to implement loop unrolling in RyuJIT?
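The original before/after snippets are missing from this copy of the issue. As a stand-in, here is a minimal sketch of partial loop unrolling, assuming an unroll factor of 4 and a simple array sum. The class name, method names, and the factor are all hypothetical choices for illustration and are not taken from LegacyJIT-x64 or RyuJIT.

```csharp
// Illustrative only: a hand-written example of unrolling by a factor of 4.
// All names and the unroll factor are assumptions, not JIT output.
public static class PartialUnrollSketch
{
    // Straightforward loop: one add, one increment, one compare per element.
    public static int SumRolled(int[] data)
    {
        int sum = 0;
        for (int i = 0; i < data.Length; i++)
            sum += data[i];
        return sum;
    }

    // Unrolled by 4: the main loop handles four elements per iteration,
    // so the loop counter is updated and tested a quarter as often.
    public static int SumUnrolledBy4(int[] data)
    {
        int sum = 0;
        int i = 0;
        // Largest multiple of 4 that does not exceed the length.
        int limit = data.Length - (data.Length % 4);
        for (; i < limit; i += 4)
        {
            sum += data[i];
            sum += data[i + 1];
            sum += data[i + 2];
            sum += data[i + 3];
        }
        // Remainder loop for the last 0 to 3 elements.
        for (; i < data.Length; i++)
            sum += data[i];
        return sum;
    }
}
```

Both methods compute the same sum for any input; whether the unrolled form is actually faster depends on the hardware and on what else the JIT does with the loop.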
See also:
category:cq
theme:loop-opt
skill-level:expert
cost:large