xplat: use xmm for float args #2716

obastemur · 2017-03-20T15:01:19Z

Previously xplat LowerCall args were limited to REG_INT_ARG... Adding XMM* support and a test case to make sure we don't break.

This PR also improves performance on xplat jetstream

Disabling couple of additional wasm tests on xplat with a note.

edit: added another test case

obastemur · 2017-03-21T16:49:04Z

@dotnet-bot test Windows ci_slow_x64_debug please

jianchun · 2017-03-24T16:51:22Z

lib/Backend/amd64/LowererMDArch.cpp

@@ -523,7 +561,9 @@ LowererMDArch::LowerCallArgs(IR::Instr *callInstr, ushort callFlags, Js::ArgSlot
        callInstr->InsertBefore(argInstr);
        argCount++;
    }
-
+#ifndef _WIN32
+    this->xplatCallArgs.StopRecording();


This is a pretty neat idea 👍 I'm worried about the locations of Start/Stop recording though. It seems to me maybe every GetArgSlotOpnd call needs to be recorded. We use a mixture of LowerCallArgs and several other random code paths to load different args, which may all need to be homed. E.g., even the helperCallArgs loaded in LowerCall may need to be homed. Stopping here may miss some args. Please survey all calls to GetArgSlotOpnd to see if this is a potential issue.

if we count them all, we would also count the ones we endup pushing to stack. We are only interested with the ones explicitly located on registers (float and non-float) . The rest, we consider them as non-float.

even the helperCallArgs loaded in LowerCall may need to be homed.

Indeed they are but we don't need to check/store their type.

No, we only record register args, which is what you currently doing inside GetArgSlotOpnd. You are not recording stack args.

Recording all might be more clear, otherwise you have to assume this is the only place xmm args could be loaded, all others in int registers, other registers may have no recordings, making the homing code complicated... (If recording all is hard to do right, this is fine with me)

If you follow the usages of GetArgSlotOpnd you will see that there are 2 places setting argSym so the register can be anything but TyMachReg. One of them we record. Other, we don't have to since shouldHomeParams = argCount > 0 and !shouldHomeParams... So we record them all right.

jianchun · 2017-03-24T17:03:01Z

lib/Backend/amd64/LowererMDArch.cpp

+        int floatCount = this->xplatCallArgs.floatCount;
+        int argRes = min(callArgCount, static_cast<int>(XmmArgRegsCount));
+        int intCount = max(0, min(argRes - floatCount, static_cast<int>(IntArgRegsCount)));
+        int argRegs = intCount + floatCount;


I'm confused by the logic here -- could it be due to the possibility that you were not recording all args loading (my other comment)? If you recorded every arg loading, the logic here could be very simple. Every arg from position 0 to xplatCallArgs.topPosition must be in register, type recorded, and must be homed. You can then figure out the register from position and your recorded type and home it. Seems no need to count int/float registers.

I wanted to limit recording to actual registers. This way we know / store only the ones make sense for our homing.

Assuming you can guarantee that the partial recording successfully records every float register arg, the logic here seems quite complex and could contain bugs. E.g., given 8 args: int0, int1, int2, int3, int4, int5, int_stack, xmm7.

callArgCount == 8 xpTopPos == 8 floatCount == 1 argRes == 8 intCount == 6 argRegs == 7

Then your following for loop will start from i = 8 and end at i == 2 (loop 7 iterations because of argRegs). You'll miss homing the 1st register int0.

Seems it could be simpler:

for (int i = min(callArgCount, XmmArgRegsCount); i > 0; i--) { if (had recording for i and it is float) { prep homing float reg } else if (i <= IntArgRegsCount) { prep homing int reg } ... }

1 ) callArgCount will be at least 9 -> you are missing the +call_info.
2 ) in your case i won't represent the the arg slot index..
3 ) good test case idea, will update the one on this PR.

callArgCount will be at least 9 -> you are missing the +call_info.

callArgCount is total native arg count, not original runtime arg count -- in my example "int1" is call_info.

in your case i won't represent the the arg slot index..

Why? i does represent the arg slot index.

if (had recording for i and it is float)
Why? i does represent the arg slot index.

As long as we loop reset the whole array each time and empty loop till first int register...

Anyways, updated the PR.

jianchun · 2017-03-24T17:06:24Z

lib/Backend/amd64/LowererMDArch.cpp

-    else if (isFloatArg && argPosition <= XmmArgRegsCount)
+    RegNum reg = GetRegFromArgPosition(isFloatArg, argPosition);
+#ifndef _WIN32
+    if (reg != RegNOREG && this->xplatCallArgs.IsRecording())


Seems all calls to GetArgSlotOpnd is to load an argument, and if we are to record every one, IsRecording() should be true.

jianchun · 2017-03-24T17:09:41Z

lib/Backend/amd64/LowererMDArch.h

+
+        inline int  GetArgsCount() { return intCount + floatCount; }
+
+        inline int  GetTopPosition() { return lastPosition; }


style nit: inline is assumed when implementation is in class declaration. We usually don't write inline here explicitly.

inline is assumed when implementation is in class declaration.

Sorry, how this implementation is different ?

I was just proposing removing inline keyword here.

Got that, could you please clarify why remove inline keyword here?

Personal preference, as I think it is redundant and gets in the way of code reading.

jianchun · 2017-03-24T17:11:20Z

lib/Backend/amd64/LowererMDArch.h

+            }
+        }
+
+        IRType args [XmmArgRegsCount + 1];


nit: Here it assumes XmmArgRegsCount >= IntArgRegsCount. Could use some code to guard that.

jianchun · 2017-03-24T17:13:43Z

test/AsmJs/rlexe.xml

+  </test>
+  <test>
+    <default>
+      <files>argTestMixed.js</files>


The new tests are fine, I'm just thinking maybe existing tests already covering this? The reason we didn't hit this was just because we were not running all existing tests (-simdjs issue). If you change runtests.py, remove that flag, maybe you can see the difference?

maybe but we don't have simd is enabled. These tests are now useful for xplat and will be useful for other archs.

jianchun · 2017-03-24T19:56:50Z

lib/Backend/amd64/LowererMDArch.h

+        {
+            Assert(isRecording);
+            isRecording = false;
+            for (int i = lastPosition - GetArgsCount(); i >= 0; i--) args[i] = TyMachPtr;


There are some assumptions hidden in this code. Looks this code assumes the recording pattern must be we always record every register in the back range but none in the front range if any is missing recording. That's why you clear range [0, start). Is it right? Is there a way to ensure this assumption holds? At least can you document it?

jianchun · 2017-03-24T22:35:22Z

lib/Backend/amd64/LowererMDArch.h

+        {
+            for (int i = 0; i <= XmmArgRegsCount; i++) args[i] = TyMachPtr;
+
+            floatCount = 0;


nit: floatCount tracked but not being used.

Previously xplat LowerCall args were limited to REG_INT_ARG... Adding XMM* float reg. support

obastemur · 2017-03-25T12:08:02Z

@jianchun Thanks for the review. Made some additional improvements.

Merge pull request #2716 from obastemur:fix_xplat_args Previously xplat LowerCall args were limited to REG_INT_ARG... Adding XMM* support and a test case to make sure we don't break. This PR also improves performance on xplat jetstream Disabling couple of additional wasm tests on xplat with a note. edit: added another test case

@obastemur

@obastemur fixed the issues with them in chakra-core#2716, so I've gone and enabled them here.

@obastemur

@obastemur fixed the issues with them in chakra-core#2716, so I've gone and enabled them here.

@obastemur

Remove the flag-based test ignoring of -simdjs as it has caused confusion on whether or not tests were running. Add test tagging for simd, asmjs, and wasm functionality. Add the ability to read --sanitize flags and exclude known failing tests. Edit: Allow all of the asmjs tests, since they work now. @obastemur fixed the issues with them in chakra-core#2716, so I've gone and enabled them here. Edit: Add --static to test flags. Not using it currently to mask out any tests; this is just to make the test and build flags line up better. Edit: Add a -j flag to runtests.py for num processes This makes it line up better with the build flags for CI

@obastemur

Remove the flag-based test ignoring of -simdjs as it has caused confusion on whether or not tests were running. Add test tagging for simd, asmjs, and wasm functionality. Add the ability to read --sanitize flags and exclude known failing tests. Edit: Allow all of the asmjs tests, since they work now. @obastemur fixed the issues with them in chakra-core#2716, so I've gone and enabled them here. Edit: Add --static to test flags. Not using it currently to mask out any tests; this is just to make the test and build flags line up better. Edit: Add a -j flag to runtests.py for num processes This makes it line up better with the build flags for CI Edit: Add a flag to warn instead of fail on test timeout This is in an attempt to reduce noise a bit; a small number of tests fail to run quickly sometimes enough on one channel of our test infra.

@obastemur

Remove the flag-based test ignoring of -simdjs as it has caused confusion on whether or not tests were running. Add test tagging for simd, asmjs, and wasm functionality. Add the ability to read --sanitize flags and exclude known failing tests. Edit: Allow all of the asmjs tests, since they work now. @obastemur fixed the issues with them in chakra-core#2716, so I've gone and enabled them here. Edit: Add --static to test flags. Not using it currently to mask out any tests; this is just to make the test and build flags line up better. Edit: Add a -j flag to runtests.py for num processes This makes it line up better with the build flags for CI Edit: Add a flag to warn instead of fail on test timeout This is in an attempt to reduce noise a bit; a small number of tests fail to run quickly sometimes enough on one channel of our test infra. Edit: Specify type for -j test flag.

@obastemur

Remove the flag-based test ignoring of -simdjs as it has caused confusion on whether or not tests were running. Add test tagging for simd, asmjs, and wasm functionality. Add the ability to read --sanitize flags and exclude known failing tests. Edit: Allow all of the asmjs tests, since they work now. @obastemur fixed the issues with them in chakra-core#2716, so I've gone and enabled them here. Edit: Add --static to test flags. Not using it currently to mask out any tests; this is just to make the test and build flags line up better. Edit: Add a -j flag to runtests.py for num processes This makes it line up better with the build flags for CI Edit: Add a flag to warn instead of fail on test timeout This is in an attempt to reduce noise a bit; a small number of tests fail to run quickly sometimes enough on one channel of our test infra. Edit: Specify type for -j test flag. Edit: Disable asan for asmjs for the moment.

@obastemur

Remove the flag-based test ignoring of -simdjs as it has caused confusion on whether or not tests were running. Add test tagging for simd, asmjs, and wasm functionality. Add the ability to read --sanitize flags and exclude known failing tests. Edit: Allow all of the asmjs tests, since they work now. @obastemur fixed the issues with them in chakra-core#2716, so I've gone and enabled them here. Edit: Add --static to test flags. Not using it currently to mask out any tests; this is just to make the test and build flags line up better. Edit: Add a -j flag to runtests.py for num processes This makes it line up better with the build flags for CI Edit: Add a flag to warn instead of fail on test timeout This is in an attempt to reduce noise a bit; a small number of tests fail to run quickly sometimes enough on one channel of our test infra. Edit: Specify type for -j test flag. Edit: Disable tests that do not currently work with ASAN

@obastemur

Remove the flag-based test ignoring of -simdjs as it has caused confusion on whether or not tests were running. Add test tagging for simd, asmjs, and wasm functionality. Add the ability to read --sanitize flags and exclude known failing tests. Edit: Allow all of the asmjs tests, since they work now. @obastemur fixed the issues with them in chakra-core#2716, so I've gone and enabled them here. Edit: Add --static to test flags. Not using it currently to mask out any tests; this is just to make the test and build flags line up better. Edit: Add a -j flag to runtests.py for num processes This makes it line up better with the build flags for CI Edit: Add a flag to warn instead of fail on test timeout This is in an attempt to reduce noise a bit; a small number of tests fail to run quickly sometimes enough on one channel of our test infra. Edit: Specify type for -j test flag. Edit: Disable tests that do not currently work with ASAN

@obastemur

Remove the flag-based test ignoring of -simdjs as it has caused confusion on whether or not tests were running. Add test tagging for simd, asmjs, and wasm functionality. Add the ability to read --sanitize flags and exclude known failing tests. Edit: Allow all of the asmjs tests, since they work now. @obastemur fixed the issues with them in chakra-core#2716, so I've gone and enabled them here. Edit: Add --static to test flags. Not using it currently to mask out any tests; this is just to make the test and build flags line up better. Edit: Add a -j flag to runtests.py for num processes This makes it line up better with the build flags for CI Edit: Add a flag to warn instead of fail on test timeout This is in an attempt to reduce noise a bit; a small number of tests fail to run quickly sometimes enough on one channel of our test infra. Edit: Specify type for -j test flag. Edit: Disable tests that do not currently work with ASAN

@obastemur

Remove the flag-based test ignoring of -simdjs as it has caused confusion on whether or not tests were running. Add test tagging for simd, asmjs, and wasm functionality. Add the ability to read --sanitize flags and exclude known failing tests. Edit: Allow all of the asmjs tests, since they work now. @obastemur fixed the issues with them in chakra-core#2716, so I've gone and enabled them here. Edit: Add --static to test flags. Not using it currently to mask out any tests; this is just to make the test and build flags line up better. Edit: Add a -j flag to runtests.py for num processes This makes it line up better with the build flags for CI Edit: Add a flag to warn instead of fail on test timeout This is in an attempt to reduce noise a bit; a small number of tests fail to run quickly sometimes enough on one channel of our test infra. Edit: Specify type for -j test flag. Edit: Disable tests that do not currently work with ASAN Edit: Make the verbose flag do something, add a way to run tests outside of the repo if needed.

msftclas added the cla-already-signed label Mar 20, 2017

obastemur changed the base branch from master to release/2.0 March 20, 2017 15:01

obastemur force-pushed the fix_xplat_args branch from 191e9c3 to 08a8e9a Compare March 22, 2017 23:21

Penguinwizzard mentioned this pull request Mar 23, 2017

Numerous fixes to get ASAN working for CI #2730

Merged

jianchun reviewed Mar 24, 2017

View reviewed changes

obastemur force-pushed the fix_xplat_args branch from da9b793 to dffd23d Compare March 24, 2017 22:03

jianchun reviewed Mar 24, 2017

View reviewed changes

jianchun approved these changes Mar 24, 2017

View reviewed changes

obastemur force-pushed the fix_xplat_args branch from dffd23d to 028861d Compare March 25, 2017 09:34

obastemur added 2 commits March 25, 2017 10:35

xplat: use xmm for float args

3ff3b82

Previously xplat LowerCall args were limited to REG_INT_ARG... Adding XMM* float reg. support

xplat: add test case for mixed args from registers

2c749f5

obastemur force-pushed the fix_xplat_args branch from 028861d to 2c749f5 Compare March 25, 2017 09:36

chakrabot merged commit 2c749f5 into chakra-core:release/2.0 Mar 25, 2017

obastemur mentioned this pull request Mar 27, 2017

AsmJs/WAsm calling convention issues on xplat #2378

Closed

Penguinwizzard added a commit to Penguinwizzard/ChakraCore that referenced this pull request Mar 28, 2017

Allow all of the asmjs tests, since they work now.

059a250

@obastemur fixed the issues with them in chakra-core#2716, so I've gone and enabled them here.

Penguinwizzard added a commit to Penguinwizzard/ChakraCore that referenced this pull request Mar 28, 2017

Allow all of the asmjs tests, since they work now.

4ceb42e

@obastemur fixed the issues with them in chakra-core#2716, so I've gone and enabled them here.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

xplat: use xmm for float args #2716

xplat: use xmm for float args #2716

obastemur commented Mar 20, 2017 •

edited

Loading

obastemur commented Mar 21, 2017

jianchun Mar 24, 2017

obastemur Mar 24, 2017

obastemur Mar 24, 2017

jianchun Mar 24, 2017

obastemur Mar 24, 2017

jianchun Mar 24, 2017

obastemur Mar 24, 2017

jianchun Mar 24, 2017

obastemur Mar 24, 2017

jianchun Mar 24, 2017 •

edited

Loading

obastemur Mar 24, 2017

jianchun Mar 24, 2017

jianchun Mar 24, 2017

obastemur Mar 24, 2017

jianchun Mar 24, 2017

obastemur Mar 24, 2017

jianchun Mar 24, 2017

jianchun Mar 24, 2017

jianchun Mar 24, 2017

obastemur Mar 24, 2017

jianchun Mar 24, 2017

jianchun Mar 24, 2017

obastemur commented Mar 25, 2017


		inline int GetArgsCount() { return intCount + floatCount; }

		inline int GetTopPosition() { return lastPosition; }

xplat: use xmm for float args #2716

xplat: use xmm for float args #2716

Conversation

obastemur commented Mar 20, 2017 • edited Loading

obastemur commented Mar 21, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jianchun Mar 24, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

obastemur commented Mar 25, 2017

obastemur commented Mar 20, 2017 •

edited

Loading

jianchun Mar 24, 2017 •

edited

Loading