Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xplat: use xmm for float args #2716

Merged
merged 2 commits into from
Mar 25, 2017

Conversation

obastemur
Copy link
Collaborator

@obastemur obastemur commented Mar 20, 2017

Previously xplat LowerCall args were limited to REG_INT_ARG... Adding XMM* support and a test case to make sure we don't break.

This PR also improves performance on xplat jetstream

Disabling couple of additional wasm tests on xplat with a note.

edit: added another test case

@obastemur obastemur changed the base branch from master to release/2.0 March 20, 2017 15:01
@obastemur
Copy link
Collaborator Author

@dotnet-bot test Windows ci_slow_x64_debug please

@@ -523,7 +561,9 @@ LowererMDArch::LowerCallArgs(IR::Instr *callInstr, ushort callFlags, Js::ArgSlot
callInstr->InsertBefore(argInstr);
argCount++;
}

#ifndef _WIN32
this->xplatCallArgs.StopRecording();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pretty neat idea 👍 I'm worried about the locations of Start/Stop recording though. It seems to me maybe every GetArgSlotOpnd call needs to be recorded. We use a mixture of LowerCallArgs and several other random code paths to load different args, which may all need to be homed. E.g., even the helperCallArgs loaded in LowerCall may need to be homed. Stopping here may miss some args. Please survey all calls to GetArgSlotOpnd to see if this is a potential issue.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we count them all, we would also count the ones we endup pushing to stack. We are only interested with the ones explicitly located on registers (float and non-float) . The rest, we consider them as non-float.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even the helperCallArgs loaded in LowerCall may need to be homed.

Indeed they are but we don't need to check/store their type.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we only record register args, which is what you currently doing inside GetArgSlotOpnd. You are not recording stack args.

Recording all might be more clear, otherwise you have to assume this is the only place xmm args could be loaded, all others in int registers, other registers may have no recordings, making the homing code complicated... (If recording all is hard to do right, this is fine with me)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you follow the usages of GetArgSlotOpnd you will see that there are 2 places setting argSym so the register can be anything but TyMachReg. One of them we record. Other, we don't have to since shouldHomeParams = argCount > 0 and !shouldHomeParams... So we record them all right.

int floatCount = this->xplatCallArgs.floatCount;
int argRes = min(callArgCount, static_cast<int>(XmmArgRegsCount));
int intCount = max(0, min(argRes - floatCount, static_cast<int>(IntArgRegsCount)));
int argRegs = intCount + floatCount;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by the logic here -- could it be due to the possibility that you were not recording all args loading (my other comment)? If you recorded every arg loading, the logic here could be very simple. Every arg from position 0 to xplatCallArgs.topPosition must be in register, type recorded, and must be homed. You can then figure out the register from position and your recorded type and home it. Seems no need to count int/float registers.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to limit recording to actual registers. This way we know / store only the ones make sense for our homing.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming you can guarantee that the partial recording successfully records every float register arg, the logic here seems quite complex and could contain bugs. E.g., given 8 args: int0, int1, int2, int3, int4, int5, int_stack, xmm7.

callArgCount == 8
xpTopPos == 8
floatCount == 1
argRes == 8
intCount == 6
argRegs == 7

Then your following for loop will start from i = 8 and end at i == 2 (loop 7 iterations because of argRegs). You'll miss homing the 1st register int0.

Seems it could be simpler:

for (int i = min(callArgCount, XmmArgRegsCount); i > 0; i--) {
  if (had recording for i and it is float) {
    prep homing float reg
  } else if (i <= IntArgRegsCount) {
    prep homing int reg
  }
  ...
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 ) callArgCount will be at least 9 -> you are missing the +call_info.
2 ) in your case i won't represent the the arg slot index..
3 ) good test case idea, will update the one on this PR.

Copy link

@jianchun jianchun Mar 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

callArgCount will be at least 9 -> you are missing the +call_info.

callArgCount is total native arg count, not original runtime arg count -- in my example "int1" is call_info.

in your case i won't represent the the arg slot index..

Why? i does represent the arg slot index.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (had recording for i and it is float)
Why? i does represent the arg slot index.

As long as we loop reset the whole array each time and empty loop till first int register...

Anyways, updated the PR.

else if (isFloatArg && argPosition <= XmmArgRegsCount)
RegNum reg = GetRegFromArgPosition(isFloatArg, argPosition);
#ifndef _WIN32
if (reg != RegNOREG && this->xplatCallArgs.IsRecording())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems all calls to GetArgSlotOpnd is to load an argument, and if we are to record every one, IsRecording() should be true.


inline int GetArgsCount() { return intCount + floatCount; }

inline int GetTopPosition() { return lastPosition; }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style nit: inline is assumed when implementation is in class declaration. We usually don't write inline here explicitly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inline is assumed when implementation is in class declaration.

Sorry, how this implementation is different ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just proposing removing inline keyword here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got that, could you please clarify why remove inline keyword here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personal preference, as I think it is redundant and gets in the way of code reading.

}
}

IRType args [XmmArgRegsCount + 1];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Here it assumes XmmArgRegsCount >= IntArgRegsCount. Could use some code to guard that.

</test>
<test>
<default>
<files>argTestMixed.js</files>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new tests are fine, I'm just thinking maybe existing tests already covering this? The reason we didn't hit this was just because we were not running all existing tests (-simdjs issue). If you change runtests.py, remove that flag, maybe you can see the difference?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe but we don't have simd is enabled. These tests are now useful for xplat and will be useful for other archs.

{
Assert(isRecording);
isRecording = false;
for (int i = lastPosition - GetArgsCount(); i >= 0; i--) args[i] = TyMachPtr;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some assumptions hidden in this code. Looks this code assumes the recording pattern must be we always record every register in the back range but none in the front range if any is missing recording. That's why you clear range [0, start). Is it right? Is there a way to ensure this assumption holds? At least can you document it?

{
for (int i = 0; i <= XmmArgRegsCount; i++) args[i] = TyMachPtr;

floatCount = 0;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: floatCount tracked but not being used.

Previously xplat LowerCall args were limited to REG_INT_ARG... Adding
XMM* float reg. support
@obastemur
Copy link
Collaborator Author

@jianchun Thanks for the review. Made some additional improvements.

@chakrabot chakrabot merged commit 2c749f5 into chakra-core:release/2.0 Mar 25, 2017
chakrabot pushed a commit that referenced this pull request Mar 25, 2017
Merge pull request #2716 from obastemur:fix_xplat_args

Previously xplat LowerCall args were limited to REG_INT_ARG... Adding XMM* support and a test case to make sure we don't break.

This PR also improves performance on xplat jetstream

Disabling couple of additional wasm tests on xplat with a note.

edit: added another test case
chakrabot pushed a commit that referenced this pull request Mar 27, 2017
Merge pull request #2716 from obastemur:fix_xplat_args

Previously xplat LowerCall args were limited to REG_INT_ARG... Adding XMM* support and a test case to make sure we don't break.

This PR also improves performance on xplat jetstream

Disabling couple of additional wasm tests on xplat with a note.

edit: added another test case
Penguinwizzard added a commit to Penguinwizzard/ChakraCore that referenced this pull request Mar 28, 2017
@obastemur fixed the issues with them in chakra-core#2716, so I've gone and
enabled them here.
Penguinwizzard added a commit to Penguinwizzard/ChakraCore that referenced this pull request Mar 28, 2017
@obastemur fixed the issues with them in chakra-core#2716, so I've gone and
enabled them here.
Penguinwizzard added a commit to Penguinwizzard/ChakraCore that referenced this pull request Mar 28, 2017
Remove the flag-based test ignoring of -simdjs as it has
caused confusion on whether or not tests were running.
Add test tagging for simd, asmjs, and wasm functionality.
Add the ability to read --sanitize flags and exclude known
failing tests.

Edit: Allow all of the asmjs tests, since they work now.

@obastemur fixed the issues with them in chakra-core#2716, so I've gone and
enabled them here.

Edit: Add --static to test flags.

Not using it currently to mask out any tests; this is just to make
the test and build flags line up better.

Edit: Add a -j flag to runtests.py for num processes

This makes it line up better with the build flags for CI
Penguinwizzard added a commit to Penguinwizzard/ChakraCore that referenced this pull request Mar 28, 2017
Remove the flag-based test ignoring of -simdjs as it has
caused confusion on whether or not tests were running.
Add test tagging for simd, asmjs, and wasm functionality.
Add the ability to read --sanitize flags and exclude known
failing tests.

Edit: Allow all of the asmjs tests, since they work now.

@obastemur fixed the issues with them in chakra-core#2716, so I've gone and
enabled them here.

Edit: Add --static to test flags.

Not using it currently to mask out any tests; this is just to make
the test and build flags line up better.

Edit: Add a -j flag to runtests.py for num processes

This makes it line up better with the build flags for CI

Edit: Add a flag to warn instead of fail on test timeout

This is in an attempt to reduce noise a bit; a small number of tests
fail to run quickly sometimes enough on one channel of our test infra.
Penguinwizzard added a commit to Penguinwizzard/ChakraCore that referenced this pull request Mar 29, 2017
Remove the flag-based test ignoring of -simdjs as it has
caused confusion on whether or not tests were running.
Add test tagging for simd, asmjs, and wasm functionality.
Add the ability to read --sanitize flags and exclude known
failing tests.

Edit: Allow all of the asmjs tests, since they work now.

@obastemur fixed the issues with them in chakra-core#2716, so I've gone and
enabled them here.

Edit: Add --static to test flags.

Not using it currently to mask out any tests; this is just to make
the test and build flags line up better.

Edit: Add a -j flag to runtests.py for num processes

This makes it line up better with the build flags for CI

Edit: Add a flag to warn instead of fail on test timeout

This is in an attempt to reduce noise a bit; a small number of tests
fail to run quickly sometimes enough on one channel of our test infra.

Edit: Specify type for -j test flag.
Penguinwizzard added a commit to Penguinwizzard/ChakraCore that referenced this pull request Mar 29, 2017
Remove the flag-based test ignoring of -simdjs as it has
caused confusion on whether or not tests were running.
Add test tagging for simd, asmjs, and wasm functionality.
Add the ability to read --sanitize flags and exclude known
failing tests.

Edit: Allow all of the asmjs tests, since they work now.

@obastemur fixed the issues with them in chakra-core#2716, so I've gone and
enabled them here.

Edit: Add --static to test flags.

Not using it currently to mask out any tests; this is just to make
the test and build flags line up better.

Edit: Add a -j flag to runtests.py for num processes

This makes it line up better with the build flags for CI

Edit: Add a flag to warn instead of fail on test timeout

This is in an attempt to reduce noise a bit; a small number of tests
fail to run quickly sometimes enough on one channel of our test infra.

Edit: Specify type for -j test flag.

Edit: Disable asan for asmjs for the moment.
Penguinwizzard added a commit to Penguinwizzard/ChakraCore that referenced this pull request Mar 31, 2017
Remove the flag-based test ignoring of -simdjs as it has
caused confusion on whether or not tests were running.
Add test tagging for simd, asmjs, and wasm functionality.
Add the ability to read --sanitize flags and exclude known
failing tests.

Edit: Allow all of the asmjs tests, since they work now.

@obastemur fixed the issues with them in chakra-core#2716, so I've gone and
enabled them here.

Edit: Add --static to test flags.

Not using it currently to mask out any tests; this is just to make
the test and build flags line up better.

Edit: Add a -j flag to runtests.py for num processes

This makes it line up better with the build flags for CI

Edit: Add a flag to warn instead of fail on test timeout

This is in an attempt to reduce noise a bit; a small number of tests
fail to run quickly sometimes enough on one channel of our test infra.

Edit: Specify type for -j test flag.

Edit: Disable tests that do not currently work with ASAN
Penguinwizzard added a commit to Penguinwizzard/ChakraCore that referenced this pull request Apr 1, 2017
Remove the flag-based test ignoring of -simdjs as it has
caused confusion on whether or not tests were running.
Add test tagging for simd, asmjs, and wasm functionality.
Add the ability to read --sanitize flags and exclude known
failing tests.

Edit: Allow all of the asmjs tests, since they work now.

@obastemur fixed the issues with them in chakra-core#2716, so I've gone and
enabled them here.

Edit: Add --static to test flags.

Not using it currently to mask out any tests; this is just to make
the test and build flags line up better.

Edit: Add a -j flag to runtests.py for num processes

This makes it line up better with the build flags for CI

Edit: Add a flag to warn instead of fail on test timeout

This is in an attempt to reduce noise a bit; a small number of tests
fail to run quickly sometimes enough on one channel of our test infra.

Edit: Specify type for -j test flag.

Edit: Disable tests that do not currently work with ASAN
Penguinwizzard added a commit to Penguinwizzard/ChakraCore that referenced this pull request Apr 5, 2017
Remove the flag-based test ignoring of -simdjs as it has
caused confusion on whether or not tests were running.
Add test tagging for simd, asmjs, and wasm functionality.
Add the ability to read --sanitize flags and exclude known
failing tests.

Edit: Allow all of the asmjs tests, since they work now.

@obastemur fixed the issues with them in chakra-core#2716, so I've gone and
enabled them here.

Edit: Add --static to test flags.

Not using it currently to mask out any tests; this is just to make
the test and build flags line up better.

Edit: Add a -j flag to runtests.py for num processes

This makes it line up better with the build flags for CI

Edit: Add a flag to warn instead of fail on test timeout

This is in an attempt to reduce noise a bit; a small number of tests
fail to run quickly sometimes enough on one channel of our test infra.

Edit: Specify type for -j test flag.

Edit: Disable tests that do not currently work with ASAN
Penguinwizzard added a commit to Penguinwizzard/ChakraCore that referenced this pull request Apr 6, 2017
Remove the flag-based test ignoring of -simdjs as it has
caused confusion on whether or not tests were running.
Add test tagging for simd, asmjs, and wasm functionality.
Add the ability to read --sanitize flags and exclude known
failing tests.

Edit: Allow all of the asmjs tests, since they work now.

@obastemur fixed the issues with them in chakra-core#2716, so I've gone and
enabled them here.

Edit: Add --static to test flags.

Not using it currently to mask out any tests; this is just to make
the test and build flags line up better.

Edit: Add a -j flag to runtests.py for num processes

This makes it line up better with the build flags for CI

Edit: Add a flag to warn instead of fail on test timeout

This is in an attempt to reduce noise a bit; a small number of tests
fail to run quickly sometimes enough on one channel of our test infra.

Edit: Specify type for -j test flag.

Edit: Disable tests that do not currently work with ASAN

Edit: Make the verbose flag do something, add a way to run tests outside
of the repo if needed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants