Fix missing SSE detection on x64 targets. Fixes #25 #26

cdwfs · 2017-08-16T19:14:20Z

One important note on this PR: with this change, matrix_benchmarks takes twice as long to run on my test system with SIMD enabled as it does with SIMD disabled. That seems... unintuitive. And worth investigating further.

googlebot · 2017-08-16T19:14:24Z

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.

If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
If your company signed a CLA, they designated a Point of Contact who decides which employees are authorized to participate. You may need to contact the Point of Contact for your company and ask to be added to the group of authorized contributors. If you don't know who your Point of Contact is, direct the project maintainer to go/cla#troubleshoot.
In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again.

cdwfs · 2017-08-16T19:29:00Z

Re the CLA: I'm a Google employee. I just registered my GitHub account, though, so approval may still be processing.

googlebot · 2017-08-16T19:29:03Z

CLAs look good, thanks!

ghost · 2017-08-17T18:42:54Z

To be precise: It takes twice as long to run for x64, and we don't know how much slower it was before, since on x64 it wasn't using SIMD at all, right?

Otherwise this change seems OK to me. Nothing in this PR causes a slowdown by itself, so I think it should go in. Finding the cause of the slowdown is a separate issue.

cdwfs · 2017-08-17T19:45:54Z

> To be precise: It takes twice as long to run for x64, and we don't know how much slower it was before, since on x64 it wasn't using SIMD at all, right?

Not quite, no. You're correct that x64 was previously not using SIMD at all, whether or not it was enabled in the CMake options. What I meant was that now that x64 builds can use SIMD, enabling it actually slows down matrix_benchmarks by a factor of two vs. the SIMD-disabled x64 configuration. My concern is that if this PR is merged as-is without identifying the cause of the slowdown, existing mathfu users who think they've had SIMD enabled all this time will suddenly see a pretty significant performance drop after integrating these changes.
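For reference, here is a minimal sketch of the kind of compile-time check at issue. It assumes the root cause is the one named in #25: MSVC does not define `_M_IX86_FP` on x64 targets, so a check that relies on that macro alone never sees SSE there. The `EXAMPLE_HAS_SSE` macro and `ExampleHasSse()` are hypothetical names for illustration, not mathfu's actual configuration code.

```cpp
// Hypothetical macro names; this is not mathfu's actual detection header.
#if defined(__SSE__) || defined(__SSE2__)
// GCC and Clang define these whenever SSE code generation is enabled.
#define EXAMPLE_HAS_SSE 1
#elif defined(_M_X64) || defined(_M_AMD64)
// MSVC x64: SSE2 is part of the baseline and _M_IX86_FP is not defined,
// so the architecture macro has to be checked explicitly.
#define EXAMPLE_HAS_SSE 1
#elif defined(_M_IX86_FP) && (_M_IX86_FP >= 1)
// MSVC 32-bit x86: 1 means /arch:SSE, 2 means /arch:SSE2 or higher.
#define EXAMPLE_HAS_SSE 1
#else
#define EXAMPLE_HAS_SSE 0
#endif

// Exposes the result of the preprocessor check to ordinary code.
constexpr bool ExampleHasSse() { return EXAMPLE_HAS_SSE != 0; }
```

With a check shaped like this, an MSVC x64 build takes the second branch instead of falling through to the "no SSE" case.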

johnb003 · 2018-01-21T17:36:09Z

I think the FPU <-> memory <-> SIMD conversion is likely the biggest culprit. It can happen implicitly with the current implementation and can be tricky to spot. As for copy constructors and temporary objects, the compiler does a pretty good job of dealing with those. https://en.wikipedia.org/wiki/Copy_elision
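To illustrate the kind of round trip described above, here is a small sketch assuming an x86 target with SSE. `Vec4`, `ScaleViaMemory`, and `ScaleInRegister` are hypothetical names and are not taken from mathfu. The first function spills the SIMD register to memory, does scalar math, and reloads. The second keeps the value in a SIMD register throughout. An optimizing compiler may remove some of the spill, so this shows the pattern rather than a measured cost.

```cpp
#include <xmmintrin.h>  // SSE intrinsics; assumes an x86 target

// Hypothetical wrapper around one SSE register holding four floats.
struct Vec4 {
  __m128 v;
};

// Round trip: SIMD register -> memory -> scalar multiply -> memory -> SIMD.
inline Vec4 ScaleViaMemory(Vec4 a, float s) {
  alignas(16) float tmp[4];
  _mm_store_ps(tmp, a.v);  // spill the register to aligned memory
  for (int i = 0; i < 4; ++i) {
    tmp[i] *= s;  // scalar math on each lane
  }
  return Vec4{_mm_load_ps(tmp)};  // reload into a SIMD register
}

// Same result with no spill: broadcast the scalar and multiply all lanes.
inline Vec4 ScaleInRegister(Vec4 a, float s) {
  return Vec4{_mm_mul_ps(a.v, _mm_set1_ps(s))};
}
```

Both functions return the same values; the difference is only in how the data moves between registers and memory.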

johnb003 · 2018-01-21T22:47:49Z

Oh, I just noticed you did say that the slowdown you saw was in the benchmark samples.
What system did you run the benchmarks on?

cdwfs · 2018-01-26T00:42:24Z

Lenovo ThinkPad P50 laptop.

Before applying this PR:

Running matrix benchmark ([no simd] [no padding])...
Took 107.934558 seconds

After applying this PR:

Running matrix benchmark ([simd] [padding])...
Took 218.015356 seconds
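As a quick check of the "factor of two" figure, the two timings above give a ratio of roughly 2.02. The sketch below just does that division; the constant and function names are made up for illustration. Note that, per the labels printed by the benchmark, the two runs differ in both the SIMD and the padding setting.

```cpp
// Timings copied from the benchmark output above.
constexpr double kBeforeSeconds = 107.934558;  // [no simd] [no padding]
constexpr double kAfterSeconds = 218.015356;   // [simd] [padding]

// Ratio of the slower run to the faster run.
constexpr double SlowdownFactor(double before_seconds, double after_seconds) {
  return after_seconds / before_seconds;
}
```

Calling `SlowdownFactor(kBeforeSeconds, kAfterSeconds)` yields the slowdown of the SIMD-enabled run relative to the SIMD-disabled one.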

Fix missing SSE detection on x64 targets. Fixes google#25

16f8fa5

stewartmiles requested review from a user and haroonq August 16, 2017 19:37

cdwfs mentioned this pull request Jan 17, 2018

SIMD detection doesn't work on MSVC x64 targets #25

Open

rbsheth mentioned this pull request Dec 20, 2018

Hunterize v1.1.0 hunter-packages/mathfu#1

Merged
