This repository has been archived by the owner on Apr 3, 2020. It is now read-only.

Only use custom SSE FMUL and FMAC with non-clang compilers.
clang's auto-vectorized C version performs better according to the
Chrome Performance Dashboard.  Searching back through the logs, this
occurred when we switched over to clang by default.

We could try to microoptimize further, but it's less of a maintenance
burden to just let the compiler do its thing!

The main reason the clang version is faster is that it does 2x 128-bit
operations per loop. Simply copying this optimization yields ~97%
similar performance, but makes the SIMD code a bit gnarlier. As such I
chose to simply use the C variant when clang is present.
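
To illustrate what "2x 128-bit operations per loop" means, here is a minimal sketch, not the code from vector_math.cc. It assumes FMAC computes `dest[i] += src[i] * scale`. The names `FmacScalarSketch` and `FmacUnrolledSketch` and the `SKETCH_HAS_SSE` macro are hypothetical, and unaligned loads are used for simplicity.

```cpp
#include <cstddef>

// Hypothetical feature check for this sketch; falls back to scalar code
// on targets without SSE.
#if defined(__SSE__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Plain C loop: the shape a compiler can auto-vectorize on its own.
void FmacScalarSketch(const float* src, float scale, int len, float* dest) {
  for (int i = 0; i < len; ++i)
    dest[i] += src[i] * scale;
}

// Hand-written variant that issues two 128-bit multiply-adds per
// iteration (8 floats), followed by a scalar tail for leftovers.
void FmacUnrolledSketch(const float* src, float scale, int len, float* dest) {
  int i = 0;
#if SKETCH_HAS_SSE
  const __m128 m_scale = _mm_set1_ps(scale);
  for (; i + 8 <= len; i += 8) {
    // First 128-bit lane group: elements i .. i+3.
    __m128 lo = _mm_add_ps(_mm_loadu_ps(dest + i),
                           _mm_mul_ps(_mm_loadu_ps(src + i), m_scale));
    // Second 128-bit lane group: elements i+4 .. i+7.
    __m128 hi = _mm_add_ps(_mm_loadu_ps(dest + i + 4),
                           _mm_mul_ps(_mm_loadu_ps(src + i + 4), m_scale));
    _mm_storeu_ps(dest + i, lo);
    _mm_storeu_ps(dest + i + 4, hi);
  }
#endif
  // Scalar tail (or the whole loop when SSE is unavailable).
  for (; i < len; ++i)
    dest[i] += src[i] * scale;
}
```

The unrolled version needs a second set of loads and stores plus tail handling, which is the extra "gnarliness" being weighed against simply keeping the scalar loop.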

BUG=none
TEST=none

Review URL: https://codereview.chromium.org/599693002

Cr-Commit-Position: refs/heads/master@{#297268}
dalecurtis authored and Commit bot committed Sep 29, 2014
1 parent b91e786 commit 6570731
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions media/base/vector_math.cc
@@ -13,8 +13,15 @@
// NaCl does not allow intrinsics.
#if defined(ARCH_CPU_X86_FAMILY) && !defined(OS_NACL)
#include <xmmintrin.h>
// Don't use custom SSE versions where the auto-vectorized C version performs
// better, which is anywhere clang is used.
#if !defined(__clang__)
#define FMAC_FUNC FMAC_SSE
#define FMUL_FUNC FMUL_SSE
#else
#define FMAC_FUNC FMAC_C
#define FMUL_FUNC FMUL_C
#endif
#define EWMAAndMaxPower_FUNC EWMAAndMaxPower_SSE
#elif defined(ARCH_CPU_ARM_FAMILY) && defined(USE_NEON)
#include <arm_neon.h>