Skip to content

Commit

Permalink
Fix fast_exp2 with MSVC and sse4.2 (AcademySoftwareFoundation#3804)
Browse files Browse the repository at this point in the history
I ran into the problem described in AcademySoftwareFoundation#2929 and did a little digging to
see if I could find the source of the bad color conversions. I tracked
it down to `fast_exp2()`
https://github.com/OpenImageIO/oiio/blob/a77e36250fadfb4c391688eb1df4b3dd88d5e367/src/include/OpenImageIO/fmath.h#L2050-L2080

and in particular the return statement on L2071. In the process of
trying to figure out what part of that return statement was going awry,
I found that separating the bitcasts to separate variables as in this
PR, the problem went away. All testing was done with the `color_test`
added in AcademySoftwareFoundation#3023.

It also succeeded when adding various `std::cout` lines to check what
the values of each computation were, so it does sound like a compiler
optimization that does or does not get triggered is the problem.

Unfortunately I don't know why it's not failing on the CI builds, I
double checked to make sure I'm using the same settings and flags. I
also tested on two different machines and get the same failures, and
success with this patch, on both.

One strange wrinkle is that even without this patch, color conversions
worked fine when using AVX2 for the SIMD mode.

Summary :
Previous results (without this patch)

- USE_SIMD=0 - PASS
- USE_SIMD=SSE4.2 - FAIL
- USE_SIMD=AVX - FAIL
- USE_SIMD=AVX2 - PASS

With this patch
- USE_SIMD=0 - PASS
- USE_SIMD=SSE4.2 - PASS
- USE_SIMD=AVX - PASS
- USE_SIMD=AVX2 - PASS

Visual Studio 2019 versions 16.11.32228.343, 16.11.33328.57 and
16.11.33423.256(same as current Github build runners) in RelWithDebInfo
mode.
  • Loading branch information
ericmehl authored and lgritz committed Apr 15, 2023
1 parent 5187a52 commit 8ec4a8d
Showing 1 changed file with 13 additions and 1 deletion.
14 changes: 13 additions & 1 deletion src/include/OpenImageIO/fmath.h
Original file line number Diff line number Diff line change
Expand Up @@ -2069,7 +2069,19 @@ OIIO_FORCEINLINE OIIO_HOSTDEVICE T fast_exp2 (const T& xval) {
r = madd(x, r, kD);
r = madd(x, r, kE);
r = madd(x, r, one);
return bitcast_to_float (bitcast_to_int(r) + (m << 23));

// Original code was:
// return bitcast_to_float (bitcast_to_int(r) + (m << 23));
// This was producing the wrong results when called via `sRGB_to_linear()`
// and `linear_to_sRGB()` on some Windows MSVC builds.
// We found that it was fixed by using a temporary intN, as below.
// Presumed to be an optimizer bug in MSVC versions 16.11.x
// and earlier.
// See PR #3804

intN i = bitcast_to_int(r);
T f = bitcast_to_float(i + (m << 23));
return f;
#else
T r;
for (int i = 0; i < r.elements; ++i)
Expand Down

0 comments on commit 8ec4a8d

Please sign in to comment.