
Make math-mode=fast disabled, it's too dangerous #41607

Closed
wants to merge 2 commits

Conversation

PallHaraldsson
Contributor

@PallHaraldsson PallHaraldsson commented Jul 15, 2021

Given #41592 (comment), it's just too dangerous, and tempting for newbies to see. It's still not dropped entirely; if this works, it should be backported to 1.7.

Fixes: #25028

Also note: the title of the linked issue is wrong ("fast-mode" is incorrect); I just couldn't change the title myself.

@JeffBezanson
Member

This doesn't feel like a healthy way to deal with a problem :)

@PallHaraldsson PallHaraldsson changed the title Make math_mode undocumented, it's too dangerous Make math-mode undocumented, it's too dangerous Jul 15, 2021
@PallHaraldsson
Contributor Author

PallHaraldsson commented Jul 15, 2021

I'm ok with getting rid of the global option entirely; this is just what I had time for now, as I can't dig into removing the code. Could that be done in 1.7.x? If it's not documented, people shouldn't use it... If 1.7 will be LTS, I wouldn't want to see the option there. It should always be safe to reintroduce it if you change your mind.

@PallHaraldsson PallHaraldsson changed the title Make math-mode undocumented, it's too dangerous Make math-mode=fast disabled, it's too dangerous Jul 15, 2021
@PallHaraldsson
Contributor Author

PallHaraldsson commented Jul 15, 2021

I actually meant to only make it undocumented, but I'm not sure (I didn't test the code, just made a quick edit on GitHub); I think I actually disabled the option, which is even better.

@StefanKarpinski on the macro: "We could make it a no-op pre 2.0 though since code would still work." That's not addressed here; only the global option is disabled. Maybe it should rather be turned into a no-op (and kept undocumented). What would deprecation mean for the global option, other than that? Disabling it in 1.x seems just as good and shouldn't be considered a breaking change, while the PR as-is would error on it.

@jakobnissen
Member

Ref #36246

@vchuravy
Member

I am all for removing the global option. Randomly changing the math mode of libraries is a "bad idea"(tm)

@PallHaraldsson
Contributor Author

PallHaraldsson commented Jul 16, 2021

@oscardssmith I like that you're fixing #41600, but it seems like a waste of an excellent brain, yours (and others'; there are two reviewers, and I also see failing tests there). Isn't it best to kill this bug with a sledgehammer, i.e. for all such cases? They are not just in the standard library; the same problem applies to all code in packages too.

With this PR I was thinking of newbies seeing the tempting "fast" option, but really it shouldn't be in the hands of any user. For C/C++ the option makes more sense, as it's in the hands of developers, and separately compiled libraries mitigate the problem there. Curious: how much performance gain does the fast mode give, best case? No gain seems worth it, since this PR isn't taking away the macro.

About your "Performance impact is minimal (max 1 cycle regression)": was that vs. otherwise with the fast option, not vs. IEEE?

Seems the failing tests are false alarms.

@chriselrod
Contributor

chriselrod commented Jul 17, 2021

Curious, how much performance gain is there to the fast mode, best case?

IMO, if performance is gained somewhere, then users should locally change that code to use muladd and/or @simd (or maybe even @fastmath) so that it no longer benefits from --math-mode=fast.

I'm sure there are lots of places that would gain, e.g. when you mix ForwardDiff and StaticArrays, but you'd have to run benchmarks. StaticArrays by itself at least now generally uses muladd, but when you use both, you're likely to have a lot of multiply-by-zero/add-zero operations that could be eliminated with fast math.
I'll try to get around to running a few benchmarks. (But the solution there is to update ForwardDiff/StaticArrays, not to improve --math-mode=fast, IMO.)
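The local opt-in described above can be sketched like this (a minimal sketch; the function names are illustrative, not taken from the thread):

```julia
# Scoped opt-ins to relaxed float semantics, as alternatives to the
# process-wide --math-mode=fast flag.

# muladd lets the compiler fuse a*b + c into a single fma when profitable:
axpy(a, x, y) = muladd(a, x, y)

# @fastmath rewrites only the annotated expression into Base.FastMath calls:
relaxed(x) = @fastmath (1.2 + 0.3x) / 0.8

# Everything else keeps strict IEEE semantics:
strict(x) = (1.2 + 0.3x) / 0.8

println(axpy(2.0, 3.0, 1.0))  # 7.0
```

Both forms are visible in the code and reviewable, unlike the global flag, which changes semantics without any trace at the call site.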

@antoine-levitt
Contributor

Is that really the case? For library code, it's not always clear whether users will want strict floating point semantics or not, is it?

@chriselrod
Contributor

chriselrod commented Jul 17, 2021

Is that really the case? For library code, it's not always clear whether users will want strict floating point semantics or not, is it?

For internal algorithms, users aren't exposed to the particular way you wrote the code. The authors will know whether correctness depends on strict floating point semantics.

For libraries like ForwardDiff differentiating your code, I think they should respect whatever your code was doing. E.g., the derivative rule for * should use *, and the rule for Base.FastMath.mul_fast should use Base.FastMath.mul_fast.
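The idea can be sketched with a toy dual-number type (this `Dual` is a hypothetical stand-in for illustration, not ForwardDiff's actual implementation):

```julia
import Base.FastMath: mul_fast, add_fast

# Toy forward-mode dual number: value plus derivative.
struct Dual
    val::Float64
    der::Float64
end

# Strict rule: differentiating * uses * and + internally,
# preserving the IEEE semantics of the user's code.
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)

# Fast rule: differentiating mul_fast uses mul_fast/add_fast internally,
# preserving the relaxed semantics the user opted into.
mul_fast(a::Dual, b::Dual) =
    Dual(mul_fast(a.val, b.val),
         add_fast(mul_fast(a.der, b.val), mul_fast(a.val, b.der)))
```

Either way, the derivative code mirrors the semantics of the primal code rather than silently upgrading or downgrading it.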

@StefanKarpinski
Member

I do agree that this option is so dangerous as to be effectively useless — I literally cannot think of a scenario where it's a good idea to use this option for Julia. The reason the -ffast-math option to clang and gcc isn't as dangerous is that it applies only to a single compilation unit at a time, so libraries are each compiled with it or not according to whether they need strict IEEE semantics or not. There is no clear notion of a compilation unit in Julia and this doesn't apply to compilation units anyway: it applies to everything, which is why it's so bad.

@chriselrod
Contributor

chriselrod commented Jul 17, 2021

The one thing it helps with is being able to test things like this:

I'm sure there are lots of places that would gain, e.g. when you mix ForwardDiff and StaticArrays, but you'd have to run benchmarks

Starting Julia normally:

julia> @time using StaticArrays, ForwardDiff
  0.733407 seconds (3.17 M allocations: 231.400 MiB)

julia> f(x) = prod(abs2, x)
f (generic function with 1 method)

julia> g(x) = @. (1.2 + 0.3x) / 0.8
g (generic function with 1 method)

julia> h(x) = f(g(x))
h (generic function with 1 method)

julia> x = @SVector rand(3);

julia> @benchmark ForwardDiff.gradient(h, $(Ref(x))[])
BenchmarkTools.Trial: 10000 samples with 995 evaluations.
 Range (min … max):  26.547 ns … 46.164 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     26.592 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   26.627 ns ±  0.442 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇█▇█▃                                                       ▂
  █████▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▅▄▇█▇███ █
  26.5 ns      Histogram: log(frequency) by time      27.7 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark ForwardDiff.hessian(h, $(Ref(x))[])
BenchmarkTools.Trial: 10000 samples with 979 evaluations.
 Range (min … max):  70.335 ns … 109.864 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     71.298 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   71.361 ns ±   0.691 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                    ▂▁ ▇▁▅▇█▃▁▆
  ▂▁▁▁▁▂▂▂▂▂▂▂▂▂▄▆▄▆███████████▇▄█▂▂▃▂▂▂▁▁▂▂▂▂▂▂▂▂▂▃▃▃▃▃▃▃▃▃▂▂ ▃
  70.3 ns         Histogram: frequency by time         72.7 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

With --math-mode=fast:

julia> @time using StaticArrays, ForwardDiff
  0.752235 seconds (3.17 M allocations: 231.408 MiB, 2.81% gc time)

julia> f(x) = prod(abs2, x)
f (generic function with 1 method)

julia> g(x) = @. (1.2 + 0.3x) / 0.8
g (generic function with 1 method)

julia> h(x) = f(g(x))
h (generic function with 1 method)

julia> x = @SVector rand(3);

julia> @benchmark ForwardDiff.gradient(h, $(Ref(x))[])
BenchmarkTools.Trial: 10000 samples with 997 evaluations.
 Range (min … max):  21.576 ns … 52.098 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     21.689 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   21.701 ns ±  0.535 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃▆▆  ▆█▆                                                    ▂
  █████████▃▁▁▁▁▁▁▃▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃▅▅▆▆▆▆▇█ █
  21.6 ns      Histogram: log(frequency) by time      22.7 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark ForwardDiff.hessian(h, $(Ref(x))[])
BenchmarkTools.Trial: 10000 samples with 993 evaluations.
 Range (min … max):  33.842 ns … 106.685 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     34.103 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   34.160 ns ±   1.010 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▁▃▃▂   ▃▇█▇▅▂                                          ▁    ▂
  ▅████▆▁▄██████▆▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▁▄▆▄▄▅▆▇████▇█ █
  33.8 ns       Histogram: log(frequency) by time      35.3 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Starting normally again, but now using @fastmath:

julia> @time using StaticArrays, ForwardDiff
  0.736197 seconds (3.17 M allocations: 231.400 MiB)

julia> f(x) = prod(Base.FastMath.abs2_fast, x)
f (generic function with 1 method)

julia> g(x) = @fastmath @. (1.2 + 0.3x) / 0.8
g (generic function with 1 method)

julia> @macroexpand  @fastmath @. (1.2 + 0.3x) / 0.8
:(Base.FastMath.div_fast.(Base.FastMath.add_fast.(1.2, Base.FastMath.mul_fast.(0.3, x)), 0.8))

julia> h(x) = f(g(x))
h (generic function with 1 method)

julia> x = @SVector rand(3);

julia> @benchmark ForwardDiff.gradient(h, $(Ref(x))[])
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
 Range (min … max):  16.648 ns … 32.662 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     16.663 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   16.692 ns ±  0.361 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▆                                                          ▁
  ██▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▁▅▇▇▆▇ █
  16.6 ns      Histogram: log(frequency) by time      17.7 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark ForwardDiff.hessian(h, $(Ref(x))[])
BenchmarkTools.Trial: 10000 samples with 755 evaluations.
 Range (min … max):  167.307 ns … 224.074 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     167.736 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   167.943 ns ±   1.036 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

          █▆▂▇▅
  ▂▂▂▂▂▂▃▆█████▄▂▂▂▁▁▂▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂ ▃
  167 ns           Histogram: frequency by time          170 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

So it does at least let us test whether we can improve performance in certain microbenchmarks by making a few changes to certain libraries (i.e., making sure Base.FastMath.mul_fast(::ForwardDiff.Dual, ...) inlines) so that the @fastmath version of ForwardDiff.hessian above can match the --math-mode=fast ForwardDiff.hessian.

@StefanKarpinski
Member

There's no question that --math-mode=fast can make things faster. But there's no way to use it safely because the option isn't scoped in any way, unlike the similar -ffast-math option for gcc and clang which is scoped to the compilation unit. When using --math-mode=fast how can you know the results aren't very quickly computed garbage?

@chriselrod
Contributor

chriselrod commented Jul 17, 2021

There's no question that --math-mode=fast can make things faster. But there's no way to use it safely because the option isn't scoped in any way, unlike the similar -ffast-math option for gcc and clang which is scoped to the compilation unit. When using --math-mode=fast how can you know the results aren't very quickly computed garbage?

I agree, and I'm fine with removing it.
I haven't used --math-mode=fast often, except for testing what the expected returns of making a change to a library might be.
Seeing a 5% improvement might not be enough to motivate someone to make the effort, but a benchmark taking half the time shows it's probably worth pursuing (i.e., making a PR to add/change the necessary diff rules).

That is, I think it's potentially useful for saying that maybe optimizations are possible. But you'd better double-check that it's actually true -- maybe it's fast because the results are quickly computed garbage.
I agree that no one should trust running real code under it for anything other than that sort of exploratory purpose.

@antoine-levitt
Contributor

Renaming it to --math-mode=broken and documenting that it essentially should never be used for anything serious could be sufficient, then. I do feel it could be useful as an "unsafe-by-default" variant that would require any floating-point-sensitive code to use @nofastmath, but yeah, it's probably better to just @fastmath the places that could really benefit. In any case, as @chriselrod points out, it's useful as a diagnostic, just like --check-bounds.

The reason the -ffast-math option to clang and gcc isn't as dangerous is that it applies only to a single compilation unit at a time, so libraries are each compiled with it or not according to whether they need strict IEEE semantics or not. There is no clear notion of a compilation unit in Julia and this doesn't apply to compilation units anyway: it applies to everything, which is why it's so bad.

Aren't modules the closest equivalent? @fastmath could meaningfully be a module-level macro like @optlevel.

@KristofferC
Member

The problem is not @fastmath which is already scoped, it is the global command line option.
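The scoping difference is visible directly in the macro expansion (a small sketch; the function names are illustrative):

```julia
# @fastmath is lexically scoped: it rewrites only the wrapped expression
# into Base.FastMath calls, leaving all other code on strict IEEE
# semantics. The global --math-mode=fast flag, in contrast, affected
# code generation everywhere at once.
ex = @macroexpand @fastmath x + y
println(ex)  # the + has been replaced by Base.FastMath.add_fast

g_strict(x) = x + 1.0            # strict IEEE addition
g_fast(x)   = @fastmath x + 1.0  # relaxed semantics, only in this expression
```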

@PallHaraldsson
Contributor Author

PallHaraldsson commented Jul 18, 2021

Right, so what should we do with the PR, just merge it?

I'm not a huge fan of julia --help showing a "broken" option. I would like "fast" gone from there (it could be a no-op rather than an error, but I'm not sure that's hugely important if few use the option). "Broken" (or "debug") or whatever we come up with could be added in a separate PR, even in 1.7.x, possibly undocumented or under --help-hidden.

@KristofferC
Member

Right, so what should we do with the PR, just merge it?

It doesn't seem there is really a consensus on the exact way to deal with --math-mode=fast, and it can't really just be merged, since it is breaking.

Also, if the option is removed, all the code that handles that option should be removed with it.

PallHaraldsson added a commit to PallHaraldsson/julia that referenced this pull request Jul 19, 2021
@JeffBezanson JeffBezanson added the triage This should be discussed on a triage call label Jul 19, 2021
@PallHaraldsson
Contributor Author

Bump. Since 1.7 seems close, this just needs a decision.

@KristofferC
Member

How is this related to 1.7?

@StefanKarpinski
Member

The 1.7 release is on its fourth beta, it's way too late for features.

@PallHaraldsson
Contributor Author

PallHaraldsson commented Aug 6, 2021

Ok, I was thinking it might be an exception, as a trivial dropping of an anti-feature, which is arguably a bug (in case 1.7 will be an LTS, it's best to drop it in 1.7.0 so that all later releases have the same "user API"). Even just dropping it from julia --help (and the docs) would be an improvement.

It would be trivial to add it back if people complain during the 1.7-rc phase (or in 1.7.x). My other PR #41638 (comment) changes the option without breaking scripts that might use it.

@StefanKarpinski
Member

This has been the status quo for over three years, there's zero urgency about deleting this feature.

@JeffBezanson
Member

From triage: we need to keep the option (to avoid breaking anything), but it should do nothing and we can remove the code for it. Can be marked as deprecated.

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Aug 12, 2021
@PallHaraldsson
Contributor Author

PallHaraldsson commented Oct 19, 2021

You can reopen #41638 (as a replacement PR) and close this one.

@ViralBShah ViralBShah marked this pull request as draft March 12, 2022 22:54
@KristofferC
Member

Obsoleted by #41638

Successfully merging this pull request may close these issues.

deprecate global --math-mode=fast flag?
8 participants