-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reintroduce fast math functions #7495
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good.
@messmerd Having thought of it a bit, how about we just do the raw |
0 chance, because the code already there uses
I have another idea... using fma instead of calling fma (explained below). But I have no strong opinion either way, @DomClark @messmerd A quick check of Dom's findings show he's correct, when using that However, as messmerd mentioned on Discord, FMA has different precision behavior than the standard multiply-add, and I found it odd that the compiler would choose to optimize something in a way that objectively changes its behavior. As it turns out, GCC has (I don't recommend this ↓) (I recommend this, if possible ↓) IF all of the hardware we target supports FMA. I don't know whether this is the case. If this isn't the case, then we can still allow it to potentially be enabled for compiled builds, but it would have to be disabled for releases on the website or else the program simply wouldn't run on that hardware. In any case, unless we take the aforementioned manual FMA approach with I don't know a lot about this topic, so hopefully I'm not being completely glaringly wrong on every front here in some way I'm not noticing. That would be embarrassing. |
Thanks for the clarification lost, now another question, would it make sense to change other calls to |
Probably not, only if you really know what you're doing. I recommend not bothering with replacing past pow uses except in cases where its usage is causing a noticeable performance detriment. Do not use it in cases where the pow base is known at compile time (e.g. pow(2,x) or pow(10,x)), those can be optimized in significantly better and more accurate ways than a general-purpose pow approximation. I'll eventually make a PR to optimize LMMS's dBFS/amplitude conversion functions specifically. |
On GCC with -O1 or -O2 optimizations, this new implementation generates identical assembly to the old union-based implementation
@LostRobotMusic you did say you use std::pow in LOMM right? Wanna replace? |
No, I said I use the dBFS/amplitude conversion functions. In my last message I said I'll make a PR to optimize those. |
Regarding LOMM, I remember you said somewhere in discord. Per : https://discord.com/channels/203559236729438208/784594576223633478/1286903926653714433 I dug out the exact chat this time to avoid flak. |
LOMM uses LMMS's dbfsToAmp function, and LMMS's dbfsToAmp function uses std::pow. std::pow is not directly used anywhere in LOMM's code, and as I said, I'll eventually make a PR to optimize LMMS's dBFS/amplitude conversion functions specifically. |
Okk, i misunderstood that point. |
* Add fast fma functions * Use fast fma functions * Add fast pow function * Use fast pow function * Fix build * Remove fastFma * Avoid UB in fastPow On GCC with -O1 or -O2 optimizations, this new implementation generates identical assembly to the old union-based implementation
Added the
fastPow
function back intolmms_math.h
after it was mistakenly removed in #7382, and removed the undefined behavior caused by type punning with a union.Replaced calls to
std::fma
withx * y + z
(see discussion in review comments below).I left the fast
sqrt
function out because it was unused.