Improve sample rounding and clean up noise shaping leftovers #771
b724603: Absolute best rounding method is "round to even". Although this is the default and recommended rounding mode in IEEE 754 ("round to nearest, ties to even"), there are no LLVM intrinsics for it, and it introduces a new dependency. Compiling with |
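To illustrate the difference described above, here is a minimal sketch of "round to nearest, ties to even" next to the standard library's `f64::round`, which sends ties away from zero. The names `round_half_even` and `mean_error` are hypothetical helpers written for this note; this is not the code from b724603. Recent Rust releases also ship `f64::round_ties_even` in the standard library, which would make the hand-written helper unnecessary.

```rust
/// Round to the nearest integer, sending exact ties to the even neighbour.
/// Hypothetical helper for illustration only.
fn round_half_even(x: f64) -> f64 {
    let floor = x.floor();
    let frac = x - floor; // distance of x above its floor
    if frac < 0.5 {
        floor
    } else if frac > 0.5 {
        floor + 1.0
    } else if floor % 2.0 == 0.0 {
        floor // exact tie, lower neighbour is even
    } else {
        floor + 1.0 // exact tie, upper neighbour is even
    }
}

/// Mean rounding error (rounded minus original) over a slice.
/// A non-zero mean on a set of ties exposes the bias of a rounding mode.
fn mean_error(values: &[f64], round: fn(f64) -> f64) -> f64 {
    let total: f64 = values.iter().map(|&v| round(v) - v).sum();
    total / values.len() as f64
}
```

For the positive ties `[0.5, 1.5, 2.5, 3.5]`, `mean_error(&ties, f64::round)` comes out at 0.5 because every tie moves up, while `mean_error(&ties, round_half_even)` is 0.0 because the errors alternate in sign and cancel.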
Going to rebase and merge this. If there is any late opposition to the added 728 bytes of b724603, we can always revert later 😆 |
I don't think 728 bytes is going to make or break it for anyone, lol!!! A monolithic librespot binary compiled for ARMv6 with the ALSA backend is about 13.6 MB. |
@roderickvd I'm not sure what the deal is, but this addition causes librespot to use about twice as much CPU as before during normal use (just playing, not downloading or caching). Going directly to hardware with the Edit: That's with |
For a frame of reference, on my Pi 4 running 64-bit Raspberry Pi OS it uses less than 10% CPU, and on my Fedora x86_64 desktop it's about 0.5% CPU. So clearly not the end of the world, just something to keep in the back of your mind. Edit: Diving a little deeper, it's got to be the rounding. If I switch to |
@roderickvd I'm sorry if I'm talking your ear off but another thing I noticed is the use of |
That's not the same. Randomisation is what dithering is for (and does better). A stochastic rounding function was in the original dithering PR, but it was yanked because it behaves as a rectangular probability density function. That is a uniform distribution, and the theory is too involved to explain here, but it causes intermodulation distortion. (If you're interested, read the publicly available works by Lipshitz and Wannamaker.) The Gaussian or triangular ditherers take care of non-uniform, uncorrelated randomisation. The next step is to round instead of truncate. The best way to round is with the least error, and that means an unbiased function (i.e. round half to even or to odd, also called convergent rounding). "Normal" rounding is biased in that it rounds ties away from zero. We can watch the CPU usage and revert to standard At this point I'm not too concerned and only expect CPU usage to drop with the 64-bit PR. |
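As a rough illustration of the rectangular versus triangular distinction made above, the sketch below builds triangular (TPDF) dither as the sum of two independent uniform draws and applies it before rounding to 16 bits. Everything here is an assumption made for the example: the `XorShift64` generator, the function names, the 16-bit target and the use of plain `f64::round` are not taken from librespot's ditherers.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crate.
/// The seed must be non-zero, otherwise the state stays at zero forever.
struct XorShift64(u64);

impl XorShift64 {
    /// Next uniform value in [0.0, 1.0).
    fn next_unit(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        // Keep the top 53 bits so the value is exactly representable as f64.
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Rectangular (uniform) dither in [-0.5, 0.5) LSB.
fn rpdf(rng: &mut XorShift64) -> f64 {
    rng.next_unit() - 0.5
}

/// Triangular dither in [-1.0, 1.0) LSB: the sum of two uniform draws.
fn tpdf(rng: &mut XorShift64) -> f64 {
    rpdf(rng) + rpdf(rng)
}

/// Scale a normalised sample to 16-bit range, add TPDF dither, then round.
/// Plain `round()` is used here for brevity; the rounding mode is a
/// separate choice from the dither distribution.
fn quantize_i16(sample: f64, rng: &mut XorShift64) -> i16 {
    let scaled = sample * 32767.0 + tpdf(rng);
    scaled.round().clamp(-32768.0, 32767.0) as i16
}
```

A caller would create one generator, for example `XorShift64(0x9E3779B97F4A7C15)`, and pass it to `quantize_i16` for every sample so that successive dither values are uncorrelated.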
Thanks for the explanation. I will take your word for it and skip the technical papers, lol!!! I was just curious.
Good deal. Rebase it and I'll give it a test. |
Try it, just merged it into #773. |
On my RPi 3B it seems about +5% |
That sounds about right, considering it probably only used about 5% to begin with. The Pi Zero is orders of magnitude slower than even a Pi 3. As a frame of reference, it takes about 13 min to compile librespot on a Pi 4 and over 4 hours on a Pi Zero, lol!!! |
20-30% isn't a huge deal under normal playback conditions. My only concern is that CPU usage can spike momentarily to as high as the 70s while caching a track. Another doubling of CPU usage and we hit 100% and audio starts to drop out. Is there a way to give the playback thread a higher priority, or is that done already? |
I'm not sure but inclined to say no. But let's cross that bridge if and when we get there. |
Fair enough. No need for premature optimization.
Well, on a selfish note, I'm currently softening the wife up to let me buy another Pi 4 and a Topping D10s DAC for audio streaming, so I may not care that much for much longer, lol!!! But generally I don't think it's a huge deal. It's not like a Pi Zero is a multi-tasking beast. Realistically you can only expect it to do one thing at a time well. My test setup is a stock install of Raspotify OS lite with nothing else running except the default services and librespot. I don't think that it's unreasonable to say that if you plan on running librespot on a Pi Zero, that's all you're going to run on it. |
That's a great deal. You should come on Gitter if you want to talk gear and other stuff. |
So the verdict is that the 64-bit PR didn't help. Compared with 5d43b7d (the last build I had lying around), CPU usage still basically doubles. There was a cargo update in between, so I can build right after that to make sure it's not a dependency update that's actually causing the problem? |
Nah I’m pretty sure this is it, I’m seeing it too. I’d like to arrive at a different conclusion though, namely that we can have 64-bit sample handling basically for free. |
It's for sure it. I replaced it with |
Two things:
I've been fighting with Rust over something as stupid as point 2 but am sure now:
Without round() this would all return 0.
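The snippet this comment refers to did not survive on the page, so the following is only a guess at the kind of behaviour meant: in Rust, an `as` cast from a float to an integer truncates toward zero, so any value strictly between -1.0 and 1.0 becomes 0 unless it is rounded first. The function names are hypothetical.

```rust
/// Bare cast: the fractional part is discarded (truncation toward zero).
fn cast_only(sample: f64) -> i32 {
    sample as i32
}

/// Round to the nearest integer first, then cast.
fn round_then_cast(sample: f64) -> i32 {
    sample.round() as i32
}
```

With these helpers, `cast_only(0.9)` and `cast_only(-0.9)` both give 0, while `round_then_cast(0.9)` gives 1 and `round_then_cast(-0.9)` gives -1.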