-
-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Replace FFmpeg with GStreamer #161
Conversation
a0664f9
to
8954444
Compare
Did a bit more work and it actually seems like it might be the decoder's fault. For laughs I went and submitted the entire buffer at once and attempted to pull samples from that, and it continued to send blocks incorrectly in the exact same way, so the way we submit data has absolutely no relevance, the decoder simply refuses to present us with the right number of samples. This is a problem because the dpds table actually matters a lot, we can't just decode linearly and expect everything to be safe, in rare cases we have to go backward in time and we also have to concern ourselves with play regions, so lazily decoding is not an option. |
OOC, which decoder are you actually using here? The Fluendo WMA decoder or the ffmpeg one? The latter should literally give you the same amount of samples per packet as the previous ffmpeg code did (unless your conversion actually causes resampling). |
I’ve tested both Fluendo and gst-libav, both produce different results. We know the latter will be wrong per #95, but I would expect a more official codec to be accurate... I’ll see what their team has to say. The existing FFmpeg decoder ignores the block size problem to sound mostly correct, at the expense of numerous issues that are potentially unsafe to the user should the decoder veer off too far from the intended result. |
5ee7a3a
to
0a05b65
Compare
Not that it's going to help you, but with gst-libav you should get exactly the same results in this regard as with using ffmpeg directly. There's no additional padding, buffering, clipping, etc. between the calls to ffmpeg and the GStreamer layer here. I can't speak for the Fluendo decoder but I would assume it to be based on the official Microsoft SDK.
So if I understand you correctly, what you must be able to do is:
And I assume it's not enough for you to only get the correct number of samples overall for the whole stream. I think in that case your only option is to directly use a single WMA decoder library (and no abstraction like ffmpeg or GStreamer on top of it) directly, as only then you can actually guarantee these points. Otherwise you have no control over the actual decoder implementation and it can behave different as it has different requirements than you. |
A direct decoder would be great, but I’m unaware of anything like that other than the MS SDK, which is private. I definitely would have gone for that from the start, similar to how we use stb_vorbis for that format. FFmpeg comes really close to doing what we want, it just isn’t accurate enough. If it was, we would be able to get the right size since we only submit one packet at a time and it doesn’t keep samples from us. |
Any progress on this? |
Can you tell that there are zero examples for bytes -> decoder -> bytes?
0a05b65
to
e3a36c8
Compare
If someone wants to test this pr on arch, i made packages (rename to .tar.xz): |
Sound in fallout 4 not working, log with gst_debug=4 |
Do not use this patch unless you are using a custom GStreamer codec that does not exist. The existing decoders will not work correctly with this system. |
Ok, get it. |
Closing in favor of #192, which includes this patch. |
I've written a new WMA decoder that uses GStreamer instead of FFmpeg directly. There are a few reasons for this:
The current revision mostly works, but similar to #95 it seems like GStreamer (or possibly the decoder?) is withholding samples, even though each packet should be entirely self-sufficient without the need for information from neighboring packets. You can see the difference by setting
REMOVE_ME
to 1 inFAudio_gstreamer.c
; the intended size should always be exactly the same as the actual size, and yet it isn't most of the time! If there was a way to force a full decode of the block and flush the entire decoded stream every time a packet was sent, it would most likely fix this problem.This can be tested by setting
GSTREAMER=ON BUILD_UTILS=ON
in the CMake configuration and runningtestxwma
with this test file. GStreamer debugging information can be found here.