Replies: 2 comments 7 replies
-
Thanks! I'm more than happy to have a look at that. This is something that's needed for FLAC. Indeed I did use the FLAC format specification you linked to, and generally it was a a good reference, but certainly there were a few details that lacked explanation. For the initial implementation of dr_flac, I think from memory I looked at another library for only one thing in particular and that was the coefficient constants for the SUBFRAME_FIXED decoding path. Just browsing the document quickly it looks like you've covered that nicely in section 10.2.5. One thing that screwed me over was the requirement to shift by an extra bit for side channels. It was such a small and unintuitive detail and it's never mentioned in the original specification. I think from memory it took something like 4 days of frustration to figure out what was going on there. It looks like you've got that covered as well in section 5.2. I think there were also details about the rice decoding and prediction calculation which took a bit of time to figure out in which case a more detailed explanation would have been handy. Other than that, I think most other issues were resolved with feedback from the community over time, just like that thing we talked about a while ago about the logic for choosing the 32- vs 64-bit path for the prediction phase (if you haven't talked about that in this new specification, I feel like that would be useful to mention in an appendix). In any case, I'll have a proper look at that new specification when I get a chance and give you my feedback here. Certainly by looking at it quickly just now it looks significantly better than the FLAC format specification at explaining the details. That said, I think the FLAC format specification is still a valuable resource because it does provide a good high level overview of the structure of the format and is really easy to read. |
Beta Was this translation helpful? Give feedback.
-
So I've gone ahead and written a simple FLAC decoder from this spec. I haven't yet implemented all features, most notably CRC checks, metadata and Ogg encapsulation, but it seems to decode the very few files I've tried. It just skips over metadata. You can find it here if you're curious (see the header for usage and comments): https://github.com/mackron/ref-flac Overall the process was very nice. Much easier than the FLAC format specification document. In particular I really liked the explanation of the side channel stuff and the requirement for the extra bit. Probably took about 2-3 days worth of work. I tried to build it straight from the spec without using any outside sources. I just pulled the bit streaming system from dr_flac in addition to some simple utility functions. I didn't look at dr_flac at all for any implementation details of the actual FLAC stuff. I only have a few comments, most of which I think is fairly minor. The main thing I didn't like was the explanation for residual decoding. The section for decoding residual says this:
In my opinion that is too vague. The way the numbers are encoded is mentioned in the paragraph above, and I think the expectation is for the reader just do the reverse for the decoding stage, but I think it's unnecessarily hard to wrap your head around. I think an explicit explanation on how to do the rice and zigzag decoding would be nice. This is the comment I wrote in my code for my own reference (not suggesting you use that exact language, just the kind of explanation that I think would be more useful): https://github.com/mackron/ref-flac/blob/a7836f456680b539a5f9a229f2c96fe43a937286/rflac.c#L1356-L1373. In the end I ended up having to look up a reference to refresh my memory on Rice coding. Another thing I noticed concerned the fixed subframe type. Section 10.2.5 says "first the residual has to be decoded, after which for each sample the prediction can be added", but really the first thing that has to be decoded are the warmup samples. I understand that the intention of that statement is to explain the overall concept, but I find it just slightly confusing and misleading. The warmup samples are indeed mentioned in a paragraph below, and I did know this from prior experience, but it's not quite as explicit as I think it could be. A table like in section 10.2.6 might be useful. Something that looks like this:
The same kind of table exists for the LPC path, and in my opinion I think it'd make it just a bit clearer in the fixed section as well. One other quick and minor thing is that I had to look up a reference for UTF-8 encoding and unary numbers to refresh my memory. This is only a very minor thing because it's easy to find references online, but I would have liked it to just be right there in the spec so I could just implement it then and there. Also, since the UTF-8 encoding used in FLAC isn't technically standard UTF-8 because of how it's extended to 7 bytes, being explicit with that could be a bit safer in my opinion. But overall this was a really good reference! There's still a few things I might review later. I haven't yet done any CRC stuff so I might implement that at some point to get a better feel for how well it's explained in the spec. I also haven't properly reviewed the metadata stuff yet. When I get to that stuff I'll post a comment here. |
Beta Was this translation helpful? Give feedback.
-
Hi @mackron,
Sorry for bothering you this way, I really hope you don't mind.
To implement FLAC decoding in dr_flac, you might have used the FLAC format specification on the FLAC website. If so, you probably needed to dive into some actual code (libFLAC or ffmpeg) to understand the finer points. An IETF working group has tried to improve this document, trying to include those finer points, and is on the verge of turning it in for final checks and publication as an RFC. A so-called working group last call has been issued, which lasts 2 weeks.
As you've had first-hand experience implementing FLAC from scratch, you probably know best what implementation details needed extra attention. It would be very valuable to have your feedback on the FLAC format for this document.
I hope you can provide some insights.
The latest draft can be found here https://datatracker.ietf.org/doc/draft-ietf-cellar-flac/
Beta Was this translation helpful? Give feedback.
All reactions