
Using lz4_flex to decompress a stream of data that arrives in random chunks #177

Open
billyc opened this issue Dec 11, 2024 · 1 comment


billyc commented Dec 11, 2024

Hi, I'm a newbie to Rust but very excited so far =) and I'm a bit unsure how to accomplish what I want with this library.

(I have already implemented this for gzip files using zlib-rs, and it's working great, so I know it's at least possible. But I much prefer lz4 over gzip because of the decompression speed... however, the example code in this crate only shows a single stream being read from std::io.)

Use case: I am in WASM. The browser thread sends our Rust program one chunk at a time of a very large lz4-encoded file. The file is split at random points which I have no control over: sometimes the browser sends me 16k, 64k, sometimes 200k bytes. It knows nothing about the underlying data: it just knows it is some sort of compressed binary data and sends it to my Rust thread.

Every time I receive a new chunk, I want to decompress as much as possible and return back to the browser the decompressed data. At the beginning maybe zero bytes can be decompressed because it doesn't have a complete block or frame (not sure how LZ4 works internally). So there are some "leftover" bytes every time that can't be decompressed until the next chunk arrives. This means that with each new arrival, I need to append the "leftovers" from the previous chunk to this new chunk, and try to continue decompressing.

I would return any decompressed data after each submittal and wait for the next chunk. When the final chunk is sent: hooray! there are no more leftovers! Everything should finish off nicely after the final chunk is submitted, with the final decompressed data sent back to my browser. Done!
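For what it's worth, the "carry leftovers between chunks" bookkeeping described above can be sketched independently of any codec. The framing below is a toy 4-byte little-endian length prefix, NOT LZ4's actual frame format, and `ChunkedDecoder` is a hypothetical name; it only illustrates the state machine of buffering an incomplete tail until the next chunk arrives:

```rust
/// Toy sketch of the leftover-carrying state machine described above.
/// The framing is a made-up 4-byte little-endian length prefix,
/// NOT LZ4's real frame format.
struct ChunkedDecoder {
    pending: Vec<u8>, // bytes that don't yet form a complete frame
}

impl ChunkedDecoder {
    fn new() -> Self {
        Self { pending: Vec::new() }
    }

    /// Feed one arbitrarily split chunk; return every payload that is
    /// now complete. An incomplete tail stays in `pending` for next time.
    fn push(&mut self, chunk: &[u8]) -> Vec<Vec<u8>> {
        self.pending.extend_from_slice(chunk);
        let mut out = Vec::new();
        loop {
            if self.pending.len() < 4 {
                break; // not even a full length prefix yet
            }
            let len = u32::from_le_bytes(self.pending[..4].try_into().unwrap()) as usize;
            if self.pending.len() < 4 + len {
                break; // payload still incomplete; wait for more data
            }
            out.push(self.pending[4..4 + len].to_vec());
            self.pending.drain(..4 + len);
        }
        out
    }
}

fn main() {
    let mut dec = ChunkedDecoder::new();
    // A frame "abc" split at an awkward point:
    assert!(dec.push(&[3, 0, 0, 0, b'a']).is_empty()); // leftovers only
    assert_eq!(dec.push(&[b'b', b'c']), vec![b"abc".to_vec()]);
    println!("ok");
}
```

The real question is then how to plug LZ4's actual framing into this shape, since lz4_flex exposes a `Read`-based decoder rather than a push API.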

Has anyone implemented something similar with a streaming, chunked, lz4-based decompressor? I would absolutely appreciate any hints or sample code or guidance on how to do this. The chatbots are no help... looking for some inspiration.

Perhaps this belongs more on stack overflow than here... please let me know if I am asking for (too much) help in the wrong place.

Kind regards and happy to provide more info.

@PSeitz (Owner) commented Jan 26, 2025

I'm not sure this is possible, but I think the Reader passed to the FrameDecoder could block when the next block is not yet available. The data for a block needs to be completely available, though.

https://github.com/PSeitz/lz4_flex/blob/main/src/frame/decompress.rs#L231

> The chatbots are no help... looking for some inspiration.

The code next to the Frame and Block format description should be quite readable, and give you a better understanding than a chatbot.
