
Using lz4_flex to decompress a stream of data that arrives in random chunks #177

Open
billyc opened this issue Dec 11, 2024 · 1 comment


billyc commented Dec 11, 2024

Hi, I'm a newbie to Rust but very excited so far =) and I'm a bit unsure how to accomplish what I want with this library.

(I have already implemented this for gzip files using zlib-rs, and it's working great, so I know it's at least possible. But I much prefer lz4 over gzip because of the decompression speed... however, the example code in this crate only shows a single stream being read from std::io.)

Use case: I am in WASM. The browser thread sends our Rust program one chunk at a time of a very large lz4-encoded file. The file is split at random points which I have no control over: sometimes the browser sends me 16k, 64k, sometimes 200k bytes. It knows nothing about the underlying data: it just knows it is some sort of compressed binary data and sends it to my Rust thread.

Every time I receive a new chunk, I want to decompress as much as possible and return back to the browser the decompressed data. At the beginning maybe zero bytes can be decompressed because it doesn't have a complete block or frame (not sure how LZ4 works internally). So there are some "leftover" bytes every time that can't be decompressed until the next chunk arrives. This means that with each new arrival, I need to append the "leftovers" from the previous chunk to this new chunk, and try to continue decompressing.

I would return any decompressed data after each submittal and wait for the next chunk. When the final chunk is sent: hooray! there are no more leftovers! Everything should finish off nicely after the final chunk is submitted, with the final decompressed data sent back to my browser. Done!
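For what it's worth, the "carry leftovers between chunks" bookkeeping described above can be sketched independently of any codec. The framing below is a toy 4-byte little-endian length prefix, NOT LZ4's actual frame format, and `ChunkedDecoder` is a hypothetical name; it only illustrates the state machine of buffering an incomplete tail until the next chunk arrives:

```rust
/// Toy sketch of the leftover-carrying state machine described above.
/// The framing is a made-up 4-byte little-endian length prefix,
/// NOT LZ4's real frame format.
struct ChunkedDecoder {
    pending: Vec<u8>, // bytes that don't yet form a complete frame
}

impl ChunkedDecoder {
    fn new() -> Self {
        Self { pending: Vec::new() }
    }

    /// Feed one arbitrarily split chunk; return every payload that is
    /// now complete. An incomplete tail stays in `pending` for next time.
    fn push(&mut self, chunk: &[u8]) -> Vec<Vec<u8>> {
        self.pending.extend_from_slice(chunk);
        let mut out = Vec::new();
        loop {
            if self.pending.len() < 4 {
                break; // not even a full length prefix yet
            }
            let len = u32::from_le_bytes(self.pending[..4].try_into().unwrap()) as usize;
            if self.pending.len() < 4 + len {
                break; // payload still incomplete; wait for more data
            }
            out.push(self.pending[4..4 + len].to_vec());
            self.pending.drain(..4 + len);
        }
        out
    }
}

fn main() {
    let mut dec = ChunkedDecoder::new();
    // A frame "abc" split at an awkward point:
    assert!(dec.push(&[3, 0, 0, 0, b'a']).is_empty()); // leftovers only
    assert_eq!(dec.push(&[b'b', b'c']), vec![b"abc".to_vec()]);
    println!("ok");
}
```

The real question is then how to plug LZ4's actual framing into this shape, since lz4_flex exposes a `Read`-based decoder rather than a push API.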

Has anyone implemented something similar with a streaming, chunked, lz4-based decompressor? I would absolutely appreciate any hints or sample code or guidance on how to do this. The chatbots are no help... looking for some inspiration.

Perhaps this belongs more on stack overflow than here... please let me know if I am asking for (too much) help in the wrong place.

Kind regards and happy to provide more info.

@PSeitz (Owner) commented Jan 26, 2025

I'm not sure this is possible, but I think the Reader passed to the FrameDecoder could block when the next block is not yet available. The data for a block needs to be completely available, though.

https://github.com/PSeitz/lz4_flex/blob/main/src/frame/decompress.rs#L231

> The chatbots are no help... looking for some inspiration.

The code next to the Frame and Block format description should be quite readable, and give you a better understanding than a chatbot.
