Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support streaming compression #64

Merged
merged 3 commits into from
Sep 6, 2021
Merged

Conversation

milesgranger
Copy link
Owner

@milesgranger milesgranger commented Sep 2, 2021

Adds Compressor object for all variants allowing for streaming compression.
Mimicks the API of brotli.Compressor except it will return the number of bytes written instead of b''

ie

>>> import cramjam
>>> compressor = cramjam.gzip.Compressor()
>>> compressor.compress(b'foo')
3
>>> compressor.compress(b'bar')
3
>>> out = compressor.finish()
>>> bytes(out)
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\xffK\xcb\xcfOJ,\x02\x00\x95\x1f\xf6\x9e\x06\x00\x00\x00'
>>> decompressed = cramjam.gzip.decompress(out)
>>> decompressed
cramjam.Buffer(len=6)
>>> bytes(decompressed)
b'foobar'
>>> 

Brought to my attention in developmentseed/starlette-cramjam#1


I'm not convinced that a Decompressor would be needed. In brotli, brotli.Decompressor().decompress() behaves exactly the same as brotli.decompress. And I suppose most (all?) algorithms cannot decompress some arbitrary amount of bytes from a previously encoded stream. And in python-snappy, the snappy.stream_decompress seems to perform basically what decompress_into does; a compressed input into another decompressed output.


TODO:

  • Ideally BytesType as the output type, but think the lifetime parameter won't work well with #[pyclass]. And unlikely able to change that since BytesType can (and should) be able to reference Python owned types like bytes/bytearray/etc.
    • Seemingly too painful given the required lifetime parameter of BytesType; will try more later
  • Generalize the .finish() methods in Rust. However, not all encoders follow the same API, so a few are slightly different.

src/brotli.rs Outdated Show resolved Hide resolved
@milesgranger
Copy link
Owner Author

@messense

Sorry to bother; we're suddenly failing on linux-cross, both from 0.14.1 and 0.14.4
I don't think anything in this PR is the culprit and verified in #65 that master fails as well.

Compiling pyo3-macros v0.14.4
error[E0432]: unresolved import self::string::PyStringData
--> /root/.cargo/registry/src/github.jparrowsec.cn-1ecc6299db9ec823/pyo3-0.14.4/src/types/mod.rs:27:9
|
27 | pub use self::string::PyStringData;
| ^^^^^^^^^^^^^^------------
| | |
| | help: a similar name exists in the module: PyString
| no PyStringData in types::string

error[E0425]: cannot find function, tuple struct or tuple variant PyUnicode_IS_READY in this scope
--> /root/.cargo/registry/src/github.jparrowsec.cn-1ecc6299db9ec823/pyo3-0.14.4/src/ffi/cpython/unicodeobject.rs:218:19
|
218 | debug_assert!(PyUnicode_IS_READY(op) != 0);
| ^^^^^^^^^^^^^^^^^^ not found in this scope

error: aborting due to 2 previous errors

@messense
Copy link
Contributor

messense commented Sep 2, 2021

See PyO3/pyo3#1824 (comment) , just need to wait for pyo3 0.14.5.

@milesgranger
Copy link
Owner Author

Thanks, and sorry! I looked at the same issue but didn't go far enough down I guess... 😅

@messense
Copy link
Contributor

messense commented Sep 6, 2021

we're suddenly failing on linux-cross, both from 0.14.1 and 0.14.4

BTW, consider committing the Cargo.lock file to git so that this won't happen(0.14.1 should work, it didn't work for you because lockfile is missing so Cargo always choose the highest possible version).

@milesgranger milesgranger marked this pull request as ready for review September 6, 2021 18:05
@milesgranger milesgranger force-pushed the streaming-updates-n-fixes branch from 3191ad2 to 79f64a0 Compare September 6, 2021 18:14
@milesgranger milesgranger merged commit 0a93fd8 into master Sep 6, 2021
@milesgranger milesgranger deleted the streaming-updates-n-fixes branch September 6, 2021 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants