[API Proposal]: Allow opening (raw) compressed archive entries in ZipArchiveEntry #63155
Comments
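Tagging subscribers to this area: @dotnet/area-system-io-compression
Issue Details
Background and motivation
Right now, ZipArchive only supports opening entries compressed with Stored, Deflate and Deflate64. While there are open issues about adding support for more specified methods such as LZMA, I would like to propose an orthogonal solution to this problem.
Allow access to the raw compressed streams in the zip file, and the compression method flag in the entry.
I am far from an expert on the zip file format, but from my rudimentary understanding of it, this should be possible?
API Proposal
namespace System.IO.Compression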
{
    public class ZipArchiveEntry
    {
        public ZipCompressionMethod CompressionMethod { get; }
        public Stream OpenRaw();
    }
    public class ZipArchive
    {
        public ZipArchiveEntry CreateEntry(string entryName, ZipCompressionMethod compression);
    }
    public enum ZipCompressionMethod : short
    {
        // Corresponds to the compression method described by APPNOTE.TXT section 4.4.5
        Stored = 0,
        Deflate = 8,
        Bzip2 = 12,
        Lzma = 14,
        Zstd = 93
    }
}
API Usage
Using third-party decompression streams with ZipArchive:
var zipArchive = new ZipArchive(..., ZipArchiveMode.Read);
var entry = zipArchive.GetEntry("foo.json");
Debug.Assert(entry.CompressionMethod == ZipCompressionMethod.Zstd);
// Imagine a ZstdStream from a third-party library.
var stream = new ZstdStream(entry.OpenRaw(), CompressionMode.Decompress);
Copying compressed blobs between zip files:
ZipArchive a = ...;
ZipArchive b = ...;
var aEntry = a.GetEntry("foo.json");
var bEntry = b.CreateEntry("foo.json", aEntry.CompressionMethod);
aEntry.OpenRaw().CopyTo(bEntry.OpenRaw());
Alternative Designs
No response
Risks
No response
I have a real-world use case for this also. I recently implemented my own incomplete parser for ZIP archives to use LibDeflate as the decompressor, which got me some nice speedups. It would be nice to be able to use the structure parsing with my own compression libs.
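For readers wondering what such a hand-rolled parser involves today, here is a minimal sketch. It is not the commenter's code, and it assumes a single-entry archive whose first local file header sits at offset 0. The helper names `BuildSampleZip` and `ReadFirstLocalHeader` are made up for illustration; the field offsets follow the local file header layout in APPNOTE.TXT. The built-in `DeflateStream` stands in for a third-party decompressor such as LibDeflate.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;
using System.IO.Compression;
using System.Text;

// Build a small archive in memory so the sketch is self-contained.
byte[] zipBytes = BuildSampleZip("foo.json", "{\"hello\":\"hello hello hello hello\"}");

var (method, dataOffset) = ReadFirstLocalHeader(zipBytes);
Console.WriteLine($"compression method: {method}, raw data starts at byte {dataOffset}");

if (method != 8)
    throw new NotSupportedException("This sketch only handles Deflate (method 8).");

// A Deflate stream marks its own end, so the decompressor can simply be
// pointed at the data offset. Copying the raw bytes instead would need the
// compressed size, which is best taken from the central directory.
string text;
using (var raw = new MemoryStream(zipBytes, dataOffset, zipBytes.Length - dataOffset))
using (var inflate = new DeflateStream(raw, CompressionMode.Decompress))
using (var reader = new StreamReader(inflate, Encoding.UTF8))
{
    text = reader.ReadToEnd();
}
Console.WriteLine(text);

// Hypothetical helper: reads the method field and computes where the entry data begins.
static (ushort Method, int DataOffset) ReadFirstLocalHeader(ReadOnlySpan<byte> zip)
{
    // Local file header signature "PK\x03\x04", stored little-endian.
    if (BinaryPrimitives.ReadUInt32LittleEndian(zip) != 0x04034b50)
        throw new InvalidDataException("No local file header at offset 0.");

    ushort method = BinaryPrimitives.ReadUInt16LittleEndian(zip.Slice(8));       // compression method
    ushort nameLength = BinaryPrimitives.ReadUInt16LittleEndian(zip.Slice(26));  // file name length
    ushort extraLength = BinaryPrimitives.ReadUInt16LittleEndian(zip.Slice(28)); // extra field length

    // The fixed part of the header is 30 bytes, followed by name and extra field.
    return (method, 30 + nameLength + extraLength);
}

// Hypothetical helper: writes a one-entry archive with the existing ZipArchive API.
static byte[] BuildSampleZip(string name, string content)
{
    using var buffer = new MemoryStream();
    using (var archive = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
    {
        var entry = archive.CreateEntry(name, CompressionLevel.Optimal);
        using var writer = new StreamWriter(entry.Open());
        writer.Write(content);
    }
    return buffer.ToArray();
}
```

With something like the proposed `OpenRaw()` and `CompressionMethod`, the header parsing above would not need to be reimplemented outside the framework.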
My use cases are that I want to be able to use zip files (because it's a standard format) but with LZMA (significant space savings for my use case) while also being able to instantly dump these blobs into an SQLite DB (while still compressed). Another use case I have is that I want to basically use zip files as object storage behind an API, and being able to throw the compressed blobs over the wire directly would be great. This would hit multiple birds with one stone.
Having an enum that requires a third-party library to supply that compression algorithm is likely to cause confusion. At least some compression libraries add a header to the compressed stream - that being the case, if the constructor instead took something like
public interface IZipCompressionStream {
    public string CompressionMethod { get; }
    public ReadOnlySpan<byte> Header { get; }
    public Stream Compress(Stream raw);
    public bool TryDecompress(Stream compressed, out Stream raw);
    public Stream Decompress(Stream compressed);
}
... this would allow for arbitrary compression methods, including ones not currently envisioned
It is a lower level API that simply exposes more information about the underlying zip file format. Python also exposes the
Limiting the enum members to the compression methods supported by .NET today would be an option, which I suppose is closer to what Python does in this regard.
Relying on such headers is silly for zip files, since they already have a standardized 2-byte entry field for compression method. This entire |
@Clockwork-Muse I think the API should follow the standard (though which of the specified compression methods should be named members of the |
Ah, I was not aware that zip itself listed the possible methods, my bad.
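To make the point about the standardized field concrete, here is a rough sketch that reads the 2-byte compression method of every entry straight from the central directory. It assumes an archive without a trailing comment and without Zip64 records, so the end-of-central-directory record is exactly the last 22 bytes. `BuildZip` and `ReadCentralDirectory` are hypothetical helper names; the offsets come from the central directory layout in APPNOTE.TXT.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

byte[] zip = BuildZip(("a.txt", "aaaaaaaaaaaaaaaa"), ("b.txt", "bbbbbbbbbbbbbbbb"));
var entries = ReadCentralDirectory(zip);
foreach (var e in entries)
    Console.WriteLine($"{e.Name}: method {e.Method}, {e.CompressedSize} compressed bytes");

// Hypothetical helper: lists name, method and compressed size for each entry.
static List<(string Name, ushort Method, uint CompressedSize)> ReadCentralDirectory(byte[] zip)
{
    // Assumption: no archive comment, so the EOCD record is the final 22 bytes.
    int eocd = zip.Length - 22;
    if (eocd < 0 || BinaryPrimitives.ReadUInt32LittleEndian(zip.AsSpan(eocd)) != 0x06054b50)
        throw new InvalidDataException("End of central directory record not found.");

    int count = BinaryPrimitives.ReadUInt16LittleEndian(zip.AsSpan(eocd + 10));    // total entries
    int pos = (int)BinaryPrimitives.ReadUInt32LittleEndian(zip.AsSpan(eocd + 16)); // directory offset

    var result = new List<(string Name, ushort Method, uint CompressedSize)>();
    for (int i = 0; i < count; i++)
    {
        ReadOnlySpan<byte> header = zip.AsSpan(pos);
        if (BinaryPrimitives.ReadUInt32LittleEndian(header) != 0x02014b50)
            throw new InvalidDataException("Bad central directory header signature.");

        ushort method = BinaryPrimitives.ReadUInt16LittleEndian(header.Slice(10));       // compression method
        uint compressedSize = BinaryPrimitives.ReadUInt32LittleEndian(header.Slice(20)); // compressed size
        int nameLength = BinaryPrimitives.ReadUInt16LittleEndian(header.Slice(28));
        int extraLength = BinaryPrimitives.ReadUInt16LittleEndian(header.Slice(30));
        int commentLength = BinaryPrimitives.ReadUInt16LittleEndian(header.Slice(32));

        // The fixed part of each header is 46 bytes, followed by the file name.
        string name = Encoding.UTF8.GetString(header.Slice(46, nameLength));
        result.Add((name, method, compressedSize));

        pos += 46 + nameLength + extraLength + commentLength;
    }
    return result;
}

// Hypothetical helper: writes the given files with the existing ZipArchive API.
static byte[] BuildZip(params (string Name, string Content)[] files)
{
    using var buffer = new MemoryStream();
    using (var archive = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
    {
        foreach (var (name, content) in files)
        {
            var entry = archive.CreateEntry(name, CompressionLevel.Optimal);
            using var writer = new StreamWriter(entry.Open());
            writer.Write(content);
        }
    }
    return buffer.ToArray();
}
```

The method value is a plain number in the file, which is why the proposal can expose it without the framework knowing how to decompress every method.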
@carlossanlop what is your take on this? Would adding such an API help to implement algorithms that are currently not supported OOTB?
Background and motivation
Right now, ZipArchive only supports opening entries compressed with Stored, Deflate and Deflate64. While there are open issues about adding support for more specified methods such as LZMA, I would like to propose an orthogonal solution to this problem.
Allow access to the raw compressed streams in the zip file, and the compression method flag in the entry.
This opens up a few possibilities:
I am far from an expert on the zip file format, but from my rudimentary understanding of it, this should be possible?
API Proposal
API Usage
Using third-party decompression streams with ZipArchive:
Copying compressed blobs between zip files:
Alternative Designs
No response
Risks
No response