Compression levels seem to have no effect on output file #254
Comments
Thanks for mentioning this, @FelipeMoser. When you compare file sizes, are you comparing the actual number of bytes in each chunk file (e.g. For example, with bioformats2raw 0.9.3 and an artificial 512x512 image, convert the same input data with 4 different compression settings:
List the size in bytes of every file in both sets of zlib output, followed by the summary size of both outputs:
List the size in bytes of every file in both sets of blosc output, followed by the summary size of both outputs:
In particular, note that You should be able to run that same test to verify, as Note too that bioformats2raw does not itself define what
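A minimal sketch of the per-file byte comparison described in this comment, assuming two converted outputs on local disk. The helper names and the directory names in the usage note are hypothetical, not part of bioformats2raw:

```python
import os


def file_sizes(root):
    """Return {relative_path: size_in_bytes} for every regular file under root."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # getsize reports the byte length of the file, not allocated disk blocks
            sizes[os.path.relpath(path, root)] = os.path.getsize(path)
    return sizes


def compare_outputs(root_a, root_b):
    """Print per-file byte counts for both outputs, then return the two totals."""
    a, b = file_sizes(root_a), file_sizes(root_b)
    for rel in sorted(set(a) | set(b)):
        print(f"{rel}\t{a.get(rel, '-')}\t{b.get(rel, '-')}")
    total_a, total_b = sum(a.values()), sum(b.values())
    print(f"total\t{total_a}\t{total_b}")
    return total_a, total_b
```

For example, `compare_outputs("zlib-1.zarr", "zlib-9.zarr")` (hypothetical directory names) prints one row per chunk or metadata file. Comparing byte lengths rather than disk usage avoids the rounding to filesystem block size that can hide small differences between chunk files.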
Thanks for the quick reply, @melissalinkert. However, in the example you showed, the overall size of the .zarr outputs is roughly the same mainly because the image is so small relative to the rest of the files. There is still a substantial difference in the block sizes (1368/2461=0.56, 2150/2634=0.91). In my case, I'm working with images that are 5-10GB, so I expected the difference to be large, since the metadata files should have no significant impact on the final size. For example, for a microscopy image of size [ 1, 3, 1, 19600, 25708 ]:
As you can see, while there is a significant difference between lvl0 and lvl1, the differences between lvl1 and lvl9 are very small. For example, the relative difference between zlib_lvl1 and zlib_lvl9 is 0.58%, 0.47%, and 0.07% for chunk sizes 512, 1025, and 5120, respectively. Is it normal for the difference between the minimal and maximal compression levels to be so small for such large images? Is there something I could be missing?
It's unfortunately pretty much impossible to make any general statement about the size of compressed data for different zlib and blosc levels. A level in these compression types does not indicate how small the compressed output will be; it indicates how much effort the compressor puts into reducing the output size. The result is therefore completely data dependent, and the actual percentage reduction in size will vary widely. In particular, zlib level and blosc clevel should not be thought of in terms of image quality or compression ratio.

You might try converting your test data without compression and then, independently of bioformats2raw, experimenting with different compression options on the uncompressed chunk files. That would let you confirm that bioformats2raw is not directly causing poor compression, and would let you try a wider variety of parameters more quickly; the chosen parameters can then be fed back to bioformats2raw for subsequent conversions. https://github.com/Blosc/bloscpack and/or https://github.com/madler/zlib may be places to start, and the following may also be helpful reading:
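As a rough sketch of the experiment suggested in this comment, the snippet below compresses the raw bytes of one uncompressed chunk file at every zlib level using Python's standard `zlib` module and reports the resulting sizes. The function names and the chunk path in the usage note are hypothetical; blosc is not covered here because it needs a third-party package.

```python
import zlib


def sizes_by_level(data, levels=range(0, 10)):
    """Return {level: compressed_size_in_bytes} for the given raw bytes."""
    return {level: len(zlib.compress(data, level)) for level in levels}


def report(path):
    """Compress one uncompressed chunk file at each zlib level and print the sizes."""
    with open(path, "rb") as f:
        raw = f.read()
    if not raw:
        print("empty file, nothing to compress")
        return
    for level, size in sizes_by_level(raw).items():
        print(f"level={level}  bytes={size}  ratio={size / len(raw):.3f}")
```

Calling `report("uncompressed.zarr/0/0/0/0/0/0/0")` (a hypothetical chunk path) on a few representative chunks shows how much the level matters for that particular data. Level 0 stores the data without compression, so it is the natural baseline; how far levels 1 through 9 differ from each other depends entirely on the input.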
Since we've confirmed that
Hi, I've been playing around with bioformats2raw for a bit and I wanted to compare the different compression options available. I have some ome.tiffs that I'm converting to zarr.
However, I've noticed that the "--compression-properties" argument seems to have no effect on the compression itself outside of the values stored in the .zarray file.
For example, if I set --compression=zlib and --compression-properties="level=1", the file size is exactly the same as what I get if I set --compression-properties="level=9". Similarly, using the default compression with --compression-properties="clevel=1" or --compression-properties="clevel=9" results in the exact same file size. There's also no difference in computation time. In both cases, however, the .zarray file does change accordingly.
I'm using bioformats2raw version 0.9.1.
[edit] Adding to clarify, if I use zlib, the resulting file size is different than what I get if I use default (blosc). So the --compression argument does seem to work. The issue is only related to "--compression-properties".
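To check which settings were actually recorded for a given conversion, the compressor entry can be read back from the array metadata. This is a minimal sketch, assuming the output follows the Zarr v2 layout in which `.zarray` is a JSON file with a `compressor` key; the function name and the path in the usage note are hypothetical.

```python
import json


def compressor_settings(zarray_path):
    """Return the "compressor" entry of a Zarr v2 .zarray file, or None if absent."""
    with open(zarray_path, "r", encoding="utf-8") as f:
        metadata = json.load(f)
    return metadata.get("compressor")
```

For instance, `compressor_settings("output.zarr/0/0/.zarray")` (a hypothetical path) returns a dictionary containing the compressor `id` together with whatever properties were written. This confirms only what the metadata says, not how well the chunks were compressed.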