Optimize netcdf chunk size and compression #433

Open
andrrizzi opened this issue Jul 21, 2019 · 1 comment

@andrrizzi (Contributor) commented Jul 21, 2019

See also #422 and choderalab/yank#1157.

I did a little investigation on chunk size and compression in netcdf. Here are a few thoughts.

Tl;dr

We should not always set the chunk size along the iteration dimension to 1; instead, we should implement a heuristic (a sketch follows this list) to make sure that

  • The chunk size is not too small w.r.t. the file system block size (16 KB on lilac).
  • A warning is raised if the chunk size of a variable is greater than its chunk cache.
  • When we want zlib compression, the chunk size is at least 64 KB, or very little compression is achieved.
  • For the solute trajectory, we can set least_significant_digit to truncate precision, which helps zlib and saves a lot of space and time.
  • Keep in mind that HDF5/netcdf parallel writing supports neither compression nor chunk caching (see also Parallelize storage reading/writing #429).
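A minimal sketch of what such a heuristic could look like (the function name, defaults, and example shapes below are assumptions for illustration, not existing code in this repository):

```python
import numpy as np

def choose_iteration_chunksize(per_iteration_shape, dtype, zlib=False,
                               min_chunk_bytes=16 * 1024,
                               zlib_min_chunk_bytes=64 * 1024):
    """Return a chunksizes tuple whose first entry is the number of iterations
    grouped into one chunk, chosen so that a single chunk is at least as large
    as the file system (sub)block, or the assumed zlib window when compressing."""
    target = zlib_min_chunk_bytes if zlib else min_chunk_bytes
    bytes_per_iteration = int(np.prod(per_iteration_shape, dtype=np.int64)) * np.dtype(dtype).itemsize
    n_iterations = max(1, -(-target // bytes_per_iteration))  # ceiling division
    return (n_iterations,) + tuple(per_iteration_shape)

# Example: 'box_vectors' stores a (n_replicas, 3, 3) float64 array per iteration.
# With n_replicas=10 that is 720 B/iteration, so ~23 iterations fit in a 16 KB chunk.
print(choose_iteration_chunksize((10, 3, 3), 'f8'))             # (23, 10, 3, 3)
print(choose_iteration_chunksize((10, 3, 3), 'f8', zlib=True))  # (92, 10, 3, 3)
```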

More details

On lilac the minimum block size that can be allocated (or subblock in GPFS [1,2]) is 16 KB (you can check it with mmlsfs all -f [3]), which means that chunk sizes smaller than 16 KB don't make much sense [4]. This can end up wasting hundreds of MB (and the time required to read/write them) for small variables like "box_vectors", "volumes", and "replica_thermodynamic_state" when we have thousands of iterations, if we leave the chunk size along the iteration dimension at 1. In principle, os.statvfs(os.getcwd()).f_frsize gives you the disk page size; in practice this doesn't work on GPFS, because it returns the block size instead of the subblock size [1,2].
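For reference, this is how one can inspect what the file system reports in Python; as noted above, on GPFS this gives the block size rather than the 16 KB subblock, so the value from mmlsfs still has to be hard-coded or configured:

```python
import os

# f_frsize is the fragment/fundamental block size, f_bsize the preferred I/O block size.
# On GPFS both report the (large) block size, not the 16 KB subblock used for allocation,
# so they cannot be used directly as the minimum chunk size on lilac.
st = os.statvfs(os.getcwd())
print('f_frsize:', st.f_frsize, 'bytes')
print('f_bsize: ', st.f_bsize, 'bytes')
```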

The default sliding window of zlib is 64 KB, which means we won't be able to save much space with chunks smaller than that (compression is done per chunk). The solute trajectory, which has only tens or at most a few hundred atoms, is another case where a chunk size along the iteration dimension greater than 1 can help save disk space (and thus reading/writing time). We can also do something similar to xtc and reduce the precision of the solute-only trajectory by setting least_significant_digit to picometer precision. This helps zlib a lot [5]. However, with parallel I/O, zlib compression can't be used for writing [6].
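As an illustration, a netCDF4-python sketch of a solute-only positions variable with per-chunk zlib compression and truncated precision (the dimension sizes and the 50-iteration chunk length are placeholders; least_significant_digit=3 assumes positions stored in nanometers, i.e. roughly picometer precision):

```python
import netCDF4

nc = netCDF4.Dataset('solute_trajectory.nc', 'w', format='NETCDF4')
nc.createDimension('iteration', None)   # unlimited
nc.createDimension('atom', 150)         # solute only: tens to a few hundred atoms
nc.createDimension('spatial', 3)

# 50 iterations x 150 atoms x 3 x 4 bytes ~ 88 KB per chunk, above the 64 KB target,
# so each chunk gives zlib enough data to compress effectively.
positions = nc.createVariable(
    'positions', 'f4', ('iteration', 'atom', 'spatial'),
    zlib=True, complevel=4, shuffle=True,
    chunksizes=(50, 150, 3),
    least_significant_digit=3,  # nm units -> ~1 pm precision, similar to xtc
)
```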

I believe the fact that the chunk size for the trajectory was much bigger than the chunk cache was at the root of our performance problem. With a large enough cache, and probably without forcing a sync, netcdf should not flush the positions at every iteration. However, we need to keep in mind that with parallel netcdf the chunk cache is deactivated [7].
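netCDF4-python exposes the per-variable chunk cache through get_var_chunk_cache/set_var_chunk_cache, so the warning suggested in the summary could be sketched roughly like this (the helper name and threshold logic are assumptions, not existing code):

```python
import warnings
import numpy as np

def warn_if_chunk_exceeds_cache(variable):
    """Warn when one chunk of a netCDF4.Variable is larger than its chunk cache,
    since HDF5 then has to evict and re-read whole chunks on every partial access."""
    chunking = variable.chunking()
    if chunking == 'contiguous':
        return
    cache_size, _, _ = variable.get_var_chunk_cache()
    chunk_bytes = int(np.prod(chunking, dtype=np.int64)) * variable.dtype.itemsize
    if chunk_bytes > cache_size:
        warnings.warn(f'A chunk of "{variable.name}" is {chunk_bytes} B but its chunk '
                      f'cache is only {cache_size} B; consider calling '
                      f'set_var_chunk_cache(size=...) or using smaller chunks.')
```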

Links

[1] https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.ins.doc/bl1ins_fsblksz.htm
[2] https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.ins.doc/bl1ins_frags.htm
[3] https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.3/com.ibm.spectrum.scale.v5r03.doc/bl1adm_listfsa.htm
[4] https://www.unidata.ucar.edu/blogs/developer/en/entry/chunking_data_choosing_shapes
[5] https://unidata.github.io/netcdf4-python/netCDF4/index.html
[6] https://www.unidata.ucar.edu/software/netcdf/workshops/2010/nc4chunking/CompressionAdvice.html
[7] https://www.unidata.ucar.edu/software/netcdf/docs/netcdf_perf_chunking.html

@jchodera (Member) commented:

This all sounds great, @andrrizzi!
