Empty unlimited chunked variables cause crash #67
@@ -480,6 +480,10 @@ def _get_contiguous_data(self, property_offset):

    def _get_chunked_data(self, offset):
        """ Return data which is chunked. """
I think this will work - it is also possible to test whether the chunk address is UNDEFINED_ADDRESS, which happens when no data has been written to the Dataset yet (see the change in usnistgov/jsfive@f228420, which I should have backported to pyfive).
EDIT - I think the test for UNDEFINED_ADDRESS may be important here, because you sometimes encounter datasets that have non-zero shapes but have not been written yet (initializing a dataset and writing data to it are two separate steps).
That seems sensible. At this point I'm not minded to follow through fixing this here, as where it gets done in the new H5D.py will be slightly different. What I have there is:
    # look out for an empty dataset, which will have no btree
    if np.prod(self.shape) == 0 or dataobject._chunk_address == UNDEFINED_ADDRESS:
        self._index = {}
        return
(This is in the context of caching the b-tree when we instantiate a DatasetID, which we do when we create a variable instance with e.g. `x=myfile['variable']`. We do it at this point so that all threads in a thread pool have their b-tree before they start on their share of the work.)
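As a rough standalone illustration of the two conditions discussed in this thread, the guard can be sketched without pyfive at all. This is not the code in H5D.py: `build_chunk_index` and `read_btree` are hypothetical names, and the value of `UNDEFINED_ADDRESS` assumes 8-byte file offsets with every bit set.

```python
import math
import struct

# Assumption: 8-byte file offsets, where an undefined address has all bits set.
UNDEFINED_ADDRESS = struct.unpack('<Q', b'\xff' * 8)[0]


def build_chunk_index(shape, chunk_address, read_btree):
    """Return a chunk index, or an empty dict when there is no b-tree to read.

    `read_btree` is a hypothetical callable standing in for the real
    b-tree reader; it is only called when a b-tree is expected to exist.
    """
    # Case 1: zero-sized dataset, e.g. an unlimited dimension of length 0.
    if math.prod(shape) == 0:
        return {}
    # Case 2: non-zero shape, but no data has been written yet.
    if chunk_address == UNDEFINED_ADDRESS:
        return {}
    return read_btree(chunk_address)


def fake_reader(address):
    """Stand-in reader that pretends one chunk lives at `address`."""
    return {(0, 0): address}


empty_shape = build_chunk_index((0, 10), 4096, fake_reader)
unwritten = build_chunk_index((5, 10), UNDEFINED_ADDRESS, fake_reader)
populated = build_chunk_index((5, 10), 4096, fake_reader)
```

Checking the shape first means the address is never consulted for a zero-sized dataset, and the reader is never handed an undefined address.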
If one has created a variable which is intended to be chunked but is currently empty, then reading the file produces a stack dump that ends with this:
This pull request includes code to create a file which manifests the problem, a test to expose it, and a fix.