Move written flag from builder to HDF5IO, add tests #381
src/hdmf/backends/hdf5/h5tools.py
    @@ -884,10 +913,10 @@ def _filler():
            # NOTE: we can ignore options['io_settings'] for scalar data
        elif self.__is_ref(options['dtype']):
            _dtype = self.__dtypes.get(options['dtype'])
            self.__set_written(builder)
All branches of the if statement below had `builder.written = True`, so I moved it to before the if statement.
The written flag should be set after the if/elif statements since that is when the state is modified. E.g., if an error occurs in `require_dataset`, then the builder is falsely recorded as written.
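The ordering concern can be illustrated with a toy writer. This is only a sketch: `ToyWriter`, `_create`, and the dict-based builders are hypothetical stand-ins, not HDMF's API. The flag is recorded only after the state-modifying call has succeeded, so a failure leaves the builder unmarked.

```python
class ToyWriter:
    """Hypothetical stand-in for an IO backend; not the HDMF implementation."""

    def __init__(self):
        self._written = {}  # maps id(builder) -> True

    def get_written(self, builder):
        return self._written.get(id(builder), False)

    def _create(self, builder):
        # Stand-in for a call such as require_dataset that may raise.
        if builder.get("fail"):
            raise ValueError("could not create dataset")

    def write(self, builder):
        self._create(builder)              # may raise
        self._written[id(builder)] = True  # reached only on success


writer = ToyWriter()
good = {"name": "ok"}
bad = {"name": "broken", "fail": True}
writer.write(good)
try:
    writer.write(bad)
except ValueError:
    pass  # the failed builder must not be marked as written
```

Had the flag been set before `_create`, `bad` would be reported as written even though nothing was created for it.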
Ah, that makes sense.
Codecov Report
    @@            Coverage Diff            @@
    ##           hdmf_2.0     #381   +/-   ##
    =========================================
      Coverage     74.55%   74.55%
    =========================================
      Files            33       33
      Lines          6633     6633
      Branches       1447     1446     -1
    =========================================
      Hits           4945     4945
      Misses         1279     1279
      Partials        409      409
Also consider the use case where a user writes an in-memory container to a file, then adds to the in-memory container, then writes the updated in-memory container to the same file using the following code:

    def test_write_append_unwritten(self):
        """Test writing a container, adding to the in-memory container, then writing it again in append mode."""
        with HDF5IO(self.path, manager=self.manager, mode='w') as io:
            io.write(self.foofile)
        # append new container
        foo3 = Foo('foo3', [10, 20], "I am foo3", 2, 0.1)
        new_bucket1 = FooBucket('new_bucket1', [foo3])
        self.foofile.buckets.append(new_bucket1)
        new_bucket1.parent = self.foofile
        # write to same file with same manager in APPEND mode
        with HDF5IO(self.path, manager=self.manager, mode='a') as io:
            io.write(self.foofile)

Previously this would succeed because the builder for

Although possibly confusing, I think this is OK behavior. Users should NOT use the above code to write an in-memory container and then append to it. Instead, they should do one of the following:
            builder_id = self.__builderhash(builder)
            return self._written_builders.get(builder_id, False)

        def __builderhash(self, obj):
The same function is implemented here:
hdmf/src/hdmf/build/manager.py
Lines 193 to 194 in 3fa0587
    def __bldrhash__(self, obj):
        return id(obj)

Not sure if it should be generalized by just adding `__hash__` to `Builder`. It seems simple enough that it's not a big deal to leave as is, but I thought I would mention it.
Yeah, that's where I got it from. I figured there might have been a reason that it was on the `BuildManager` rather than on `Builder` in the first place, e.g., maybe builders should by default have a hash that is based on their contents, so I left it and re-implemented it here.
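The trade-off raised in this thread can be sketched in a few lines. `ToyBuilder` is a hypothetical class, not HDMF's `Builder`; it only shows how identity-based keys (what `id(obj)` gives) differ from a contents-based `__hash__`.

```python
class ToyBuilder:
    """Hypothetical builder whose equality and hash depend on its contents."""

    def __init__(self, name, data):
        self.name = name
        self.data = tuple(data)

    def __eq__(self, other):
        return (isinstance(other, ToyBuilder)
                and (self.name, self.data) == (other.name, other.data))

    def __hash__(self):
        return hash((self.name, self.data))


a = ToyBuilder("foo", [1, 2])
b = ToyBuilder("foo", [1, 2])  # equal contents, different object

by_identity = {id(a): True}  # keyed the way an id(obj)-based helper keys
by_contents = {a: True}      # keyed by the contents-based __hash__

seen_by_identity = by_identity.get(id(b), False)
seen_by_contents = by_contents.get(b, False)
```

With identity keys, `b` counts as unwritten even though its contents match `a`; with a contents-based hash it would be treated as already written. Which behavior is wanted depends on the use case, which may be one reason to keep the identity helper outside `Builder`.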
Note that since this technically includes a breaking change, this PR is being merged to a branch for HDMF 2.0, not dev.
Motivation
Required for the export PR: #326.
Currently, `Builder` objects have a field `written` that is used only by `HDF5IO` to mark whether the `Builder` object has been written to disk, so that it does not get written multiple times in case of links and references. `Builder` objects should carry as little state as possible; whether a builder has been written is really information that belongs in the IO object. Refactoring to make `HDF5IO` track whether a builder has been written by that object will allow us to use a different `HDF5IO` instance to write the same builder to a different file.

Note:
- `HDF5IO.__set_written` is private because no other class should be able to change whether a builder has been written by this IO object.
- `HDF5IO.get_written` is public because other backends and code may want to know whether a builder has been written to disk, but this could be made private.
- `test_double_cache_spec` with a new set of tests for writing/appending to check that the written flag is used correctly.

Close #380.
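The per-IO tracking described in the motivation can be sketched roughly as follows. `TrackingIO`, `write_count`, and the dict used as a builder are hypothetical and only illustrate the idea; the real `HDF5IO` does considerably more.

```python
class TrackingIO:
    """Hypothetical IO object that tracks which builders it has written."""

    def __init__(self, path):
        self.path = path
        self._written_builders = {}  # maps id(builder) -> True
        self.write_count = 0

    def get_written(self, builder):
        return self._written_builders.get(id(builder), False)

    def write(self, builder):
        if self.get_written(builder):
            return  # already written by this IO object; skip
        self.write_count += 1  # stand-in for the real write to self.path
        self._written_builders[id(builder)] = True


builder = {"name": "root"}
io_a = TrackingIO("a.h5")
io_b = TrackingIO("b.h5")
io_a.write(builder)
io_a.write(builder)  # second call is skipped by io_a
io_b.write(builder)  # a different IO object still writes it
```

Because the state lives on the IO object rather than on the builder, `io_b` is unaffected by what `io_a` has already written.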
How to test the behavior?
See tests.
Checklist
`flake8` from the source directory.