Joint BIDS-NWB metadata extraction. #1183

TheChymera · 2023-01-03T20:27:52Z

Closes: #1172

codecov · 2023-01-03T20:33:10Z

Codecov Report

Base: 89.08% // Head: 89.16% // Increases project coverage by +0.07% 🎉

Coverage data is based on head (4aca649) compared to base (d958555).
Patch coverage: 81.81% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1183      +/-   ##
==========================================
+ Coverage   89.08%   89.16%   +0.07%     
==========================================
  Files          76       76              
  Lines        9448     9469      +21     
==========================================
+ Hits         8417     8443      +26     
+ Misses       1031     1026       -5

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `89.16% <81.81%> (+0.07%)` | ⬆️ |


| Impacted Files | Coverage Δ | |
|---|---|---|
| dandi/tests/fixtures.py | `97.59% <40.00%> (-1.01%)` | ⬇️ |
| dandi/metadata.py | `87.55% <83.33%> (+0.08%)` | ⬆️ |
| dandi/files/bids.py | `97.47% <100.00%> (+2.52%)` | ⬆️ |
| dandi/tests/test_metadata.py | `100.00% <100.00%> (ø)` | |
| dandi/tests/test_files.py | `100.00% <0.00%> (ø)` | |
| dandi/files/bases.py | `78.72% <0.00%> (+1.41%)` | ⬆️ |
| dandi/support/threaded_walk.py | `94.82% <0.00%> (+1.72%)` | ⬆️ |


TheChymera · 2023-01-04T00:29:07Z

@yarikoptic any idea why this test requires Docker? Does it need some upload function to generate the SampleDandiset? (I have Docker, of course; it's just very surprising that this test triggers it, or that the test gets skipped if I suspend the service.)

(dev) [deco]~/src/dandi-cli ❱ pytest -vvs dandi/tests/test_metadata.py::test_bids_nwb_metadata_integration
=========================================================== test session starts ============================================================
platform linux -- Python 3.10.9, pytest-7.2.0, pluggy-1.0.0 -- /usr/bin/python3.10
cachedir: .pytest_cache
rootdir: /home/chymera/src/dandi-cli, configfile: tox.ini
plugins: pkgcore-0.12.18, mock-3.10.0
collected 1 item

dandi/tests/test_metadata.py::test_bids_nwb_metadata_integration Cloning into '/tmp/pytest-of-chymera/pytest-6/gitrepo0'...
remote: Enumerating objects: 2490, done.
remote: Counting objects: 100% (2490/2490), done.
remote: Compressing objects: 100% (1786/1786), done.
remote: Total 2490 (delta 541), reused 1899 (delta 364), pack-reused 0
Receiving objects: 100% (2490/2490), 371.26 KiB | 1.66 MiB/s, done.
Resolving deltas: 100% (541/541), done.
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 5 (delta 0), reused 3 (delta 0), pack-reused 0
Receiving objects: 100% (5/5), 7.57 KiB | 7.57 MiB/s, done.
remote: Enumerating objects: 142, done.
remote: Counting objects: 100% (142/142), done.
remote: Compressing objects: 100% (97/97), done.
remote: Total 142 (delta 38), reused 121 (delta 36), pack-reused 0
Receiving objects: 100% (142/142), 361.83 KiB | 2.62 MiB/s, done.
Resolving deltas: 100% (38/38), done.
errors pretty printing info
SKIPPED (docker engine not running)

=========================================================== slowest 10 durations ===========================================================
3.09s setup    dandi/tests/test_metadata.py::test_bids_nwb_metadata_integration
0.00s teardown dandi/tests/test_metadata.py::test_bids_nwb_metadata_integration
============================================================ 1 skipped in 3.18s ============================================================
(dev) [deco]~/src/dandi-cli/ ❱ ag bids_nwb_dandiset -B 2 -A 3 dandi/tests/test_metadata.py
52-
53-
54:def test_bids_nwb_metadata_integration(bids_nwb_dandiset: SampleDandiset) -> None:
55:    metadata = get_metadata(bids_nwb_dandiset)
56-    print(metadata)
57-
58-

dandi/tests/fixtures.py

TheChymera · 2023-01-06T06:39:53Z

Hm, so I'm hitting a digest issue again. Perhaps this PR could cautiously be extended to cover #1178 as well, since it might require sorting that bit out...

@jwodder sorry to ping you again, but I might benefit from your insight on this. Did I make any obvious mistake in d99891f? I'm basically trying to generate a fake digest, as per the code used so far in dandi-cli/dandi/cli/cmd_ls.py (line 350 in 9aae3bf): `if use_fake_digest:`.

Alternatively, and in the context of the overall digest discussion, is there any way to just pull the plug on it in a more comprehensive manner? And if so would it be at all advisable?

jwodder · 2023-01-06T14:07:10Z

@TheChymera Print out the value of metadata before this line and report back.

> is there any way to just pull the plug on it in a more comprehensive manner? And if so would it be at all advisable?

I'm not sure what you mean. dandi-schema requires a digest in order to construct a BareAsset. If you want to change that, talk to Satra.
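The requirement described here can be illustrated with a small stand-alone model. This is only a sketch using the standard library: `ToyBareAsset` and `placeholder_etag` are hypothetical names, not dandi-schema's classes, and the etag pattern (32 hex characters, a dash, a part count) is an assumption inferred from the placeholder value printed in the test log in this thread.

```python
import re
from dataclasses import dataclass, field
from typing import Dict

# Assumed shape of a dandi-etag: 32 hex characters, a dash, a part count.
ETAG_PATTERN = re.compile(r"^[0-9a-f]{32}-[1-9][0-9]*$")


def placeholder_etag() -> str:
    """Return an all-zero etag, for cases where no real checksum is needed."""
    return "0" * 32 + "-1"


@dataclass
class ToyBareAsset:
    """Hypothetical stand-in for an asset model that insists on a digest."""

    path: str
    digest: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject construction when the etag is missing or malformed.
        etag = self.digest.get("dandi:dandi-etag")
        if etag is None or not ETAG_PATTERN.match(etag):
            raise ValueError("A non-zarr asset must have a dandi-etag.")


# Construction succeeds when a well-formed (even if fake) etag is supplied.
ok = ToyBareAsset("sub-01_ieeg.nwb", {"dandi:dandi-etag": placeholder_etag()})

# Construction fails when the digest mapping is left empty.
try:
    ToyBareAsset("sub-01_ieeg.nwb")
    rejected = False
except ValueError:
    rejected = True
```

In this toy model, a fake digest is enough to satisfy the check, which is why generating one can stand in for computing a real checksum in tests.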

TheChymera · 2023-01-10T16:47:00Z

@jwodder

(dev) [deco]~/src/dandi-cli ❱ pytest -vvs dandi/tests/test_metadata.py::test_bids_nwb_metadata_integration
=========================================================== test session starts ============================================================
platform linux -- Python 3.10.9, pytest-7.2.0, pluggy-1.0.0 -- /usr/bin/python3.10
cachedir: .pytest_cache
rootdir: /home/chymera/src/dandi-cli, configfile: tox.ini
plugins: pkgcore-0.12.18, mock-3.10.0
collected 1 item

dandi/tests/test_metadata.py::test_bids_nwb_metadata_integration Cloning into '/tmp/pytest-of-chymera/pytest-9/gitrepo0'...
remote: Enumerating objects: 2490, done.
remote: Counting objects: 100% (2490/2490), done.
remote: Compressing objects: 100% (1786/1786), done.
remote: Total 2490 (delta 541), reused 1898 (delta 364), pack-reused 0
Receiving objects: 100% (2490/2490), 371.24 KiB | 1.39 MiB/s, done.
Resolving deltas: 100% (541/541), done.
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 5 (delta 0), reused 3 (delta 0), pack-reused 0
Receiving objects: 100% (5/5), 7.57 KiB | 7.57 MiB/s, done.
remote: Enumerating objects: 142, done.
remote: Counting objects: 100% (142/142), done.
remote: Compressing objects: 100% (97/97), done.
remote: Total 142 (delta 38), reused 121 (delta 36), pack-reused 0
Receiving objects: 100% (142/142), 361.83 KiB | 2.68 MiB/s, done.
Resolving deltas: 100% (38/38), done.
None
AAAAAAAAAAAAAAAAAAAAAAAA
Digest(algorithm=<DigestType.dandi_etag: 'dandi:dandi-etag'>, value='00000000000000000000000000000000-1')
{'schemaKey': 'Asset', 'schemaVersion': '0.6.3', 'access': [{'schemaKey': 'AccessRequirements', 'status': 'dandi:OpenAccess'}], 'wasGeneratedBy': [{'schemaKey': 'Session', 'identifier': 'postimp', 'name': 'postimp'}, Activity(id='urn:uuid:955453a3-1239-4ef4-841b-71878266f56f', schemaKey='Activity', identifier=None, name='Metadata generation', description='Metadata generated by DANDI cli', startDate=None, endDate=None, wasAssociatedWith=[Software(id=None, schemaKey='Software', identifier='RRID:SCR_019009', name='DANDI Command Line Interface', version='0.48.0+10.gd99891f.dirty', url=HttpUrl('https://github.com/dandi/dandi-cli', ))], used=None)], 'wasAttributedTo': [{'schemaKey': 'Participant', 'identifier': '01'}], 'digest': {}, 'dateModified': datetime.datetime(2023, 1, 10, 11, 45, 18, 11343, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=68400), 'EST')), 'blobDateModified': datetime.datetime(2023, 1, 10, 11, 45, 16, 932084, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=68400), 'EST')), 'contentSize': 19664, 'encodingFormat': 'application/octet-stream', 'path': 'sub-01_ses-postimp_task-seizure_run-01_ieeg.nwb'}
FAILED

================================================================= FAILURES =================================================================
____________________________________________________ test_bids_nwb_metadata_integration ____________________________________________________
dandi/tests/test_metadata.py:66: in test_bids_nwb_metadata_integration
    metadata = get_metadata(file_path)
/usr/lib/python3.10/site-packages/fscacher/cache.py:152: in fingerprinter
    ret = fingerprinted(*args, **kwargs_)
/usr/lib/python3.10/site-packages/joblib/memory.py:594: in __call__
    return self._cached_call(args, kwargs)[0]
/usr/lib/python3.10/site-packages/joblib/memory.py:537: in _cached_call
    out, metadata = self.call(*args, **kwargs)
/usr/lib/python3.10/site-packages/joblib/memory.py:779: in call
    output = self.func(*args, **kwargs)
/usr/lib/python3.10/site-packages/fscacher/cache.py:98: in fingerprinted
    return f(path, *args, **kwargs)
dandi/metadata.py:103: in get_metadata
    path_metadata = df.get_metadata(digest=digest)
dandi/files/bids.py:230: in get_metadata
    bids_metadata = BIDSAsset.get_metadata(self)
dandi/files/bids.py:201: in get_metadata
    return BareAsset(**metadata)
pydantic/main.py:342: in pydantic.main.BaseModel.__init__
    ???
E   pydantic.error_wrappers.ValidationError: 1 validation error for BareAsset
E   digest
E     A non-zarr asset must have a dandi-etag. (type=value_error)
------------------------------------------------------------ Captured log call -------------------------------------------------------------
WARNING  bids-schema:validator.py:594 BIDSVersion `1.7.0` is less than the minimal working `schema`. Falling back to `schema`. To force the usage of earlier versions specify them explicitly when calling the validator.
=========================================================== slowest 10 durations ===========================================================
2.23s setup    dandi/tests/test_metadata.py::test_bids_nwb_metadata_integration
1.06s call     dandi/tests/test_metadata.py::test_bids_nwb_metadata_integration
0.00s teardown dandi/tests/test_metadata.py::test_bids_nwb_metadata_integration
========================================================= short test summary info ==========================================================
FAILED dandi/tests/test_metadata.py::test_bids_nwb_metadata_integration - pydantic.error_wrappers.ValidationError: 1 validation error for BareAsset
============================================================ 1 failed in 3.55s =============================================================
(dev) [deco]~/src/dandi-cli ❱ git rev-parse HEAD
67a63a2e8d1333766c0ca7d589a1816531c5e5b1

It appears we do indeed have the etag Digest(algorithm=<DigestType.dandi_etag: 'dandi:dandi-etag'>, value='00000000000000000000000000000000-1') but for some reason it's not getting recognized 🤔

jwodder · 2023-01-10T16:50:58Z

@TheChymera Pass --showlocals to pytest and report the results.

TheChymera · 2023-01-11T06:11:06Z

@jwodder https://ppb.chymera.eu/650109.log

jwodder · 2023-01-11T14:05:11Z

@TheChymera This should fix it:

diff --git a/dandi/files/bids.py b/dandi/files/bids.py
index d2238c3..3c0d910 100644
--- a/dandi/files/bids.py
+++ b/dandi/files/bids.py
@@ -226,7 +226,7 @@ class NWBBIDSAsset(BIDSAsset, NWBAsset):
         digest: Optional[Digest] = None,
         ignore_errors: bool = True,
     ) -> BareAsset:
-        bids_metadata = BIDSAsset.get_metadata(self)
+        bids_metadata = BIDSAsset.get_metadata(self, digest, ignore_errors)
         nwb_metadata = NWBAsset.get_metadata(self, digest, ignore_errors)
         return BareAsset(
             **{**bids_metadata.dict(), **nwb_metadata.dict(exclude_none=True)}
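The effect of the one-line change above can be reproduced with a stripped-down sketch. The `Toy*` classes below are hypothetical stand-ins, not the dandi-cli classes; they only model the pattern in which an explicit parent-class call omits its arguments, so the parent falls back to `digest=None`.

```python
from typing import Any, Dict, Optional


class ToyBIDSAsset:
    def get_metadata(
        self, digest: Optional[str] = None, ignore_errors: bool = True
    ) -> Dict[str, Any]:
        # With no digest supplied, the digest mapping stays empty.
        return {"digest": {"etag": digest} if digest else {}, "subject": "01"}


class ToyNWBAsset:
    def get_metadata(
        self, digest: Optional[str] = None, ignore_errors: bool = True
    ) -> Dict[str, Any]:
        return {"digest": {"etag": digest} if digest else {}, "session": "postimp"}


class ToyNWBBIDSAsset(ToyBIDSAsset, ToyNWBAsset):
    def get_metadata_before_fix(
        self, digest: Optional[str] = None, ignore_errors: bool = True
    ) -> Dict[str, Any]:
        # Explicit parent call without arguments: digest silently becomes None.
        return ToyBIDSAsset.get_metadata(self)

    def get_metadata(
        self, digest: Optional[str] = None, ignore_errors: bool = True
    ) -> Dict[str, Any]:
        # Forward the arguments to both parents, then merge the results.
        bids = ToyBIDSAsset.get_metadata(self, digest, ignore_errors)
        nwb = ToyNWBAsset.get_metadata(self, digest, ignore_errors)
        # On key collisions, the NWB values take precedence.
        return {**bids, **nwb}


asset = ToyNWBBIDSAsset()
etag = "0" * 32 + "-1"
before = asset.get_metadata_before_fix(digest=etag)
after = asset.get_metadata(digest=etag)
```

Here `before["digest"]` is empty even though the caller passed a digest, which matches the `'digest': {}` entry in the failing test output, while `after` carries the digest through both parents.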

TheChymera · 2023-01-17T13:18:03Z

Although this doesn't really touch the logic of BIDS validation, apparently this PR introduces 2 new test failures in existing tests... I tried to debug this yesterday (I have a commit littered with print calls, but it really didn't help me narrow it down).

Checking the logs, I get

2023-01-17T07:41:56-0500 [DEBUG ] dandi 4324:140070214657856 Problem obtaining metadata for asl003/sub-Sub1/anat/sub-Sub1_T1w.nii.gz: Unable to get metadata from non-BIDS, non-NWB asset.

And I have absolutely no clue why... I'll continue looking, but if you have any ideas @yarikoptic @jwodder let me know.

TheChymera · 2023-01-17T15:03:42Z

I think I figured it out.

yarikoptic

I left some suggestions, which I have yet to pursue "deeper" myself, since the semantics of what extracts which metadata seem to form an unclear stack.

dandi/metadata.py

TheChymera · 2023-01-23T19:26:17Z

@jwodder could we merge?

dandi/tests/test_metadata.py

yarikoptic

Just left one suggestion to adopt, to avoid an obscure mix of / and \ in paths on Windows. The rest is OK; let's proceed once the suggestion is adopted.
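How such a mix of separators can arise, and one way to avoid it, is sketched below. The exact suggestion made in the review is not shown in this thread, so this is only an illustration with the standard library's `pathlib`; the directory and file names are hypothetical.

```python
from pathlib import PureWindowsPath

# PureWindowsPath models Windows path behaviour on any operating system.
windows_path = PureWindowsPath("dataset") / "sub-01" / "ieeg" / "sub-01_ieeg.nwb"

# str() uses the native separator, which is a backslash on Windows.
native = str(windows_path)

# Gluing a hard-coded forward-slash suffix onto a native string mixes both.
mixed = str(PureWindowsPath("dataset") / "sub-01") + "/ieeg/sub-01_ieeg.nwb"

# as_posix() always yields forward slashes, so comparisons stay portable.
portable = windows_path.as_posix()

# Parsing the mixed string as a Windows path normalizes it again.
normalized = PureWindowsPath(mixed).as_posix()
```

Comparing `as_posix()` output (or comparing path objects rather than strings) keeps test expectations independent of the platform's separator.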

github-actions · 2023-02-10T21:46:37Z

🚀 PR was released in 0.49.0 🚀

TheChymera added the tests, BIDS, and NWB labels Jan 3, 2023

TheChymera changed the title ~~Whitelisting new BIDS-NWB dataset~~ Joint BIDS-NWB metadata extraction. Jan 3, 2023

yarikoptic reviewed Jan 4, 2023

dandi/tests/fixtures.py

TheChymera marked this pull request as ready for review January 17, 2023 16:10

TheChymera requested a review from yarikoptic January 17, 2023 16:10

yarikoptic requested changes Jan 18, 2023

dandi/metadata.py

dandi/metadata.py

dandi/metadata.py (outdated)

TheChymera added 12 commits January 23, 2023 13:54

Whitelisting new BIDS-NWB dataset

f8cf659

Trying to concatenate metadata

68fc3c7

Test fixture for combined BIDS+NWB dataset

280c1f9

Checking metadata for BIDS-NWB file specifically

1a8b2db

Generate fake digests

53c9fd1

Printing metadata (debugging)

af43da4

Fixed digest error, removed debugging print calls.

5e0c040

Renamed variable to satisfy typing

8fe8980

Debug

999851a

Better checking whether either NWB or BIDS metadata was produced

17cf50e

Removed docker-dependent variant of test

af97ad5

Removed redundant code

2a3b0e9

Returning path in error message

a438222

TheChymera force-pushed the bidsnwb branch from 7447000 to a438222 Compare January 23, 2023 18:55

TheChymera requested a review from yarikoptic January 23, 2023 19:26

yarikoptic reviewed Jan 23, 2023

dandi/tests/test_metadata.py

yarikoptic reviewed Jan 23, 2023

dandi/tests/test_metadata.py (outdated)

Testing more keys

de9061a

yarikoptic reviewed Jan 25, 2023

dandi/tests/test_metadata.py

yarikoptic reviewed Jan 25, 2023

dandi/tests/test_metadata.py (outdated)

yarikoptic approved these changes Jan 25, 2023

More windows-safety

4aca649

yarikoptic merged commit 94c862f into master Jan 30, 2023

yarikoptic deleted the bidsnwb branch January 30, 2023 21:05

github-actions bot added the released label Feb 10, 2023
