NWB metadata errors not caught by validate #780

Closed
tmchartrand opened this issue Sep 30, 2021 · 13 comments · Fixed by #783

Comments

@tmchartrand

Using version 0.27.2
We've gotten several NWB metadata related errors on upload that are not caught by running dandi validate locally beforehand:
AttributeError: 'NoneType' object has no attribute 'capitalize'
ValueError: ISO 8601 expected, but P was received
We've tracked these down or followed up on them separately, but the fundamental issue here is that dandi validate doesn't seem to actually run all of the validation that is run on upload, and so doesn't catch many errors.

@yarikoptic
Member

@jwodder let's make validation identical between validate and upload, i.e. add nwb validation to validate if we haven't done that and then use validate within upload.

jwodder added a commit that referenced this issue Oct 4, 2021
@jwodder
Member

jwodder commented Oct 4, 2021

@yarikoptic The only difference I see between upload and validate's validation is that the latter is limited to files with the .nwb extension.

@tmchartrand Did the files in question have file extensions of .nwb or something else?

@yarikoptic
Member

interesting, I will have a look as well since OP seems to suggest differently.

We should run dandischema validation for all files, but nwb validation only for .nwb files. Sooner rather than later we will also add BIDS validation for BIDS datasets (#431), so overall we should:

  • validate against the "original" data standard(s) (e.g. there could be a BIDS dataset with .nwb files, so we will probably need to validate against both NWB and BIDS)
  • validate the metadata extracted / harmonized into dandischema
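
A rough sketch of the two stages listed above, just to make the idea concrete. This is not dandi-cli code: `validate_nwb`, `validate_dandischema`, `select_validators`, and `validate_file` are hypothetical names, and both validators are empty placeholders that report no errors.

```python
from pathlib import Path


def validate_nwb(path):
    """Placeholder for validation against the NWB standard (hypothetical)."""
    return []


def validate_dandischema(path):
    """Placeholder for validating extracted metadata (hypothetical)."""
    return []


def select_validators(path):
    """Pick the checks that apply to a file."""
    validators = []
    # Stage 1: the "original" data standard, only for .nwb files.
    if Path(path).suffix == ".nwb":
        validators.append(validate_nwb)
    # Stage 2: extracted / harmonized metadata, for every file.
    validators.append(validate_dandischema)
    return validators


def validate_file(path):
    """Run every applicable check and collect the reported errors."""
    errors = []
    for validator in select_validators(path):
        errors.extend(validator(path))
    return errors
```

A BIDS check would presumably become another entry appended in `select_validators`, but how a BIDS dataset is detected is left open here.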

@jwodder
Member

jwodder commented Oct 4, 2021

@yarikoptic I've submitted a PR to make validate validate all files, not just .nwb's. I'm not clear on what exactly you're suggesting in your bullet points.

@yarikoptic
Member

> @yarikoptic I've submitted a PR to make validate validate all files, not just .nwb's.

Thank you!

> I'm not clear on what exactly you're suggesting in your bullet points.

Nothing specific ATM, just notes for near future ("sooner than later") which I will formalize in issues later on.

@satra
Member

satra commented Oct 4, 2021

i'm pretty sure these are nwb related errors (age field - i don't know where the nonetype error is coming from). i have a feeling a metadata extraction error is not being seen as invalid in dandi-cli for nwb, and perhaps not being reported during validation.
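
For illustration, here is the kind of local pre-check that would flag values like the ones behind the two messages. It is a sketch under assumptions, not what dandi-cli or dandischema do: `find_metadata_problems` is a hypothetical helper, the keys are arbitrary, and the regex is a simplified ISO 8601 duration pattern written for this example.

```python
import re

# Simplified ISO 8601 duration pattern (an assumption for this sketch):
# "P" must be followed by at least one date component or a "T" time part.
ISO8601_DURATION = re.compile(
    r"^P(?=\d|T\d)(\d+Y)?(\d+M)?(\d+W)?(\d+D)?"
    r"(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$"
)


def find_metadata_problems(metadata):
    """Return a list of human-readable problems in a metadata dict."""
    problems = []
    for key, value in metadata.items():
        if value is None:
            # A later call such as value.capitalize() would raise
            # AttributeError: 'NoneType' object has no attribute 'capitalize'
            problems.append(f"{key} is missing")
    age = metadata.get("age")
    if age is not None and not ISO8601_DURATION.match(age):
        problems.append(f"age is not an ISO 8601 duration: {age!r}")
    return problems
```

With this pattern a bare `"P"` is rejected because no component follows it, while something like `"P60D"` passes.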

yarikoptic added a commit that referenced this issue Oct 4, 2021
Give `validate` command an `--allow-any-path` option
@tmchartrand
Author

@yarikoptic as it says in the title here, these were all NWB related errors, so changing handling of other file types isn't gonna fix this!
@satra's description sounds like a possible explanation, maybe errors are caught and not reported?
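
To make the "caught and not reported" scenario concrete, a minimal sketch. Everything here is hypothetical (`extract_metadata`, `validate_lenient`, `validate_strict`, and the `label` field); it only shows how swallowing an extraction exception makes a broken record look valid, and is not a claim about how dandi-cli behaves.

```python
def extract_metadata(record):
    """Hypothetical extraction step that fails when 'label' is None."""
    return {"label": record["label"].capitalize()}


def validate_lenient(record):
    """Swallows extraction failures, so a broken record looks fine."""
    try:
        extract_metadata(record)
    except Exception:
        pass  # the error is caught here and never reported
    return []


def validate_strict(record):
    """Reports extraction failures as validation errors."""
    try:
        extract_metadata(record)
    except Exception as exc:
        return [f"{type(exc).__name__}: {exc}"]
    return []
```

For a record like `{"label": None}`, `validate_lenient` returns an empty error list while `validate_strict` returns one `AttributeError` entry, which is the mismatch between validate and upload that this issue describes.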

@yarikoptic
Member

may be ... do you have (uploaded) a sample of such a file?

@tmchartrand
Author

I think any of these files should give the error: https://gui.dandiarchive.org/#/dandiset/000043/draft/files?location=sub-Q19-26-005%2F

@yarikoptic
Member

I will reopen for now, although so far I have failed to reproduce:

lena:/tmp/000018
$> dandi validate sub-Q19-26-005/sub-Q19-26-005_ses-20190718T211030_icephys.nwb
/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.9/site-packages/hdmf/spec/namespace.py:532: UserWarning: Ignoring cached namespace 'hdmf-common' version 1.1.3 because version 1.5.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.9/site-packages/hdmf/spec/namespace.py:532: UserWarning: Ignoring cached namespace 'core' version 2.2.4 because version 2.3.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
2021-10-04 14:23:14,881 [    INFO] sub-Q19-26-005/sub-Q19-26-005_ses-20190718T211030_icephys.nwb: ok
Summary: No validation errors among 1 file(s)
2021-10-04 14:23:14,881 [    INFO] Logs saved in /home/yoh/.cache/dandi-cli/log/20211004182309Z-1244072.log
dandi validate sub-Q19-26-005/sub-Q19-26-005_ses-20190718T211030_icephys.nwb  5.84s user 0.17s system 97% cpu 6.144 total

$> DANDI_DEVEL=1 dandi upload -i dandi-staging sub-Q19-26-005/sub-Q19-26-005_ses-20190718T211030_icephys.nwb               
2021-10-04 14:23:26,433 [    INFO] Found 1 files to consider
PATH                                                SIZE    ERRORS    UPLOAD STATUS                MESSAGE
...5/sub-Q19-26-005_ses-20190718T211030_icephys.nwb 21.7 MB   0         100% done                         
Summary:                                            21.7 MB         8.6 MB/s 1 done                       
2021-10-04 14:23:28,957 [    INFO] Logs saved in /home/yoh/.cache/dandi-cli/log/20211004182324Z-1244106.log

yarikoptic reopened this Oct 4, 2021
@tmchartrand
Author

actually, on a second look, maybe the ones that failed in that experiment didn't make it up. i suppose i can try to email you one of the failures, but really any NWB with a metadata issue should suffice to reproduce this i think.

@yarikoptic
Member

might be might be... if you could email me (debian at onerussian.com) -- would be appreciated.

jwodder added the blocked and awaiting-user-response labels and removed the blocked label Jan 11, 2022
@yarikoptic
Member

the problematic one @tmchartrand kindly shared is available from http://www.onerussian.com/tmp/Q19.26.005.2A.01.03.03_dandi.nwb

here is my entire protocol (dirty) of trying to reproduce -- still managed to upload.
(git)lena:/tmp[master]git
$> dandi download --download dandiset.yaml https://gui-staging.dandiarchive.org/#/dandiset/000018
PATH                 SIZE DONE    DONE% CHECKSUM STATUS MESSAGE   
dandiset.yaml                                    done   updated   
Summary:                  0 Bytes                1 done 1 updated 
                          <0.00%                                  
2022-03-25 16:42:05,751 [    INFO] Logs saved in /home/yoh/.cache/dandi-cli/log/20220325204203Z-842925.log
(dev3) 1 15079.....................................:Fri 25 Mar 2022 04:42:05 PM EDT:.
(git)lena:/tmp[master]git
$> cd 000018
dandiset.yaml
(dev3) 1 15080.....................................:Fri 25 Mar 2022 04:42:09 PM EDT:.
(git)lena:/tmp[master]000018
$> wget http://www.onerussian.com/tmp/Q19.26.005.2A.01.03.03_dandi.nwb
--2022-03-25 16:42:15--  http://www.onerussian.com/tmp/Q19.26.005.2A.01.03.03_dandi.nwb
Resolving www.onerussian.com (www.onerussian.com)... 129.170.30.229
Connecting to www.onerussian.com (www.onerussian.com)|129.170.30.229|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 21804920 (21M) [text/plain]
Saving to: ‘Q19.26.005.2A.01.03.03_dandi.nwb’

Q19.26.005.2A.01.03.03_dandi.nwb      100%[========================================================================>]  20.79M  4.70MB/s    in 4.5s    

2022-03-25 16:42:20 (4.65 MB/s) - ‘Q19.26.005.2A.01.03.03_dandi.nwb’ saved [21804920/21804920]

(dev3) 1 15081.....................................:Fri 25 Mar 2022 04:42:20 PM EDT:.
(git)lena:/tmp[master]000018
$> dandi --version
0.37.0
(dev3) 1 15082.....................................:Fri 25 Mar 2022 04:42:23 PM EDT:.
(git)lena:/tmp[master]000018
$> dandi validate Q19.26.005.2A.01.03.03_dandi.nwb 
/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.9/site-packages/hdmf/spec/namespace.py:532: UserWarning: Ignoring cached namespace 'hdmf-common' version 1.1.3 because version 1.5.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.9/site-packages/hdmf/spec/namespace.py:532: UserWarning: Ignoring cached namespace 'core' version 2.2.5 because version 2.4.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
2022-03-25 16:42:33,180 [    INFO] Q19.26.005.2A.01.03.03_dandi.nwb: ok
Summary: No validation errors among 1 file(s)
2022-03-25 16:42:33,180 [    INFO] Logs saved in /home/yoh/.cache/dandi-cli/log/20220325204228Z-843116.log
dandi validate Q19.26.005.2A.01.03.03_dandi.nwb  5.51s user 1.02s system 117% cpu 5.542 total
(dev3) 1 15083.....................................:Fri 25 Mar 2022 04:42:33 PM EDT:.
(git)lena:/tmp[master]000018
$> DANDI_DEVEL=1 dandi upload -i dandi-staging  Q19.26.005.2A.01.03.03_dandi.nwb
2022-03-25 16:43:08,749 [    INFO] Found 1 files to consider
PATH                             SIZE    ERRORS UPLOAD STATUS           MESSAGE      
Q19.26.005.2A.01.03.03_dandi.nwb 21.8 MB   0           skipped          file exists  
Summary:                         21.8 MB               1 skipped        1 file exists
2022-03-25 16:43:09,101 [    INFO] Logs saved in /home/yoh/.cache/dandi-cli/log/20220325204306Z-843295.log
(dev3) 1 15084.....................................:Fri 25 Mar 2022 04:43:09 PM EDT:.
(git)lena:/tmp[master]000018
$> dandi remove Q19.26.005.2A.01.03.03_dandi.nwb
Usage: dandi [OPTIONS] COMMAND [ARGS]...
Try 'dandi --help' for help.

Error: No such command 'remove'.
(dev3) 1 15085 ->2.....................................:Fri 25 Mar 2022 04:43:18 PM EDT:.
(git)lena:/tmp[master]000018
$> dandi detele Q19.26.005.2A.01.03.03_dandi.nwb
Usage: dandi [OPTIONS] COMMAND [ARGS]...
Try 'dandi --help' for help.

Error: No such command 'detele'.

Did you mean one of these?
    delete
    digest
(dev3) 1 15086 ->2.....................................:Fri 25 Mar 2022 04:43:23 PM EDT:.
(git)lena:/tmp[master]000018
$> dandi delete Q19.26.005.2A.01.03.03_dandi.nwb
2022-03-25 16:43:29,955 [    INFO] Logs saved in /home/yoh/.cache/dandi-cli/log/20220325204327Z-843442.log
Error: Asset at path 'Q19.26.005.2A.01.03.03_dandi.nwb' not found in Dandiset 000018
(dev3) 1 15087 ->1.....................................:Fri 25 Mar 2022 04:43:30 PM EDT:.
(git)lena:/tmp[master]000018
$> dandi delete -i dandi-staging Q19.26.005.2A.01.03.03_dandi.nwb
Delete 1 assets on server from Dandiset 000018? [y/N]: y
PATH                             STATUS     MESSAGE
Q19.26.005.2A.01.03.03_dandi.nwb Deleted           
Summary:                         1 Deleted         
2022-03-25 16:43:46,422 [    INFO] Logs saved in /home/yoh/.cache/dandi-cli/log/20220325204341Z-843548.log
(dev3) 1 15088.....................................:Fri 25 Mar 2022 04:43:46 PM EDT:.
(git)lena:/tmp[master]000018
$> DANDI_DEVEL=1 dandi upload -i dandi-staging Q19.26.005.2A.01.03.03_dandi.nwb
2022-03-25 16:43:52,791 [    INFO] Found 1 files to consider
PATH                             SIZE    ERRORS UPLOAD STATUS                MESSAGE
Q19.26.005.2A.01.03.03_dandi.nwb 21.8 MB   0           done                         
Summary:                         21.8 MB               1 done                       
2022-03-25 16:43:54,562 [    INFO] Logs saved in /home/yoh/.cache/dandi-cli/log/20220325204350Z-843601.log

Let's just assume that we fixed something ;) but if issue persists @tmchartrand with current dandi-cli 0.37.0 or later, certainly reopen!
