-
Notifications
You must be signed in to change notification settings - Fork 170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FIX] clarify that filters should be specified as object of objects #348
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine with me. As I've stated elsewhere, the only condition under which I would strenuously object is if order can ever matter, for example, if nonlinear filters can be used. If order can matter, I think the cost in fixing examples and potentially invalidating datasets is worth it.
Also worth considering whether people previously read "list" and interpreted it literally; you may be invalidating datasets that were valid under that interpretation. Checking ///openneuro
via datalad (which doesn't include all current datasets...), the only examples I see are "n/a"
, {"SpatialCompensation":{"GradientOrder":"3rd"}}
and {"SpatialCompensation":{"GradientOrder": 3}}
. So this isn't a definite concern...
I'll leave approving reviews to ephys people, who know more about these filters than I do.
Thinking about this issue once more, I am not sure if it's really a big difference to have a list or not. It's true that the order could matter but in practice, since this is not a derivatives specification yet, the software filter field is for storing the bare minimum preprocessing that has been done to make the files shareable. This could be, for instance, Maxfilter for Elekta systems and gradient compensation for CTF systems. We don't want to encourage users to hack this to share what could be considered derivative files. |
That is true, but these datasets would not have passed the validator, so these cases are confined to those were people only read the spec, and never used the validator. I interpret @jasmainak's comment above as another pro for dealing with the issue in the way proposed in this PR. Or did you mean it in another way Mainak? |
yes it is pro. Do you think though that the current phrasing introduces more flexibility in the specification? I'd prefer something like dictionary as opposed to object |
That's why I put a link there to "JSON OBJECT", where it is unambiguously defined what we expect: https://www.w3schools.com/js/js_json_objects.asp |
right but a JSON object can contain almost anything unless I'm mistaken? |
If you change to something like dictionary of dictionary, where the values of key-value pair are restricted to be strings, then you reduce the risk of breaking backwards compatibility. However, this would need a tweak to the JSON schemas in the validator. |
We seem to use "object" and "dictionary" interchangeably in Common principles.
I don't see either used elsewhere, so I don't think there's a strong precedent. I also don't think it's correct to say that "object" is more general than "dictionary", and it corresponds to a JSON notion, while "dictionary" is a Python name (although the JSON docs associate the two, presumably to help orient Python programmers), so this is a question of style, not of semantics.
Some values are numeric, so this would definitely break backwards compatibility. This level of specification feels like it's headed in an unnecessarily restrictive direction. "Object of objects"/"dictionary of dictionaries" makes the depth clear; a programmer knows they can make a two-level nested loop over entries, and get some value. It is not our job to say what all those values can or must be. If we want to be more specific about what's found in the value of certain keys, then we can specify fields using dot notation, as in Derived dataset and pipeline description: | Field | Description |
|----------------------|-------------|
| TopField.SecondLevel | ... | |
I just wanted to prevent people from hacking into this and providing the filter parameters through arbitrary nesting. The JSON schema gives straightforward ways to do this without adding any complexity to the validator code. But I'm not sure how this would work with the dot notation. Anyway, I recognize @sappelhoff just intended this PR as a clarification. So, it's fine by me as it is. |
I see your point! Perhaps we can make this as a separate improvement:
However, as said above - I would see that as separate from this PR. Can we have some more approving or challenging voices? |
Looks like everybody's happy with this. Merging. |
closes #339
In this PR, that for SoftwareFilters and HardwareFilters we expect an object of objects.
Previously, the text said we expect "a list" ... although the text thereafter and the included example showed an object of objects.
The alternative was to allow "lists of objects", which might have been a more elegant solution that would have allowed to naturally specify an order of applied temporal filters. However, as this would break backward compatibility, I vote against that change.
With the proposed object of objects, an order of filters (if required) can be indicated with additional, custom, key value pairs.