[ENH] BEP044 - Stim-BIDS #2022

Open, wants to merge 24 commits into base: master

Commits
95ebf75
Add stimuli specifications to BIDS
neuromechanist Dec 14, 2024
91df78b
edit stimuli description
neuromechanist Dec 15, 2024
0167a41
Update task events to include stim-bids spec
neuromechanist Dec 19, 2024
3c2357b
Add missing entities
neuromechanist Dec 21, 2024
b7b19df
handling GH Action errors
neuromechanist Dec 21, 2024
aa570ac
yaml-link issues
neuromechanist Dec 21, 2024
162cd1c
remark validation
neuromechanist Dec 21, 2024
1ef76bc
Move extensions to their own place.
neuromechanist Dec 21, 2024
e84cac4
Revise the extensions
neuromechanist Dec 22, 2024
1dd9083
YAML lint
neuromechanist Dec 22, 2024
50ad1dd
Merge pull request #2 from neuromechanist/add-stimuli-specifications
neuromechanist Dec 22, 2024
77f33b0
Merge branch 'bids-standard:master' into master
neuromechanist Dec 22, 2024
4978552
Sidecar description is only needed of rhte individual stimulus files.
neuromechanist Dec 31, 2024
80d3077
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 31, 2024
a335ebf
tabular information matching the spec.
neuromechanist Dec 31, 2024
f370c95
Edit stimuli.md to reflect the spec more completely
neuromechanist Dec 31, 2024
330b69f
Implement Kay's comments for clarity
neuromechanist Dec 31, 2024
ad24800
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 31, 2024
a5acd54
Revert adding BEP044 doc, implementing Kay's suggestions.
neuromechanist Dec 31, 2024
238e645
Merge remote-tracking branch 'upstream/master' into pr/neuromechanist…
Remi-Gau Jan 9, 2025
074f5fb
sty
Remi-Gau Jan 9, 2025
f197bba
remark
Remi-Gau Jan 9, 2025
02c6f31
fix
Remi-Gau Jan 9, 2025
3ae9fe1
fix
Remi-Gau Jan 9, 2025
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -20,6 +20,7 @@ nav:
- Near-Infrared Spectroscopy: modality-specific-files/near-infrared-spectroscopy.md
- Motion: modality-specific-files/motion.md
- Magnetic Resonance Spectroscopy: modality-specific-files/magnetic-resonance-spectroscopy.md
- Stimuli: modality-specific-files/stimuli.md
Collaborator

I put it there for now as the file is in the modality-specific folder, but I think this stimuli BEP should be "modality agnostic".

@oesteban

Collaborator

I agree @Remi-Gau

Member Author

@neuromechanist neuromechanist Jan 10, 2025

+1.
IMHO, the same goes for (task) events. Hopefully, if stimuli moves to the modality agnostic section, the events will also move.

- Derivatives:
- BIDS Derivatives: derivatives/introduction.md
- Common data types and metadata: derivatives/common-data-types.md
161 changes: 161 additions & 0 deletions src/modality-specific-files/stimuli.md
@@ -0,0 +1,161 @@
# Stimuli

## Stimulus Files Organization

Stimulus files MUST be stored in the `/stimuli` directory under the root directory of the dataset.
The `/stimuli` directory can contain subdirectories to organize the stimulus files.
Stimulus files MUST follow the BIDS naming conventions and are referenced in the `events.tsv`
file using the `stim_id` column.

The standardization of stimulus files and their annotations within BIDS offers several key benefits:

1. **Consistency**: Ensures uniform storage and referencing across datasets
1. **Reusability**: Enables stimulus reuse across studies through standardized structure
1. **Efficiency**: Minimizes redundancy by centralizing annotations
1. **Flexibility**: Facilitates dataset reuse with alternative annotations

To preserve backward compatibility with existing datasets (see the Legacy section below), the use of these specifications for the `/stimuli` directory and the `stim_id` column in `events.tsv` files is RECOMMENDED but not required. Researchers are encouraged to follow these guidelines to enhance the interoperability and reproducibility of their studies.

Following these guidelines will help ensure that stimulus files and their annotations are stored and referenced consistently across different datasets, facilitating data sharing, reuse, and reproducibility.

## File Organization

<!-- This block generates a file tree.
A guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_filetree_example({
"stimuli": {
"stimuli.tsv": "",
"stimuli.json": "",
"[stim-<label>[_part-<label>]_<suffix>.<extension>]": "",
"[stim-<label>[_part-<label>]_<suffix>.json]": "",
"[[stim-<label>_]annotations.tsv]": "",
"[[stim-<label>_]annotations.json]": "",
"[stim-<label>[_part-<label>]_annot-<label>_events.tsv]": "",
"[stim-<label>[_part-<label>]_annot-<label>_events.json]": ""
}
}) }}

Note: The presence of a `stimuli.tsv` file indicates that the content of the `/stimuli` directory follows this BIDS specification for stimulus organization. This structure is planned to become mandatory in BIDS 2.0.
Collaborator

Is this the first time we have something like this? I believe we do not even have similar conditioning for derivatives/ yet... I just wonder if the validator is "ready" or what needs to be done? WDYT @effigies ?


### Stimulus File Formats

The following table lists the supported stimulus file formats and their corresponding suffixes. The suffixes are used to identify the type of stimulus file and are appended to the `stim-<label>` prefix in the file name.

| suffix | extensions | description |
| ----------- | ------------------------------- | ---------------------------- |
| audio | `.wav`, `.mp3`, `.aac`, `.ogg` | Audio-only stimulus files |
| image | `.jpg`, `.png`, `.svg` | Static visual stimulus files |
Collaborator

We do include `.webm` for video below, and there is increasing use of the new WebP format (https://en.wikipedia.org/wiki/WebP) as "the best" of both jpg and png, since it combines the best of both worlds:

  • supports lossy and lossless compression
  • supports transparency (alpha channel), not only for lossless like in png
  • supports animation

so I would expect studies to start using it.

But maybe it is premature, since at the moment I found not a single `.webp` file among OpenNeuro datasets.

| video | `.mp4`, `.avi`, `.mkv`, `.webm` | Video-only stimulus files |
| audiovideo | `.mp4`, `.avi`, `.mkv`, `.webm` | Combined audio-visual files |
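
The naming pattern and the suffix/extension table above can be sketched as a small filename check. This is only an illustration, not part of the proposal: the regular expression, the assumption that labels are alphanumeric, and the function name `parse_stimulus_filename` are all hypothetical.

```python
import re

# Suffix -> allowed extensions, copied from the table above.
ALLOWED_EXTENSIONS = {
    "audio": {".wav", ".mp3", ".aac", ".ogg"},
    "image": {".jpg", ".png", ".svg"},
    "video": {".mp4", ".avi", ".mkv", ".webm"},
    "audiovideo": {".mp4", ".avi", ".mkv", ".webm"},
}

# Hypothetical pattern for stim-<label>[_part-<label>]_<suffix>.<extension>.
# Assumption: labels are alphanumeric, as for other BIDS entities.
STIM_PATTERN = re.compile(
    r"^stim-(?P<stim>[A-Za-z0-9]+)"
    r"(?:_part-(?P<part>[A-Za-z0-9]+))?"
    r"_(?P<suffix>[a-z]+)(?P<extension>\.[A-Za-z0-9]+)$"
)


def parse_stimulus_filename(filename):
    """Return the entities of a stimulus filename, or None if it is not valid."""
    match = STIM_PATTERN.match(filename)
    if match is None:
        return None
    entities = match.groupdict()
    allowed = ALLOWED_EXTENSIONS.get(entities["suffix"])
    # Reject unknown suffixes and extensions not listed for the suffix.
    if allowed is None or entities["extension"] not in allowed:
        return None
    return entities
```

Sidecar (`.json`) and annotation files are deliberately rejected here, since the table only covers the stimulus files themselves.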

## Stimulus description (`stim-<label>_<suffix>.json`)

The `stim-<label>_<suffix>.json` file provides metadata about the _singular_ stimulus file.
The following fields are defined to describe the stimulus file:

<!-- This block generates a metadata table.
These tables are defined in
src/schema/rules/sidecars
The definitions of the fields specified in these tables may be found in
src/schema/objects/metadata.yaml
A guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_sidecar_table("stimuli.Stimuli") }}

In some cases, such as observing the copyright of a stimulus file, the actual stimulus file may not be shared. In such cases, the `stim-<label>_<suffix>.json` file SHOULD be used to provide metadata about the stimulus file, including the license, copyright, URL, and description.
neuromechanist marked this conversation as resolved.

### Example `stim-<label>_<suffix>.json`

```JSON
{
"License": "CC-BY-4.0",
"Copyright": "Lab 2023",
Collaborator

We might want to clarify what to state here, whether it should include the year, and the overall format.

Maybe we should follow https://reuse.software/tutorial/#step-2 and SPDX (REUSE uses it too) for license definitions.

"URL": "https://example.com/stimuli/",
"Description": "Collection of face images, tones, and movie clips used in the experiment"
}
```

## Stimuli Description (`stimuli.tsv`)

The `stimuli.tsv` file provides information about the stimuli based on their `stim_id`. This file is similar in usage to `participants.tsv`, `scans.tsv`, and `sessions.tsv`, which list descriptions of subjects, scans, and sessions, respectively. The `stimuli.tsv` file MUST be placed in the `/stimuli` directory.

The `stimuli.tsv` file contains information about each stimulus, including stimulus ID, type, URL, and other relevant details. The following table describes the REQUIRED, RECOMMENDED, and OPTIONAL columns for the `stimuli.tsv` file:

<!-- This block generates a columns table.
The definitions of these fields can be found in
src/schema/rules/tabular_data/*.yaml
and a guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_columns_table("stimuli.Stimuli") }}

### Example `stimuli.tsv`

```Text
stimulus_id type URL license copyright description present
stim-face01 image https://example.com/faces/face01.jpg CC-BY-4.0 Lab 2023 A female face with neutral expression true
stim-tone01 audio https://example.com/tones/tone01.wav CC-BY-4.0 Lab 2023 A 440Hz pure tone true
stim-movie01 video https://example.com/movies/movie01.mp4 n/a Studio XYZ A clip from copyrighted movie false
```

The `stimuli.json` file provides detailed descriptions of the columns in the `stimuli.tsv` file. There MAY be extra entries in the `stimuli.json` in addition to the columns in the `stimuli.tsv` to provide more details about the stimulus.

In cases where the stimulus is not shared, the `stimuli.tsv` file can be used to provide metadata about the stimuli, including the license, copyright, URL, and description. This is similar to the use of `stim-<label>_<suffix>.json` files for individual stimuli files. In the case of conflict between the metadata in the `stimuli.tsv` and `stim-<label>_<suffix>.json` files, the metadata in the `stim-<label>_<suffix>.json` file takes precedence.
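
The precedence rule above can be sketched as a small merge step. This is a minimal illustration under stated assumptions: the mapping between the lowercase `stimuli.tsv` columns and the capitalized sidecar keys is inferred from the two examples in this file, and `TSV_TO_SIDECAR` and `resolve_stimulus_metadata` are hypothetical names.

```python
# Hypothetical mapping from stimuli.tsv column names to sidecar JSON keys,
# inferred from the examples in this document (not defined by the schema).
TSV_TO_SIDECAR = {
    "license": "License",
    "copyright": "Copyright",
    "URL": "URL",
    "description": "Description",
}


def resolve_stimulus_metadata(tsv_row, sidecar):
    """Merge one stimuli.tsv row with the stimulus sidecar.

    Values from the sidecar take precedence over the stimuli.tsv row.
    """
    merged = {}
    for column, key in TSV_TO_SIDECAR.items():
        value = tsv_row.get(column)
        # Treat empty cells and "n/a" as missing values.
        if value not in (None, "", "n/a"):
            merged[key] = value
    # Sidecar entries overwrite anything taken from the TSV row.
    merged.update(sidecar)
    return merged
```

For instance, a row with `copyright` set to `Lab 2023` combined with a sidecar containing `"Copyright": "Lab 2024"` resolves to `Lab 2024`, while fields present only in the row are kept.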

## Stimulus Annotations

Annotations of still images or general descriptions of stimuli (such as the frequency and duration of a beep sound) can be stored in `stimuli.tsv` as additional columns or in `stim-<label>_<suffix>.json` as described above. Here is an example of how annotations can be stored in the `stimuli.tsv` file for an image from the Natural Scenes Dataset (NSD):

| stimulus_id | type | description | HED | NSD_id | COCO_id |
| ------------- | ----- | ----------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------ | ------- |
| stim-nsd02951 | image | an open market full of people and piles of vegetables | ((Item-count, High), Ingestible-object), (Background-view, ((Human, Body, Agent-trait/Adult), Outdoors, Furnishing, Natural-feature/Sky, Urban, Man-made-object)) | 2951 | 262145 |

However, for time-varying stimuli, such as audio or video, it is RECOMMENDED to store annotations in dedicated annotation files of the form `stim-<label>_annot-<label>_events.tsv`. These files have the same structure as `events.tsv` files. There can be multiple annotation files for a single stimulus file, each with a unique annotation label. The annotation files MUST be stored in the `/stimuli` directory.

## Annotation Description (`annotations.tsv`)

The `annotations.tsv` file contains additional metadata about stimulus annotations. There MAY be a single `annotations.tsv` file for all the stimuli or separate `stim-<label>_annotations.tsv` files for each stimulus.
The following columns are defined for the `annotations.tsv` file:

<!-- This block generates a columns table.
The definitions of these fields can be found in
src/schema/rules/tabular_data/*.yaml
and a guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_columns_table("stimuli.Annotations") }}

### Example `*_annotations.tsv`

```Text
annot_id description
face01_emo Emotion annotation for face01 stimulus
face01_gen Gender annotation for face01 stimulus
face01_age Age group annotation for face01 stimulus
```

## Referencing Stimulus Identifiers in `events.tsv`

To reference stimulus identifiers in the `events.tsv` file, use the `stim_id` column. The values in the `stim_id` column should represent unique identifiers for the stimuli. Each `stim_id` should correspond to the unique identifier of the stimulus file in the `/stimuli` directory and extends to all files (both stimulus and annotation files) that share the same stimulus ID.

Example `events.tsv` file:

| onset | duration | trial_type | response_time | stim_id |
| ----- | -------- | ---------- | ------------- | -------------- |
| 1.23 | 0.65 | start | 1.435 | `stim-<label>` |
| 5.65 | 0.65 | stop | 1.739 | `stim-<label>` |
| 12.1 | 2.35 | n/a | n/a | `stim-<label>` |

In the accompanying JSON sidecar, the `stim_id` column might be described as follows:

```JSON
{
"stim_id": {
"LongName": "Stimulus identifier",
"Description": "Represents a unique identifier for the stimulus presented at the given onset time."
}
}
```
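
As a rough illustration of how the `stim_id` column ties `events.tsv` to `stimuli.tsv`, the sketch below reports identifiers used in events that have no row in `stimuli.tsv`. It assumes the column names shown in the examples of this file (`stim_id` in events, `stimulus_id` in `stimuli.tsv`); `read_tsv` and `missing_stimulus_ids` are hypothetical helpers, not part of any BIDS tool.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def missing_stimulus_ids(events_rows, stimuli_rows):
    """Return the stim_id values used in events that stimuli.tsv does not list."""
    known = {row["stimulus_id"] for row in stimuli_rows}
    used = {
        row["stim_id"]
        for row in events_rows
        # Skip events that do not reference a stimulus.
        if row.get("stim_id") not in (None, "", "n/a")
    }
    return sorted(used - known)
```

An empty result means every referenced identifier has a matching `stimuli.tsv` entry.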
103 changes: 103 additions & 0 deletions src/modality-specific-files/task-events.md
@@ -400,3 +400,106 @@ A guide for using macros can be found at
Additional metadata may be included as in
[any TSV file](../common-principles.md#tabular-files) to specify, for
example, the units of the recorded time series for each column.

## Standardization of Stimulus Files and Annotations

To ensure consistency and facilitate reuse, the BIDS specifications provide guidelines for standardizing stimulus files and their annotations. This section outlines the recommended practices for storing and referencing stimulus files within a BIDS dataset.

### Storing Stimulus Files

Stimulus files should be stored in the `/stimuli` directory under the root directory of the dataset. The `/stimuli` directory can contain subdirectories to organize the stimulus files. There are no restrictions on the file formats of the stimulus files.

Example directory structure:

<!-- This block generates a file tree.
A guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_filetree_example({
"stimuli": {
"images": {
"cat01.jpg": "",
"cat02.jpg": "",
},
"videos": {
"movie01.mp4": "",
"movie02.mp4": "",
},
},
}) }}

### Referencing Stimulus Files in `events.tsv`

To reference stimulus files in the `events.tsv` file, use the `stim_file` column. The values in the `stim_file` column should represent the relative path to the stimulus file within the `/stimuli` directory.

Example `events.tsv` file:

```Text
onset duration trial_type response_time stim_file
1.23 0.65 start 1.435 images/cat01.jpg
5.65 0.65 stop 1.739 images/cat02.jpg
12.1 2.35 n/a n/a videos/movie01.mp4
```

In the accompanying JSON sidecar, the `stim_file` column might be described as follows:

```JSON
{
"stim_file": {
"LongName": "Stimulus file",
"Description": "Represents the location of the stimulus file (such as an image, video, or audio file) presented at the given onset time. The values correspond to a path relative to the /stimuli directory."
}
}
```
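
The path convention for `stim_file` can be sketched as follows. This is an assumption-laden illustration: `resolve_stim_file` is a hypothetical helper, and rejecting absolute paths or `..` components is a design choice made here, not a rule stated in the text.

```python
from pathlib import PurePosixPath


def resolve_stim_file(dataset_root, stim_file):
    """Map a stim_file value to its location under <dataset_root>/stimuli."""
    relative = PurePosixPath(stim_file)
    # Assumption: values must stay inside the /stimuli directory.
    if relative.is_absolute() or ".." in relative.parts:
        raise ValueError(f"stim_file must be relative to /stimuli: {stim_file}")
    return PurePosixPath(dataset_root) / "stimuli" / relative
```

With a dataset rooted at `/data/ds001` (a made-up location), `images/cat01.jpg` resolves to `/data/ds001/stimuli/images/cat01.jpg`.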

### Referencing Stimulus Identifiers in `events.tsv`

To reference stimulus identifiers in the `events.tsv` file, use the `stim_id` column. The `stim_id` corresponds to the unique identifier of the stimulus files and their annotations stored under the `/stimuli` directory. This allows linking each event to multiple related files associated with that stimulus.

Example `events.tsv` file:

```Text
onset duration trial_type response_time stim_id
1.23 0.65 start 1.435 stim-face01
5.65 0.65 stop 1.739 stim-face02
12.1 2.35 n/a n/a stim-video01
```

In the accompanying JSON sidecar, the `stim_id` column might be described as follows:

```JSON
{
"stim_id": {
"LongName": "Stimulus identifier",
"Description": "Represents a unique identifier for the stimulus presented at the given onset time. Links to files and annotations in the /stimuli directory."
}
}
```

The `stim_id` in the events file links to corresponding files:

```Text
Collaborator

needs MACROS___make_filetree_example until someone smart implements parsing... ref: #2014 (comment)

stimuli/
├── stim-face01_image.jpg
├── stim-face01_image.json
├── stim-face01_annotations.tsv
├── stim-face02_image.jpg
├── stim-face02_image.json
├── stim-face02_annotations.tsv
├── stimuli.tsv
└── stimuli.json
```

By using `stim_id`, multiple annotations and stimulus files associated with the same identifier can be efficiently linked to events in the `events.tsv` file. The `stim_id` is a unique identifier for the stimulus that can be used to reference the stimulus files, annotations, and metadata stored in `stimuli.tsv` and `stimuli.json`. For more information on the structure of stimulus files and annotations, refer to the [Stimuli](./stimuli.md) BIDS specifications.
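
One way to picture this linkage is to group the contents of `/stimuli` by identifier. The sketch below is illustrative only; `group_by_stim_id` is a hypothetical name, and it assumes the identifier is the leading `stim-<label>` entity of each filename.

```python
from collections import defaultdict


def group_by_stim_id(filenames):
    """Group stimulus-related filenames by their leading stim-<label> entity."""
    groups = defaultdict(list)
    for name in filenames:
        # Directory-level files (stimuli.tsv, stimuli.json) carry no stim entity.
        if not name.startswith("stim-"):
            continue
        stem = name.split(".", 1)[0]
        stim_id = stem.split("_", 1)[0]
        groups[stim_id].append(name)
    return dict(groups)
```

Looking up an event's `stim_id` in the returned dictionary then yields the stimulus file, its sidecar, and any annotation files in one step.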

### Advantages of Standardization

Standardizing stimulus files and their annotations within the BIDS specifications offers several advantages:

1. **Consistency**: Ensures that stimulus files are stored and referenced in a consistent manner across different datasets.
1. **Reusability**: Facilitates the reuse of stimulus files and annotations in other studies by providing a standardized structure.
1. **Efficiency**: Reduces redundancy by avoiding the need to replicate annotations across subjects, modalities, tasks, and runs.
1. **Flexibility**: Allows for easy modification of annotations by updating a single file, enabling the reuse of datasets with alternative annotations.

By following these guidelines, researchers can enhance the interoperability and reproducibility of their studies, making it easier to share and reuse data within the scientific community.
9 changes: 9 additions & 0 deletions src/schema/objects/columns.yaml
@@ -585,6 +585,15 @@ stim_file:
For example `images/cat03.jpg` will be translated to `/stimuli/images/cat03.jpg`.
type: string
format: stimuli_relative
stim_id:
name: stim_id
display_name: Stimulus identifier
description: |
Represents a unique identifier for the stimulus presented at the given onset
time. The `stim_id` is inclusive of the stimulus file(s), annotations
related to the stimulus, and the information about the stimulus present in
the `stimuli.tsv` file.
type: string
strain:
name: strain
display_name: Strain
4 changes: 4 additions & 0 deletions src/schema/objects/datatypes.yaml
@@ -65,3 +65,7 @@ nirs:
value: nirs
display_name: Near-Infrared Spectroscopy
description: Near-Infrared Spectroscopy data organized around the SNIRF format
stimuli:
value: stimuli
display_name: Stimulus
description: Stimulus
43 changes: 29 additions & 14 deletions src/schema/objects/entities.yaml
@@ -194,20 +194,12 @@ part:
name: part
display_name: Part
description: |
- This entity is used to indicate which component of the complex
- representation of the MRI signal is represented in voxel data.
- The `part-<label>` entity is associated with the DICOM Tag
- `0008, 9208`.
- Allowed label values for this entity are `phase`, `mag`, `real` and `imag`,
- which are typically used in `part-mag`/`part-phase` or
- `part-real`/`part-imag` pairs of files.
-
- Phase images MAY be in radians or in arbitrary units.
- The sidecar JSON file MUST include the `"Units"` of the `phase` image.
- The possible options are `"rad"` or `"arbitrary"`.
-
- When there is only a magnitude image of a given type, the `part` entity MAY be
- omitted.
+ This entity is used to indicate which component of a complex
+ representation is being stored. For MRI data, it indicates which component
+ of the complex signal is represented in voxel data. For stimulus files, it can
+ be used to distinguish different parts of a single stimulus, such as chapters
+ in an audiobook or segments of a long movie (for example, `part-1`, `part-2`,
+ `part-epilog`, `part-chapter1`).
Comment on lines +197 to +202
Collaborator

We probably do not want to erase all the details from the previous definition.

Can we have several definitions for an entity, meaning different things depending on the datatype or suffix?

type: string
format: label
enum:
@@ -373,6 +365,19 @@ stain:
and/or `"SampleSecondaryAntibodies"` metadata fields, as appropriate.
type: string
format: label
stimulus:
name: stim
display_name: Stimulus
description: |
The `stim-<label>` entity can be used to distinguish different stimulus files
or annotations. The label is a unique identifier for the stimulus or annotation.

This entity represents the `"Stimulus"` metadata field and requires corresponding
entries in the `stimuli.tsv` file. Therefore, if the `stim-<label>` entity is
present in a filename, `"Stimulus"` MUST be defined in the associated metadata,
and a matching entry MUST exist in the `stimuli.tsv` file.
type: string
format: label
subject:
name: sub
display_name: Subject
@@ -443,3 +448,13 @@ tracksys:
may be longer and more human readable.
type: string
format: label
annotation:
name: annot
display_name: Annotation
description: |
The `annot-<label>` entity accommodates multiple annotations for a single
(usually, but not necessarily, time-varying) stimulus id. Similar to `stimuli.tsv`,
there can be one or multiple `annotations.tsv` files with `annotation_id`, providing
a list of the annotations in the directory, or for a specific stimulus respectively.
type: string
format: label
15 changes: 15 additions & 0 deletions src/schema/objects/metadata.yaml
@@ -542,6 +542,11 @@ ContrastBolusIngredient:
# TODO: Add definitions for these values. (perhaps don't specify)
- UNKNOWN
- NONE
Copyright:
name: Copyright
display_name: Copyright
description: Copyright information
type: string
DCOffsetCorrection:
name: DCOffsetCorrection
display_name: DC Offset Correction
@@ -1892,6 +1897,11 @@ License:
The corresponding full license text MAY be specified in an additional
`LICENSE` file.
type: string
StimulusLicense:
name: License
display_name: License
description: License under which this stimulus is shared.
type: string
LongName:
name: LongName
display_name: Long Name
@@ -3653,6 +3663,11 @@ SubjectArtefactDescription:
If this field is set to `"n/a"`, it will be interpreted as absence of major
source of artifacts except cardiac and blinks.
type: string
StimulusURL:
name: URL
display_name: URL
description: Location (origin) for the stimulus file.
type: string
TablePosition:
name: TablePosition
display_name: Table Position