[ENH] BEP044 - Stim-BIDS #2022

Open
wants to merge 24 commits into
base: master
95ebf75
Add stimuli specifications to BIDS
neuromechanist Dec 14, 2024
91df78b
edit stimuli description
neuromechanist Dec 15, 2024
0167a41
Update task events to include stim-bids spec
neuromechanist Dec 19, 2024
3c2357b
Add missing entities
neuromechanist Dec 21, 2024
b7b19df
handling GH Action errors
neuromechanist Dec 21, 2024
aa570ac
yaml-link issues
neuromechanist Dec 21, 2024
162cd1c
remark validation
neuromechanist Dec 21, 2024
1ef76bc
Move extensions to their own place.
neuromechanist Dec 21, 2024
e84cac4
Revise the extensions
neuromechanist Dec 22, 2024
1dd9083
YAML lint
neuromechanist Dec 22, 2024
50ad1dd
Merge pull request #2 from neuromechanist/add-stimuli-specifications
neuromechanist Dec 22, 2024
77f33b0
Merge branch 'bids-standard:master' into master
neuromechanist Dec 22, 2024
4978552
Sidecar description is only needed of rhte individual stimulus files.
neuromechanist Dec 31, 2024
80d3077
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 31, 2024
a335ebf
tabular information matching the spec.
neuromechanist Dec 31, 2024
f370c95
Edit stimuli.md to reflect the spec more completely
neuromechanist Dec 31, 2024
330b69f
Implement Kay's comments for clarity
neuromechanist Dec 31, 2024
ad24800
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 31, 2024
a5acd54
Revert adding BEP044 doc, implementing Kay's suggestions.
neuromechanist Dec 31, 2024
238e645
Merge remote-tracking branch 'upstream/master' into pr/neuromechanist…
Remi-Gau Jan 9, 2025
074f5fb
sty
Remi-Gau Jan 9, 2025
f197bba
remark
Remi-Gau Jan 9, 2025
02c6f31
fix
Remi-Gau Jan 9, 2025
3ae9fe1
fix
Remi-Gau Jan 9, 2025
159 changes: 159 additions & 0 deletions src/modality-specific-files/stimuli.md
@@ -0,0 +1,159 @@
# Stimuli

## Stimulus Files Organization

Stimulus files MUST be stored in the `/stimuli` directory under the root directory of the dataset.
The `/stimuli` directory can contain subdirectories to organize the stimulus files.
Stimulus files MUST follow the BIDS naming conventions and SHOULD be referenced in the `events.tsv`
file using the `stim_id` column.

The standardization of stimulus files and their annotations within BIDS offers several key benefits:

1. **Consistency**: Ensures uniform storage and referencing across datasets
2. **Reusability**: Enables stimulus reuse across studies through standardized structure
3. **Efficiency**: Minimizes redundancy by centralizing annotations
4. **Flexibility**: Facilitates dataset reuse with alternative annotations

To preserve backward compatibility with existing datasets (see the Legacy section below), use of the `/stimuli` directory specification and of the `stim_id` column in `events.tsv` files is RECOMMENDED but not required. Researchers are encouraged to follow these guidelines to enhance the interoperability and reproducibility of their studies.

Following these guidelines will help ensure that stimulus files and their annotations are stored and referenced consistently across different datasets, facilitating data sharing, reuse, and reproducibility.

## File Organization

<!-- This block generates a file tree.
A guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_filetree_example({
"stimuli": {
"stimuli.tsv": "",
"stimuli.json": "",
"[stim-<label>[_part-<label>]<suffix>.<extension>]": "",
"[stim-<label>[_part-<label>]<suffix>.json]": "",
"[[stim-<label>_]annotations.tsv]": "",
"[[stim-<label>_]annotations.json]": "",
"[stim-<label>[_part-<label>]_annot-<label>_events.tsv]": "",
"[stim-<label>[_part-<label>]_annot-<label>_events.json]": ""
}
}) }}

Note: The presence of a `stimuli.tsv` file indicates that the contents of the `/stimuli` directory follow this BIDS specification for stimulus organization. This structure is planned to become mandatory in BIDS 2.0.

### Stimulus File Formats

The following table lists the supported stimulus file formats and their corresponding suffixes. The suffixes are used to identify the type of stimulus file and are appended to the `stim-<label>` prefix in the file name.

| **suffix** | **extensions** | **description** |
|-------------|-------------------------|-----------------------------------|
| audio | `.wav`, `.mp3`, `.aac`, `.ogg` | Audio-only stimulus files |
| image | `.jpg`, `.png`, `.svg` | Static visual stimulus files |
| video | `.mp4`, `.avi`, `.mkv`, `.webm` | Video-only stimulus files |
| audiovideo | `.mp4`, `.avi`, `.mkv`, `.webm` | Combined audio-visual files |
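
As a rough illustration of how a tool might apply the table above, the following Python sketch checks whether a file name carries an extension that is allowed for its suffix. The `ALLOWED_EXTENSIONS` mapping is copied from the table; the function name `check_stimulus_filename`, the regular expression, the assumption that labels are alphanumeric, and the underscore placed before the suffix (as in the `stim-face01_image.jpg` examples) are all choices made for this example, not part of the BIDS validator.

```python
import re

# Suffix-to-extension mapping copied from the table above.
ALLOWED_EXTENSIONS = {
    "audio": {".wav", ".mp3", ".aac", ".ogg"},
    "image": {".jpg", ".png", ".svg"},
    "video": {".mp4", ".avi", ".mkv", ".webm"},
    "audiovideo": {".mp4", ".avi", ".mkv", ".webm"},
}

# Hypothetical pattern for stim-<label>[_part-<label>]_<suffix>.<extension>.
# Labels are assumed to be alphanumeric.
STIM_PATTERN = re.compile(
    r"^stim-(?P<stim>[a-zA-Z0-9]+)"
    r"(?:_part-(?P<part>[a-zA-Z0-9]+))?"
    r"_(?P<suffix>[a-z]+)(?P<ext>\.[a-z0-9]+)$"
)


def check_stimulus_filename(filename: str) -> bool:
    """Return True if the name matches the pattern and the extension is allowed for its suffix."""
    match = STIM_PATTERN.match(filename)
    if match is None:
        return False
    # Unknown suffixes map to an empty set, so they are rejected.
    allowed = ALLOWED_EXTENSIONS.get(match["suffix"], set())
    return match["ext"] in allowed
```

Under these assumptions, `stim-face01_image.jpg` passes, while `stim-face01_image.wav` is rejected because `.wav` is not listed for the `image` suffix.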

## Stimulus description (`stim-<label>_<suffix>.json`)
The `stim-<label>_<suffix>.json` file provides metadata about a single stimulus file.
The following fields are defined to describe the stimulus file:

<!-- This block generates a metadata table.
These tables are defined in
src/schema/rules/sidecars
The definitions of the fields specified in these tables may be found in
src/schema/objects/metadata.yaml
A guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_sidecar_table("stimulus.Stimulus") }}

In some cases, for example when copyright restrictions apply, the actual stimulus file may not be shared. In such cases, the `stim-<label>_<suffix>.json` file SHOULD be used to provide metadata about the stimulus file, including the license, copyright, URL, and description.

### Example `stim-<label>_<suffix>.json`

```JSON
{
  "License": "CC-BY-4.0",
  "Copyright": "Lab 2023",
  "URL": "https://example.com/stimuli/",
  "Description": "Collection of face images, tones, and movie clips used in the experiment"
}
```

Review comment (Collaborator, on `"Copyright"`): we might want to clarify what to state here, whether it should include the year, and the overall format. Maybe we should follow https://reuse.software/tutorial/#step-2 and SPDX (REUSE uses it too) for license definitions.

## Stimuli Description (`stimuli.tsv`)

The `stimuli.tsv/json` files are used to provide information about the stimuli based on their `stim_id`. This file is similar in usage to `participants.tsv`, `scans.tsv` and `sessions.tsv`, which list descriptions about subjects, scans and sessions, respectively. The `stimuli.tsv/json` files MUST be placed in the `/stimuli` directory.

The `stimuli.tsv` file contains information about each stimulus, including stimulus ID, type, URL, and other relevant details. The following table describes the REQUIRED, RECOMMENDED, and OPTIONAL columns for the `stimuli.tsv` file:

<!-- This block generates a columns table.
The definitions of these fields can be found in
src/schema/rules/tabular_data/*.yaml
and a guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_columns_table("stimuli.Stimuli") }}

### Example `stimuli.tsv`

```Text
stimulus_id type URL license copyright description present
stim-face01 image https://example.com/faces/face01.jpg CC-BY-4.0 Lab 2023 A female face with neutral expression true
stim-tone01 audio https://example.com/tones/tone01.wav CC-BY-4.0 Lab 2023 A 440Hz pure tone true
stim-movie01 video https://example.com/movies/movie01.mp4 n/a Studio XYZ A clip from copyrighted movie false
```

The `stimuli.json` file provides detailed descriptions of the columns in the `stimuli.tsv` file. There MAY be extra entries in the `stimuli.json` in addition to the columns in the `stimuli.tsv` to provide more details about the stimulus.

In cases where the stimulus is not shared, the `stimuli.tsv` file can be used to provide metadata about the stimuli, including the license, copyright, URL, and description. This is similar to the use of `stim-<label>_<suffix>.json` files for individual stimulus files. If the metadata in `stimuli.tsv` and in a `stim-<label>_<suffix>.json` file conflict, the metadata in the `stim-<label>_<suffix>.json` file takes precedence.

Review comment (Member): Again, here _. It should be consistent. The _ appears in several places as does without the underbar. I'm not going to mark them further; they just need to be consistent.
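
The precedence rule stated above (the per-stimulus sidecar wins over `stimuli.tsv`) could be sketched in Python as follows. The function name `resolve_stimulus_metadata` and the mapping from lowercase TSV column names to capitalized sidecar fields are assumptions inferred from the two examples in this file; the PR itself does not define such a mapping.

```python
def resolve_stimulus_metadata(tsv_row: dict, sidecar: dict) -> dict:
    """Merge metadata for one stimulus; sidecar values win on conflict."""
    # Hypothetical mapping from stimuli.tsv column names to sidecar field
    # names, based only on the examples shown in this document.
    column_to_field = {
        "license": "License",
        "copyright": "Copyright",
        "URL": "URL",
        "description": "Description",
    }
    merged = {}
    for column, value in tsv_row.items():
        key = column_to_field.get(column, column)
        if value != "n/a":  # "n/a" marks a missing value in BIDS TSV files
            merged[key] = value
    # Apply the sidecar last so that it takes precedence over stimuli.tsv.
    merged.update(sidecar)
    return merged
```

For instance, if the `stimuli.tsv` row lists `CC-BY-4.0` and the sidecar lists a different `License`, the merged record carries the sidecar value.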


## Stimulus Annotations
Annotations of still images, or general descriptions of the stimuli (such as the frequency and duration of a beep sound), can be stored in `stimuli.tsv` as additional columns or in `stim-<label>_<suffix>.json` as described above. Here is an example of how annotations can be stored in the `stimuli.tsv` file for an image from the Natural Scenes Dataset (NSD):

| stimulus_id | type | description | HED | NSD_id | COCO_id |
|--------------|-------|-------------|-----|--------|---------|
| stim-nsd02951 | image | an open market full of people and piles of vegetables | ((Item-count, High), Ingestible-object), (Background-view, ((Human, Body, Agent-trait/Adult), Outdoors, Furnishing, Natural-feature/Sky, Urban, Man-made-object)) | 2951 | 262145 |

However, for time-varying stimuli, such as audio or video, it is RECOMMENDED to store the annotations in dedicated annotation files named `stim-<label>_annot-<label>_events.tsv/json`. These files have the same structure as the `events.tsv/json` files. There can be multiple annotation files for a single stimulus file, each with a unique annotation label. The annotation files MUST be stored in the `/stimuli` directory.

## Annotation Description (`annotations.tsv`)

The `annotations.tsv` file contains additional metadata about stimulus annotations. There MAY be a single `annotations.tsv` file for all the stimuli or separate `stim-<label>_annotations.tsv` files for each stimulus.
The following columns are defined for the `annotations.tsv` file:

<!-- This block generates a columns table.
The definitions of these fields can be found in
src/schema/rules/tabular_data/*.yaml
and a guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_columns_table("stimuli.Annotations") }}

### Example `*_annotations.tsv`

```Text
annot_id description
face01_emo Emotion annotation for face01 stimulus
face01_gen Gender annotation for face01 stimulus
face01_age Age group annotation for face01 stimulus
```

## Referencing Stimulus Identifiers in `events.tsv`

To reference stimulus identifiers in the `events.tsv` file, use the `stim_id` column. The values in the `stim_id` column should be unique identifiers for the stimuli. Each `stim_id` value should correspond to the identifier of a stimulus in the `/stimuli` directory, and it covers all files (both stimulus and annotation files) that share that identifier.

Example `events.tsv` file:

| onset | duration | trial_type | response_time | stim_id |
|-------|----------|------------|---------------|---------|
| 1.23 | 0.65 | start | 1.435 | `stim-<label>` |
| 5.65 | 0.65 | stop | 1.739 | `stim-<label>` |
| 12.1 | 2.35 | n/a | n/a | `stim-<label>` |

In the accompanying JSON sidecar, the `stim_id` column might be described as follows:

```JSON
{
"stim_id": {
"LongName": "Stimulus identifier",
"Description": "Represents a unique identifier for the stimulus presented at the given onset time."
}
}
```
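
One way a tool might check this link is sketched below in Python: it reports `stim_id` values used in `events.tsv` that have no row in `stimuli.tsv`. The function name `missing_stimulus_ids` is hypothetical, and the key column in `stimuli.tsv` is assumed to be `stimulus_id`, as in the example table earlier in this file (the text elsewhere calls it `stim_id`).

```python
import csv
import io


def missing_stimulus_ids(events_tsv: str, stimuli_tsv: str) -> list[str]:
    """Return stim_id values from events.tsv that have no row in stimuli.tsv."""
    events = csv.DictReader(io.StringIO(events_tsv), delimiter="\t")
    stimuli = csv.DictReader(io.StringIO(stimuli_tsv), delimiter="\t")
    # Assumed key column name, taken from the stimuli.tsv example.
    known = {row["stimulus_id"] for row in stimuli}
    missing = []
    for row in events:
        stim_id = row.get("stim_id", "n/a")
        # Skip events without a stimulus and avoid reporting duplicates.
        if stim_id != "n/a" and stim_id not in known and stim_id not in missing:
            missing.append(stim_id)
    return missing
```

The function takes file contents as strings so that it can be tried without touching the file system; a real check would read the files from the dataset.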
115 changes: 109 additions & 6 deletions src/modality-specific-files/task-events.md
@@ -269,15 +269,15 @@ and a guide for using macros can be found at

The operating system description SHOULD include the following attributes:

- type (for example, Windows, macOS, Linux)
- distribution (if applicable, for example, Ubuntu, Debian, CentOS)
- the version number (for example, 18.04.5)

Examples:

- Windows 10, Version 2004
- macOS 10.15.6
- Linux Ubuntu 18.04.5

The amount of information supplied for the `OperatingSystem` SHOULD be sufficient
to re-run the code under maximally similar conditions.
@@ -400,3 +400,106 @@ A guide for using macros can be found at
Additional metadata may be included as in
[any TSV file](../common-principles.md#tabular-files) to specify, for
example, the units of the recorded time series for each column.

## Standardization of Stimulus Files and Annotations

To ensure consistency and facilitate reuse, the BIDS specifications provide guidelines for standardizing stimulus files and their annotations. This section outlines the recommended practices for storing and referencing stimulus files within a BIDS dataset.

### Storing Stimulus Files

Stimulus files should be stored in the `/stimuli` directory under the root directory of the dataset. The `/stimuli` directory can contain subdirectories to organize the stimulus files. There are no restrictions on the file formats of the stimulus files.

Example directory structure:

<!-- This block generates a file tree.
A guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_filetree_example({
"stimuli": {
"images": {
"cat01.jpg": "",
"cat02.jpg": "",
},
"videos": {
"movie01.mp4": "",
"movie02.mp4": "",
},
},
}) }}

### Referencing Stimulus Files in `events.tsv`

To reference stimulus files in the `events.tsv` file, use the `stim_file` column. The values in the `stim_file` column should represent the relative path to the stimulus file within the `/stimuli` directory.

Example `events.tsv` file:

```Text
onset duration trial_type response_time stim_file
1.23 0.65 start 1.435 images/cat01.jpg
5.65 0.65 stop 1.739 images/cat02.jpg
12.1 2.35 n/a n/a videos/movie01.mp4
```

In the accompanying JSON sidecar, the `stim_file` column might be described as follows:

```JSON
{
"stim_file": {
"LongName": "Stimulus file",
"Description": "Represents the location of the stimulus file (such as an image, video, or audio file) presented at the given onset time. The values correspond to a path relative to the /stimuli directory."
}
}
```
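
The path translation described above can be sketched in a few lines of Python. The function name `resolve_stim_file` and the rejection of absolute paths and `..` components are assumptions added for this example; the text only states that values are relative to the `/stimuli` directory.

```python
from pathlib import PurePosixPath


def resolve_stim_file(dataset_root: str, stim_file: str) -> PurePosixPath:
    """Map a stim_file value to its location under <dataset_root>/stimuli."""
    relative = PurePosixPath(stim_file)
    # Assumed safety check: the value must not escape the /stimuli directory.
    if relative.is_absolute() or ".." in relative.parts:
        raise ValueError(f"stim_file must stay inside /stimuli: {stim_file}")
    return PurePosixPath(dataset_root) / "stimuli" / relative
```

With a hypothetical dataset root of `/data/ds001`, the value `images/cat01.jpg` resolves to `/data/ds001/stimuli/images/cat01.jpg`.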

### Referencing Stimulus Identifiers in `events.tsv`

To reference stimulus identifiers in the `events.tsv` file, use the `stim_id` column. The `stim_id` corresponds to the unique identifier of the stimulus files and their annotations stored under the `/stimuli` directory. This allows linking each event to multiple related files associated with that stimulus.

Example `events.tsv` file:

```Text
onset duration trial_type response_time stim_id
1.23 0.65 start 1.435 stim-face01
5.65 0.65 stop 1.739 stim-face02
12.1 2.35 n/a n/a stim-video01
```

In the accompanying JSON sidecar, the `stim_id` column might be described as follows:

```JSON
{
"stim_id": {
"LongName": "Stimulus identifier",
"Description": "Represents a unique identifier for the stimulus presented at the given onset time. Links to files and annotations in the /stimuli directory."
}
}
```

The `stim_id` in the events file links to corresponding files:

```Text
stimuli/
├── stim-face01_image.jpg
├── stim-face01_image.json
├── stim-face01_annotations.tsv
├── stim-face02_image.jpg
├── stim-face02_image.json
├── stim-face02_annotations.tsv
├── stimuli.tsv
└── stimuli.json
```

Review comment (Collaborator): needs MACROS___make_filetree_example until someone smart implements parsing... ref: #2014 (comment)

By using `stim_id`, multiple annotations and stimulus files associated with the same identifier can be efficiently linked to events in the `events.tsv` file. The `stim_id` is a unique identifier for the stimulus that can be used to reference the stimulus files, annotations, and metadata stored in `stimuli.tsv` and `stimuli.json`. For more information on the structure of stimulus files and annotations, refer to the [Stimuli](./stimuli.md) BIDS specifications.
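
To make the linking concrete, the Python sketch below groups the file names in a `/stimuli` directory by their `stim-<label>` prefix, so that a `stim_id` from `events.tsv` can be looked up directly. The function name `group_by_stim_id` and the assumption that labels are alphanumeric are illustrative choices, not part of the specification.

```python
import re
from collections import defaultdict

# Assumed form of the stim entity at the start of a file name.
STIM_ENTITY = re.compile(r"^(stim-[a-zA-Z0-9]+)_")


def group_by_stim_id(filenames):
    """Group file names by their stim-<label> prefix; other files are ignored."""
    groups = defaultdict(list)
    for name in filenames:
        match = STIM_ENTITY.match(name)
        if match:
            groups[match.group(1)].append(name)
    return dict(groups)
```

Files without a `stim-<label>` prefix, such as `stimuli.tsv` and `stimuli.json`, describe the whole collection and are therefore left out of the grouping.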

### Advantages of Standardization

Standardizing stimulus files and their annotations within the BIDS specifications offers several advantages:

1. **Consistency**: Ensures that stimulus files are stored and referenced in a consistent manner across different datasets.
2. **Reusability**: Facilitates the reuse of stimulus files and annotations in other studies by providing a standardized structure.
3. **Efficiency**: Reduces redundancy by avoiding the need to replicate annotations across subjects, modalities, tasks, and runs.
4. **Flexibility**: Allows for easy modification of annotations by updating a single file, enabling the reuse of datasets with alternative annotations.

By following these guidelines, researchers can enhance the interoperability and reproducibility of their studies, making it easier to share and reuse data within the scientific community.
9 changes: 9 additions & 0 deletions src/schema/objects/columns.yaml
@@ -585,6 +585,15 @@ stim_file:
     For example `images/cat03.jpg` will be translated to `/stimuli/images/cat03.jpg`.
   type: string
   format: stimuli_relative
+stim_id:
+  name: stim_id
+  display_name: Stimulus identifier
+  description: |
+    Represents a unique identifier for the stimulus presented at the given onset
+    time. The `stim_id` is inclusive of the stimulus file(s), annotations
+    related to the stimulus, and the information about the stimulus present in
+    the `stimuli.tsv` file.
+  type: string
 strain:
   name: strain
   display_name: Strain
43 changes: 29 additions & 14 deletions src/schema/objects/entities.yaml
@@ -194,20 +194,12 @@ part:
   name: part
   display_name: Part
   description: |
-    This entity is used to indicate which component of the complex
-    representation of the MRI signal is represented in voxel data.
-    The `part-<label>` entity is associated with the DICOM Tag
-    `0008, 9208`.
-    Allowed label values for this entity are `phase`, `mag`, `real` and `imag`,
-    which are typically used in `part-mag`/`part-phase` or
-    `part-real`/`part-imag` pairs of files.
-
-    Phase images MAY be in radians or in arbitrary units.
-    The sidecar JSON file MUST include the `"Units"` of the `phase` image.
-    The possible options are `"rad"` or `"arbitrary"`.
-
-    When there is only a magnitude image of a given type, the `part` entity MAY be
-    omitted.
+    This entity is used to indicate which component of a complex
+    representation is being stored. For MRI data, it indicates which component
+    of the complex signal is represented in voxel data. For stimulus files, it can
+    be used to distinguish different parts of a single stimulus, such as chapters
+    in an audiobook or segments of a long movie (for example, `part-1`, `part-2`,
+    `part-epilog`, `part-chapter1`).
Review comment (Collaborator, on lines +197 to +202): We probably do not want to erase all the details from the previous definition. Can we have several definitions for an entity, meaning different things depending on the datatype or suffix?

   type: string
   format: label
   enum:
@@ -373,6 +365,19 @@ stain:
     and/or `"SampleSecondaryAntibodies"` metadata fields, as appropriate.
   type: string
   format: label
+stimulus:
+  name: stim
+  display_name: Stimulus
+  description: |
+    The `stim-<label>` entity can be used to distinguish different stimulus files
+    or annotations. The label is a unique identifier for the stimulus or annotation.
+
+    This entity represents the `"Stimulus"` metadata field and requires corresponding
+    entries in the `stimuli.tsv` file. Therefore, if the `stim-<label>` entity is
+    present in a filename, `"Stimulus"` MUST be defined in the associated metadata,
+    and a matching entry MUST exist in the `stimuli.tsv` file.
+  type: string
+  format: label
 subject:
   name: sub
   display_name: Subject
@@ -443,3 +448,13 @@ tracksys:
     may be longer and more human readable.
   type: string
   format: label
+annotation:
+  name: annot
+  display_name: Annotation
+  description: |
+    The `annot-<label>` entity accommodates multiple annotations for a single
+    (usually, but not necessarily, time-varying) stimulus id. Similar to `stimuli.tsv`,
+    there can be one or multiple `annotations.tsv` files with `annotation_id`, providing
+    a list of the annotations in the directory, or for a specific stimulus respectively.
+  type: string
+  format: label