Update Example1.Rmd #5

Merged · 2 commits · Apr 21, 2022

16 changes: 9 additions & 7 deletions docs/dataset-qc/Example1.Rmd
@@ -17,22 +17,24 @@ Repetitive, boring, and/or error-prone tasks should be scripted if possible; don

Scripting can reduce how many errors occur and speed up how quickly they're found (and corrected). For example, if a data file has a typo in its name, a script will fail to find it, while a person may click the file without noticing the typo. Scripts can also make it faster and easier to rerun an analysis or conversion if needed, such as to change a parameter.

-When possible, have scripts read and write files directly from the intended permanent storage locations. For example, we store some files in [box](box.com). These files can be transferred by clicking options in the box web GUI, but this is slow it is easy to accidentally select the wrong file. Instead, we use [boxr](https://github.com/r-box/boxr) functions to retrieve and upload files from box. Box integration examples are not included here, but see [this DMCC example](https://github.com/ccplabwustl/dualmechanisms/blob/master/preparationsAndConversions/eprime/TEMPLATE_convertEprime.R).
+When possible, have scripts read and write files directly from the intended permanent storage locations. For example, we store some files in [box](box.com). These files can be transferred by clicking options in the box web GUI, but this is slow and it is easy to accidentally select the wrong file. Instead, we use [boxr](https://github.com/r-box/boxr) functions to retrieve and upload files from box. Box integration is not included here, but see [this DMCC example](https://github.com/ccplabwustl/dualmechanisms/blob/master/preparationsAndConversions/eprime/TEMPLATE_convertEprime.R).
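
A minimal sketch of that boxr workflow (not part of this PR; the numeric IDs are placeholders for real box file and folder IDs):

```{r}
# authenticate once, then move files by their box IDs rather than clicking in the web GUI
library(boxr);

box_auth();                                    # opens a browser the first time; caches a token afterwards
in.path <- box_dl(1234567890);                 # download a file (by box file ID) to the working directory
# ... process the file, writing output.csv locally ...
box_ul(dir_id=987654321, file="output.csv");   # upload the csv into a box folder (by folder ID)
```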

## Tutorial: converting eprime files to csv.
-### Background
-- [Eprime](https://pstnet.com/products/e-prime/) saves its files in a proprietary format and non-human-readable plain text. We convert these to csv as quickly as possible after data collection. (Something I suggest doing for all non-standard file formats, not just eprime; store data in formats like nifti and text whenever possible for long-term accessibility.)
-- This task is a prime target for scripting: the conversion must be done often and exactly, and accuracy can be tested algorithmically (e.g., by counting trial types).
-- The tutorial eprime files are from a heartbeat-counting task like Pollatos, Traut-Mattausch, and Schandry ([2009](https://doi.org/10.1002/da.20504)). The task starts with a five-minute baseline period, followed by three trials during which the participant is asked to count their heart beats. After each trial participants verbally report how many beats they counted and their confidence in the count. The same trial order is used for all participants: the first is 25 seconds, second 35 seconds, and third 45 seconds.
-- For **Dataset QC** we will verify that the three trials, initial baseline, and final rest periods are present, and in the expected order and durations.
+[Eprime](https://pstnet.com/products/e-prime/) saves its files in a proprietary format and as non-human-readable plain text. We convert these to csv as quickly as possible after data collection. (Something I suggest doing for all non-standard file formats, not just eprime; store data in formats like nifti and text whenever possible for long-term accessibility.) This task is a prime target for scripting: the conversion must be done often and exactly, and accuracy can be tested algorithmically (e.g., by counting trial types).
+
+The tutorial eprime files are from a heartbeat-counting task like Pollatos, Traut-Mattausch, and Schandry ([2009](https://doi.org/10.1002/da.20504)). The task starts with a five-minute baseline period, followed by three trials during which the participant is asked to count their heartbeats. After each trial participants verbally report how many beats they counted and their confidence in the count. The same trial order is used for all participants: the first trial is 25 seconds, the second 35 seconds, and the third 45 seconds.
+
+In this tutorial we will convert the eprime text recovery files to csv. The code also checks that the three trials, initial baseline, and final rest periods are present, and in the expected order and durations.
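
A sketch of what such a check can look like once the trials are parsed into a data frame (the d.trials data frame and its trial.type and duration.ms columns are hypothetical, not from this PR's code):

```{r}
# stop immediately if the trials are missing, out of order, or the wrong length
expected.order <- c("baseline", "count", "count", "count", "rest");
stopifnot(all(d.trials$trial.type == expected.order));    # presence and order
stopifnot(all(d.trials$duration.ms[d.trials$trial.type == "count"] == c(25000, 35000, 45000)));   # durations
```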

```{r}

# setwd("d:/maile/svnFiles/plein/conferences/ISMRM2022/onlineExample1"); # for Jo's local testing

test <- readLines("interoception_demoSub1.txt", warn=FALSE);
test <- readLines("example1files/interoception_demoSub1.txt", warn=FALSE);
print(length(test)); # should be 315

write.table("testing", "test.txt");   # write a tiny file to confirm the working directory is writable

```
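
The printed length is checked by eye above; a sketch of making it an explicit assertion instead (same file, not part of this PR):

```{r}
if (length(test) != 315) { stop("unexpected line count: ", length(test)); }
```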

This script uses the [eMergeR](https://github.com/AWKruijt/eMergeR) R library's functions for parsing information out of the eprime text recovery file. I generally suggest starting each script by loading any needed libraries, clearing R's memory, setting options, and defining needed variables. This first code block loads eMergeR, clears R's workspace, and sets the input and output paths.
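
A sketch of that kind of setup block (the paths are placeholders; only library(eMergeR) is specific to this tutorial):

```{r}
library(eMergeR);    # parsing functions for eprime text recovery files

rm(list=ls());       # clear R's workspace

in.path <- "example1files/";    # where the eprime .txt files are
out.path <- "example1files/";   # where the csv files will be written
```
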
18 changes: 8 additions & 10 deletions docs/dataset-qc/Example2.Rmd
@@ -12,15 +12,12 @@ jupyter:
name: ir
---

-# Example 2: Highlight important and diagnostic features as efficiently as possible
-## QC reports won't be used if they are too long, ugly, or annoying.
-- Aim for short, aesthetically pleasing files that emphasize easy-to-check diagnostic features (e.g., that the volumes "look like brains" and the surfaces have "tiger stripes").
-- It often works well to arrange images so can visually survey ("which one of these things is not like the others") and judge typical variability.
-- Collect training examples of what the diagnostic features should (or not) look like.
-- It is often more effective to investigate oddities with separate, more detailed files and reports when needed, rather than trying to fit all possibly-useful images and statistics into one document.
-
-# Tutorial: fMRI volume (nifti) image plotting
-## Background
+# Dataset QC Example 2
+Highlight important and diagnostic features as efficiently as possible. QC reports won't be used if they are too long, ugly, or annoying. Aim for short, aesthetically pleasing files that emphasize easy-to-check diagnostic features (e.g., that the volumes "look like brains" and the surfaces have "tiger stripes"). It is often more effective to investigate oddities with separate, more detailed files and reports when needed, rather than trying to fit all possibly-useful images and statistics into one document.
+
+It often works well to arrange images so you can visually survey them ("which one of these things is not like the others") and judge typical variability. Collect examples of what the diagnostic features should (or should not) look like.
+
+## Tutorial: fMRI volume (nifti) image plotting
Efficiently displaying QC images depends upon being able to easily access and plot the source files. The tutorial covers basic image reading and plotting, the foundation upon which the more complex files (see links at the end of this page) are built.
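
The line that reads the image is elided from the diff below; a sketch of that step with the RNifti package (the file name is a placeholder, and other nifti readers such as oro.nifti work too):

```{r}
library(RNifti);
img <- readNifti("example2files/anatomical.nii.gz");   # returns a 3d array of voxel values
```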


@@ -38,7 +35,8 @@ dim(img); # [1] 81 96 81
max(img); # [1] 1374.128
img[30,20,50]; # value of this voxel

-layout(matrix(1:3, c(1,3))); # hopefully have three images in one row
+options(repr.plot.width = 8, repr.plot.height = 3); # specify size in jupyter
+layout(matrix(1:3, c(1,3))); # have three images in one row
image(img[15,,], col=gray(0:64/64), xlab="", ylab="", axes=FALSE, useRaster=TRUE); # plot slice i=15
image(img[,20,], col=gray(0:64/64), xlab="", ylab="", axes=FALSE, useRaster=TRUE); # plot slice j=20
image(img[,,50], col=gray(0:64/64), xlab="", ylab="", axes=FALSE, useRaster=TRUE); # plot slice k=50
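
Building on the same calls, a sketch of the kind of multi-slice survey plot described above (not from this PR; the slice spacing and margins are arbitrary):

```{r}
# plot every 10th axial slice in one row for a quick "which one is not like the others" survey
slices <- seq(10, 70, by=10);
options(repr.plot.width=2*length(slices), repr.plot.height=2.5);   # specify size in jupyter
layout(matrix(1:length(slices), nrow=1));
par(mar=rep(0.2, 4));    # shrink margins so the slices sit close together
for (k in slices) {
  image(img[,,k], col=gray(0:64/64), xlab="", ylab="", axes=FALSE, useRaster=TRUE);
}
```
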
4 changes: 2 additions & 2 deletions docs/dataset-qc/Example3.Rmd
@@ -12,9 +12,9 @@ jupyter:
name: ir
---

-# Example 3: Establish and continually perform control analyses
-## Control analyses
+# Dataset QC Example 3
+Establish and continually perform control analyses.

Control analyses are for dataset QC; they must be separate from the experimental questions and target analyses. **Positive control analyses** check for the existence of effects that **must** be present if the dataset is valid. If the effects are not detected, we know that something is wrong in the dataset or analysis, and work should not proceed until the issues are resolved.
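
A sketch of the flavor of check this implies (not from this PR; the events data frame and its resp.onset and run columns are hypothetical):

```{r}
# positive control: every task run must contain at least some recorded responses
for (rid in unique(events$run)) {
  n.resp <- sum(!is.na(events$resp.onset[events$run == rid]));
  if (n.resp == 0) { stop("no responses recorded in run ", rid); }
}
```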

One of my favorite positive control analyses is **button pressing**:
3 changes: 2 additions & 1 deletion docs/dataset-qc/Example4.Rmd
@@ -12,7 +12,8 @@ jupyter:
name: ir
---

-# Example 4: Make use of automated dynamic reports
+# Dataset QC Example 4
+Make use of automated dynamic reports.
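
A sketch of the general idea (not this repo's code): with rmarkdown's parameterized reports, one template can be knit once per participant. The template name QCreport.Rmd and its sub.id parameter are hypothetical:

```{r}
library(rmarkdown);

# knit one QC report per participant from a single parameterized template
for (sid in c("demoSub1", "demoSub2")) {
  render("QCreport.Rmd", params=list(sub.id=sid),
         output_file=paste0("QCreport_", sid, ".html"));
}
```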

```{r}
