
Metrics of denoising performance #9

Open
tsalo opened this issue Feb 15, 2019 · 5 comments

Comments

@tsalo
Member

tsalo commented Feb 15, 2019

The goal of this analysis is to determine which settings best denoise multi-echo data. We'll need good metrics of this performance, which we can probably take from other papers. Broadly, each of these metrics can be calculated for each functional run using the single-echo (~30 ms), optimally combined, and denoised data, and the distributions can be compared across strategies.

We can break down our metrics into two groups: removal of noise and preservation of signal.

Removal of noise metrics:

  1. Distance-dependent motion-related artifacts
    • I have a repository where I've attempted to implement Power's analyses. The results aren't always as clear as I'd like, but I think the code works correctly. Here is the link.
    • Interpretation: Failure to eliminate distance-dependent motion-related artifacts indicates poor removal of motion-related noise.
  2. Component classifications for components highly correlated with the following:
    • Motion parameters (should be rejected)
    • CompCor regressors (should be rejected)
    • Interpretation: Ability to detect artifactual or task-related components in the absence of external information would indicate good performance of the component classifier.
  3. Component classifications for components automatically flagged as noise using AROMA.
    • I'm thinking about the edge mask-related metric specifically.
    • The frequency-based metric is less relevant, in my opinion, since one of the benefits to multi-echo is supposed to be that we don't need band-pass filtering.
  4. Component classifications for components visually identified as artifacts.
    • Interpretation: Ability to detect artifactual or task-related components in the absence of manual intervention would indicate good performance of the component classifier.
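As a concrete sketch of metric 1, a Power-style QC-FC distance-dependence check could look like the following. The function and argument names here are hypothetical, and it assumes connectivity has already been parcellated into ROI-pair edges with per-subject mean framewise displacement on hand:

```python
import numpy as np

def qcfc_distance_dependence(fc_edges, mean_fd, roi_coords):
    """Power-style QC-FC distance-dependence (hypothetical helper).

    fc_edges   : (n_subjects, n_edges) connectivity values (upper triangle)
    mean_fd    : (n_subjects,) mean framewise displacement per subject
    roi_coords : (n_rois, 3) ROI center coordinates in mm
    """
    # QC-FC: correlate each edge's connectivity with motion across subjects
    qcfc = np.array([
        np.corrcoef(fc_edges[:, e], mean_fd)[0, 1]
        for e in range(fc_edges.shape[1])
    ])
    # Euclidean distance between ROI centers for each upper-triangle edge
    iu, ju = np.triu_indices(roi_coords.shape[0], k=1)
    dists = np.linalg.norm(roi_coords[iu] - roi_coords[ju], axis=1)
    # Distance-dependence: correlation of QC-FC with edge distance;
    # values near zero suggest motion artifacts were removed
    return float(np.corrcoef(qcfc, dists)[0, 1])
```

A denoising strategy that fails metric 1 would show a strong negative distance-dependence (short-range edges inflated by motion more than long-range ones).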

Preservation of signal metrics:

  1. Power analysis of task data
    • For well-characterized tasks, we can define a priori regions of interest and run power analyses on the model results with fMRIPower.
    • Interpretation: Improved power to detect well-known effects for denoised data compared to combined or single-echo data would indicate improved denoising without removing signal.
  2. TSNR
    • Temporal signal-to-noise ratio values can be compared on a voxel-wise basis.
    • Caveat: TSNR will increase as degrees of freedom decrease, so denoised data will necessarily have higher TSNR even if denoising is bad.
  3. Contrast-to-noise ratio maps
  4. Activation count maps
    • I just really love these maps
    • Do we care about alignment with underlying anatomy (like fMRIPrep) or total voxel count?
  5. Parameter estimates
    • Variability across subjects?
    • Value height?
    • Test statistic height?
  6. Component classifications for components highly correlated with the following:
    • Convolved task regressors (should be accepted)
    • Interpretation: Ability to detect artifactual or task-related components in the absence of external information would indicate good performance of the component classifier.
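For reference, the TSNR metric above is straightforward to compute voxelwise; a minimal NumPy sketch (illustrative only, and still subject to the degrees-of-freedom caveat noted above):

```python
import numpy as np

def tsnr(data):
    """Voxelwise temporal SNR: temporal mean divided by temporal std.

    data : (n_voxels, n_timepoints) BOLD time series
    """
    mean = data.mean(axis=-1)
    std = data.std(axis=-1)
    # Flat voxels (e.g., outside the brain) get TSNR of 0 rather than inf
    return np.divide(mean, std, out=np.zeros_like(mean), where=std > 0)
```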
@dowdlelt

Would this all be relative to just the optimally combined data, and perhaps the ~30 ms (at 3T) echo? I like all of these. We could also think about seed-based connectivity maps or ICC values.

The task analyses are a critical component, because we have to be sure that tedana isn't removing BOLD-like signals, which it has done in the 'deep' past.

@tsalo
Member Author

tsalo commented Feb 15, 2019

I think comparing to both OC and ~30ms is a great idea.

When we analyze a dataset with a relatively large number of echoes (i.e., five, realistically), we could also run the analyses with various numbers of echoes included to predict how number of echoes impacts power and other metrics of interest. That would be a lot of work, but it might be worth it.

@tsalo
Member Author

tsalo commented Apr 15, 2019

I just want to link to this comment in ME-ICA/tedana#153. The work done by @cjl2007 to improve his own component selection could be used here to evaluate tedana's performance. I believe that the evaluation of component classifications fits better with this analysis than with the reliability analysis.

@handwerkerd
Member

Additional metrics that were discussed at OHBM 2019:

  • Contrast-to-noise for runs with task data. In regions of interest pre-specified for each dataset?
  • Give more thought to what we mean by TSNR. Mean/standard deviation in whole brain? White matter? Ventricles? ROIs?
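One way to operationalize the TSNR question is to summarize a voxelwise TSNR map within each candidate mask; a minimal sketch (hypothetical helper, assuming precomputed boolean masks):

```python
import numpy as np

def summarize_tsnr(tsnr_map, masks):
    """Mean and standard deviation of a TSNR map within each tissue mask.

    tsnr_map : (n_voxels,) voxelwise TSNR values
    masks    : dict mapping a label (e.g., 'whole_brain', 'white_matter',
               'ventricles') to a boolean (n_voxels,) mask
    """
    return {
        label: (float(tsnr_map[mask].mean()), float(tsnr_map[mask].std()))
        for label, mask in masks.items()
    }
```

Reporting both mean and standard deviation per mask would let us compare central tendency and spatial variability of TSNR across denoising strategies.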

@tsalo
Member Author

tsalo commented Nov 19, 2019

Given our renewed interest in getting a paper out, I'd like to revisit this issue. I tried to summarize the metrics a bit more. There's probably a lot of overlap between some (e.g., power analysis and parameter estimates) and some are probably not useful (e.g., TSNR). Plus I don't think the list is comprehensive.
