Merge pull request #133 from tsalo/doc-pipeline
[DOC] Improve documentation for pipeline
emdupre authored Oct 31, 2018
2 parents a7f468e + 2cc8680 commit 42b5bad
Showing 34 changed files with 1,502 additions and 192 deletions.
File renamed without changes.
76 changes: 36 additions & 40 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,10 @@
# tedana
tedana: TE Dependent ANAlysis
=============================

`TE`-`de`pendent `ana`lysis (_tedana_) is a Python module for denoising multi-echo functional magnetic resonance imaging (fMRI) data.
The ``tedana`` package is part of the ME-ICA pipeline, performing TE-dependent
analysis of multi-echo functional magnetic resonance imaging (fMRI) data.
``TE``-``de``pendent ``ana``lysis (``tedana``) is a Python module for denoising
multi-echo functional magnetic resonance imaging (fMRI) data.

[![Latest Version](https://img.shields.io/pypi/v/tedana.svg)](https://pypi.python.org/pypi/tedana/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/tedana.svg)](https://pypi.python.org/pypi/tedana/)
@@ -11,55 +15,47 @@
[![Codecov](https://codecov.io/gh/me-ica/tedana/branch/master/graph/badge.svg)](https://codecov.io/gh/me-ica/tedana)
[![Join the chat at https://gitter.im/ME-ICA/tedana](https://badges.gitter.im/ME-ICA/tedana.svg)](https://gitter.im/ME-ICA/tedana?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

![](https://user-images.githubusercontent.com/7406227/40031156-57b7cbb8-57bc-11e8-8c51-5b29f2e86a48.png)
About
-----

``tedana`` originally came about as a part of the [ME-ICA](https://github.com/me-ica/me-ica) pipeline.
The ME-ICA pipeline originally performed both pre-processing and TE-dependent
analysis of multi-echo fMRI data; however, ``tedana`` now assumes that you're
working with data which has been previously preprocessed.
If you're in need of a preprocessing pipeline, we recommend
[fmriprep](https://github.com/poldracklab/fmriprep/), which has been tested
for compatibility with multi-echo fMRI data and ``tedana``.

## About
![http://tedana.readthedocs.io/](https://user-images.githubusercontent.com/7406227/40031156-57b7cbb8-57bc-11e8-8c51-5b29f2e86a48.png)

`tedana` originally came about as a part of the [`ME-ICA`](https://github.com/me-ica/me-ica) pipeline.
The ME-ICA pipeline originally performed both pre-processing and TE-dependent analysis of multi-echo fMRI data; however, `tedana` now assumes that you're working with data which has been previously preprocessed.
If you're in need of a pre-processing pipeline, we recommend [`fmriprep`](https://github.com/poldracklab/fmriprep/) which has been tested for compatibility with multi-echo fMRI data and `tedana`.
Installation
------------

### Why Multi-Echo?
You'll need to set up a working development environment to use ``tedana``.
To set up a local environment, you will need Python >=3.6 and the following
packages installed:

Multi-echo fMRI data is obtained by acquiring multiple TEs (commonly called [echo times](http://mriquestions.com/tr-and-te.html)) for each MRI volume during data collection.
While fMRI signal contains important neural information (termed the blood oxygen-level dependent, or [BOLD signal](http://www.fil.ion.ucl.ac.uk/spm/course/slides10-zurich/Kerstin_BOLD.pdf)), it also contains "noise" (termed non-BOLD signal) caused by things like participant motion and changes in breathing.
Because the BOLD signal is known to decay at a set rate, collecting multiple echos allows us to assess whether components of the fMRI signal are BOLD- or non-BOLD.
For a comprehensive review, see [Kundu et al. (2017), _NeuroImage_](https://paperpile.com/shared/eH3PPu).
- mdp
- nilearn
- nibabel>=2.1.0
- numpy
- scikit-learn
- scipy

In `tedana`, we take the time series from all the collected TEs, combine them, and decompose the resulting data into components that can be classified as BOLD or non-BOLD. This is performed in a series of steps including:
You can then install ``tedana`` with:

* Principal components analysis
* Independent components analysis
* Component classification

More information and documentation can be found at https://tedana.readthedocs.io/.

## Installation

You'll need to set up a working development environment to use `tedana`.
To set up a local environment, you will need Python >=3.6 and the following packages will need to be installed:

mdp
nilearn
nibabel>=2.1.0
numpy
scikit-learn
scipy

You can then install `tedana` with

```
```bash
pip install tedana
```

## Getting involved
Getting involved
----------------

We :yellow_heart: new contributors !
We :yellow_heart: new contributors!
To get started, check out [our contributing guidelines](https://github.com/ME-ICA/tedana/blob/master/CONTRIBUTING.md).

Want to learn more about our plans for developing `tedana` ?
Have a question, comment, or suggestion ?
Open or comment on one of [our issues](https://github.com/ME-ICA/tedana/issues) !
Want to learn more about our plans for developing ``tedana``?
Have a question, comment, or suggestion?
Open or comment on one of [our issues](https://github.com/ME-ICA/tedana/issues)!

We ask that all contributions to `tedana` respect our [code of conduct](https://github.com/ME-ICA/tedana/blob/master/Code_of_Conduct.md).
We ask that all contributions to ``tedana`` respect our [code of conduct](https://github.com/ME-ICA/tedana/blob/master/CODE_OF_CONDUCT.md).
Binary file added docs/_static/01_echo_timeseries.png
Binary file added docs/_static/02_echo_value_distributions.png
Binary file added docs/_static/03_adaptive_mask.png
Binary file added docs/_static/04_echo_log_value_distributions.png
Binary file added docs/_static/05_loglinear_regression.png
Binary file added docs/_static/06_monoexponential_decay_model.png
Binary file added docs/_static/11_pca_component_timeseries.png
Binary file added docs/_static/12_pca_whitened_data.png
Binary file added docs/_static/13_ica_component_timeseries.png
Binary file added docs/_static/15_denoised_data_timeseries.png
Binary file added docs/_static/16_t1c_denoised_data_timeseries.png
557 changes: 557 additions & 0 deletions docs/_static/optimal_combination_workflow_plots.ipynb

Large diffs are not rendered by default.

Binary file added docs/_static/tedana-poster.png
Binary file added docs/_static/tedana-workflow.png
449 changes: 449 additions & 0 deletions docs/_static/tedana_workflow_plots.ipynb

Large diffs are not rendered by default.

230 changes: 205 additions & 25 deletions docs/approach.rst
Original file line number Diff line number Diff line change
@@ -1,32 +1,212 @@
tedana's approach
=================
Processing pipeline details
===========================

``tedana`` works by decomposing multi-echo BOLD data via PCA and ICA.
These components are then analyzed to determine whether they are TE-dependent
or -independent. TE-dependent components are classified as BOLD, while
TE-independent components are classified as non-BOLD, and are discarded as part
of data cleaning.

Derivatives
-----------

* ``medn``
'Denoised' BOLD time series after: basic preprocessing,
T2* weighted averaging of echoes (i.e. 'optimal combination'),
ICA denoising.
Use this dataset for task analysis and resting state time series correlation
analysis.
* ``tsoc``
'Raw' BOLD time series dataset after: basic preprocessing
and T2* weighted averaging of echoes (i.e. 'optimal combination').
'Standard' denoising or task analyses can be assessed on this dataset
(e.g. motion regression, physio correction, scrubbing, etc.)
for comparison to ME-ICA denoising.
* ``*mefc``
Component maps (in units of \delta S) of accepted BOLD ICA components.
Use this dataset for ME-ICR seed-based connectivity analysis.
* ``mefl``
Component maps (in units of \delta S) of ALL ICA components.
* ``ctab``
Table of component Kappa, Rho, and variance explained values, plus listing
of component classifications.
In ``tedana``, we take the time series from all the collected TEs, combine them,
and decompose the resulting data into components that can be classified as BOLD
or non-BOLD. This is performed in a series of steps, including:

* Principal components analysis
* Independent components analysis
* Component classification

.. image:: /_static/tedana-workflow.png
:align: center

Multi-echo data
```````````````

Here are the echo-specific time series for a single voxel in an example
resting-state scan with 5 echoes.

.. image:: /_static/01_echo_timeseries.png
:align: center

The values across volumes for this voxel scale with echo time in a predictable
manner.

.. image:: /_static/02_echo_value_distributions.png
:width: 400 px
:align: center

Adaptive mask generation
````````````````````````
Longer echo times are more susceptible to signal dropout, which means that
certain brain regions (e.g., orbitofrontal cortex, temporal poles) will only
have good signal for some echoes. In order to avoid using bad signal from
affected echoes in calculating :math:`T_{2}^*` and :math:`S_{0}` for a given voxel,
``tedana`` generates an adaptive mask, where the value for each voxel is the
number of echoes with "good" signal. When :math:`T_{2}^*` and :math:`S_{0}` are
calculated below, each voxel's values are only calculated from the first :math:`n`
echoes, where :math:`n` is the value for that voxel in the adaptive mask.

.. image:: /_static/03_adaptive_mask.png
:width: 600 px
:align: center
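The adaptive-mask idea can be sketched in a few lines of NumPy. Everything below is illustrative: the toy signal values and the fixed threshold are assumptions for this sketch, not ``tedana``'s actual masking criteria (which are derived from the data).

.. code-block:: python

    import numpy as np

    # Toy data: 3 voxels x 5 echoes. A voxel's adaptive-mask value is the
    # number of leading echoes with "good" signal; here "good" is
    # approximated by a fixed threshold (an assumption for illustration).
    data = np.array([
        [5000., 3500., 2400., 1700., 1200.],  # good signal at all 5 echoes
        [4000., 2100.,  900.,   90.,   40.],  # dropout after the 3rd echo
        [ 800.,   70.,   30.,   10.,    5.],  # dropout after the 1st echo
    ])
    threshold = 100.0

    good = data > threshold                           # echo has usable signal
    adaptive_mask = good.cumprod(axis=1).sum(axis=1)  # count leading good echoes
    print(adaptive_mask)  # → [5 3 1]

When :math:`T_{2}^*` and :math:`S_{0}` are fit below, only the first ``adaptive_mask[v]`` echoes would contribute for voxel ``v``.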

Monoexponential decay model fit
```````````````````````````````
The next step is to fit a monoexponential decay model to the data in order to
estimate voxel-wise :math:`T_{2}^*` and :math:`S_0`.

In order to make it easier to fit the decay model to the data, ``tedana``
transforms the data. The BOLD data are transformed as :math:`log(|S|+1)`, where
:math:`S` is the BOLD signal. The echo times are also multiplied by -1.

.. image:: /_static/04_echo_log_value_distributions.png
:width: 400 px
:align: center

A simple line can then be fit to the transformed data with linear regression.
For the sake of this introduction, we can assume that the example voxel has
good signal in all five echoes (i.e., the adaptive mask has a value of 5 at
this voxel), so the line is fit to all available data.

.. note::
``tedana`` actually performs and uses two sets of :math:`T_{2}^*`/:math:`S_0` model fits.
In one case, ``tedana`` estimates :math:`T_{2}^*` and :math:`S_0` for voxels with good signal in at
least two echoes. The resulting "limited" :math:`T_{2}^*` and :math:`S_0` maps are used throughout
most of the pipeline. In the other case, ``tedana`` estimates :math:`T_{2}^*` and :math:`S_0` for voxels
with good data in only one echo as well, but uses the first two echoes for
those voxels. The resulting "full" :math:`T_{2}^*` and :math:`S_0` maps are used to generate the
optimally combined data.

.. image:: /_static/05_loglinear_regression.png
:width: 400 px
:align: center

The values of interest for the decay model, :math:`S_0` and :math:`T_{2}^*`,
are then simple transformations of the line's intercept (:math:`B_{0}`) and
slope (:math:`B_{1}`), respectively:

.. math:: S_{0} = e^{B_{0}}

.. math:: T_{2}^{*} = \frac{1}{B_{1}}
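Under this transform, the fit reduces to ordinary linear regression. A minimal sketch, in which the echo times and the true :math:`S_0` and :math:`T_{2}^*` values are made up for illustration:

.. code-block:: python

    import numpy as np

    # Simulated single-voxel decay: S = S0 * exp(-TE / T2*), with assumed
    # S0 = 12000 and T2* = 30 ms. Echo times (ms) are illustrative.
    tes = np.array([15.4, 29.7, 44.0, 58.3, 72.6])
    signal = 12000 * np.exp(-tes / 30.0)

    # Transform as described above: log(|S| + 1) against -TE.
    y = np.log(np.abs(signal) + 1)
    x = -tes

    # Fit a line y = B1 * x + B0 with least squares.
    B1, B0 = np.polyfit(x, y, deg=1)

    s0 = np.exp(B0)   # S0 = e^{B0}
    t2s = 1.0 / B1    # T2* = 1 / B1
    # t2s recovers ~30 ms, up to the small bias introduced by the +1.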

The resulting values can be used to show the fitted monoexponential decay model
on the original data.

.. image:: /_static/06_monoexponential_decay_model.png
:width: 400 px
:align: center

We can also see where :math:`T_{2}^*` lands on this curve.

.. image:: /_static/07_monoexponential_decay_model_with_t2.png
:width: 400 px
:align: center

Optimal combination
```````````````````
Using the :math:`T_{2}^*` estimates, ``tedana`` combines signal across echoes using a
weighted average. The echoes are weighted according to the formula

.. math:: w_{TE} = TE * e^{\frac{-TE}{T_{2}^*}}

The weights are then normalized across echoes. For the example voxel, the
resulting weights are:

.. image:: /_static/08_optimal_combination_echo_weights.png
:width: 400 px
:align: center

The distribution of values for the optimally combined data lands somewhere
between the distributions for other echoes.

.. image:: /_static/09_optimal_combination_value_distributions.png
:width: 400 px
:align: center

The time series for the optimally combined data also looks like a combination
of the other echoes (which it is).

.. image:: /_static/10_optimal_combination_timeseries.png
:align: center
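The weighting scheme follows directly from the formula above. The echo times, the :math:`T_{2}^*` value, and the per-echo signal values in this sketch are all assumed for illustration:

.. code-block:: python

    import numpy as np

    tes = np.array([15.4, 29.7, 44.0, 58.3, 72.6])        # TE in ms (assumed)
    t2s = 30.0                                            # T2* for this voxel (assumed)
    data = np.array([7180., 4460., 2770., 1720., 1070.])  # signal at each echo (assumed)

    w = tes * np.exp(-tes / t2s)      # w_TE = TE * exp(-TE / T2*)
    w = w / w.sum()                   # normalize across echoes
    optcom = float((w * data).sum())  # optimally combined value

The combined value lands between the individual echoes' values, mirroring the distributions shown above.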

TEDPCA
``````
The next step is to identify and temporarily remove Gaussian (thermal) noise
with TE-dependent principal components analysis (PCA). TEDPCA applies PCA to
the optimally combined data in order to decompose it into component maps and
time series. Here we can see time series for some example components (we don't
really care about the maps):

.. image:: /_static/11_pca_component_timeseries.png

These components are subjected to component selection, the
specifics of which vary according to algorithm.

In the simplest approach, ``tedana`` uses Minka’s MLE to estimate the
dimensionality of the data, which disregards low-variance components.

A more complicated approach involves applying a decision tree to identify and
discard PCA components which, in addition to not explaining much variance,
are also not significantly TE-dependent (i.e., have low Kappa) or
TE-independent (i.e., have low Rho).

After component selection is performed, the retained components and their
associated betas are used to reconstruct the optimally combined data, resulting
in a dimensionally reduced (i.e., whitened) version of the dataset.

.. image:: /_static/12_pca_whitened_data.png
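The MLE-based variant of this step can be sketched with scikit-learn. The low-rank toy data below stand in for optimally combined BOLD data and are purely illustrative:

.. code-block:: python

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    # Toy stand-in for optimally combined data (volumes x voxels):
    # a rank-5 signal plus a little Gaussian (thermal) noise.
    lowrank = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
    data = lowrank + 0.1 * rng.standard_normal((200, 50))

    # Minka's MLE chooses how many components to retain.
    pca = PCA(n_components="mle", svd_solver="full")
    comp_ts = pca.fit_transform(data)          # component time series
    whitened = pca.inverse_transform(comp_ts)  # dimensionally reduced dataset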

TEDICA
``````
Next, ``tedana`` applies TE-dependent independent components analysis (ICA) in
order to identify and remove TE-independent (i.e., non-BOLD noise) components.
The dimensionally reduced optimally combined data are first subjected to ICA in
order to fit a mixing matrix to the whitened data.

.. image:: /_static/13_ica_component_timeseries.png

Linear regression is used to fit the component time series to each voxel in each
echo from the original, echo-specific data. This way, the thermal noise is
retained in the data, but is ignored by the TEDICA process. This results in
echo- and voxel-specific betas for each of the components.
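This regression step can be sketched as follows; the array shapes, random stand-in data, and the use of scikit-learn's ``FastICA`` are illustrative assumptions, not ``tedana``'s exact implementation:

.. code-block:: python

    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(0)
    reduced = rng.standard_normal((200, 20))        # whitened data: volumes x voxels (toy)
    echo_data = rng.standard_normal((3, 200, 500))  # echoes x volumes x voxels (toy)

    # Fit a mixing matrix (component time series) to the whitened data.
    ica = FastICA(n_components=10, random_state=0, max_iter=1000)
    mmix = ica.fit_transform(reduced)               # volumes x components

    # Least-squares fit of the component time series to each voxel in each
    # echo of the original data: echo- and voxel-specific betas.
    betas = np.stack([
        np.linalg.lstsq(mmix, echo_data[echo], rcond=None)[0]
        for echo in range(echo_data.shape[0])
    ])
    # betas.shape == (n_echoes, n_components, n_voxels)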

TE-dependence (:math:`R_2`) and TE-independence (:math:`S_0`) models can then
be fit to these betas. These models allow calculation of F-statistics for the
:math:`R_2` and :math:`S_0` models (referred to as :math:`\kappa` and
:math:`\rho`, respectively).

.. image:: /_static/14_te_dependence_models_component_0.png
:width: 400 px
:align: center

.. image:: /_static/14_te_dependence_models_component_1.png
:width: 400 px
:align: center

.. image:: /_static/14_te_dependence_models_component_2.png
:width: 400 px
:align: center

A decision tree is applied to :math:`\kappa`, :math:`\rho`, and other metrics in order to
classify ICA components as TE-dependent (BOLD signal), TE-independent
(non-BOLD noise), or neither (to be ignored). The actual decision tree is
dependent on the component selection algorithm employed. ``tedana`` includes
two options: `kundu_v2_5` (which uses hardcoded thresholds applied to each of
the metrics) and `kundu_v3_2` (which trains a classifier to select components).

.. image:: /_static/15_denoised_data_timeseries.png

Removal of spatially diffuse noise (optional)
`````````````````````````````````````````````
Due to the constraints of ICA, ME-ICA is able to identify and remove spatially
localized noise components, but it cannot identify components that are spread
throughout the whole brain. See `Power et al. (2018)`_ for more information
about this issue.
One of several post-processing strategies may be applied to the ME-DN or ME-HK
datasets in order to remove spatially diffuse (ostensibly respiration-related)
noise. Methods which have been employed in the past include global signal
regression (GSR), T1c-GSR, anatomical CompCor, Go Decomposition (GODEC), and
robust PCA.

.. image:: /_static/16_t1c_denoised_data_timeseries.png

.. _Power et al. (2018): http://www.pnas.org/content/early/2018/02/07/1720985115.short
6 changes: 3 additions & 3 deletions docs/contributing.rst
Original file line number Diff line number Diff line change
@@ -7,7 +7,7 @@ For a more general guide to the tedana development, please see our
`contributing guide`_. Please also follow our `code of conduct`_.

.. _contributing guide: https://github.com/ME-ICA/tedana/blob/master/CONTRIBUTING.md
.. _code of conduct: https://github.com/ME-ICA/tedana/blob/master/Code_of_Conduct.md
.. _code of conduct: https://github.com/ME-ICA/tedana/blob/master/CODE_OF_CONDUCT.md


Style Guide
@@ -44,7 +44,7 @@ This tells the development team that your pull request is a "work-in-progress",
and that you plan to continue working on it.

Release Checklist
`````````````````
-----------------

This is the checklist of items that must be completed when cutting a new release of tedana.
These steps can only be completed by a project maintainer, but they are a good resource for
@@ -55,7 +55,7 @@ releasing your own Python projects!
`Release-drafter`_ should have already drafted release notes listing all
changes since the last release; check to make sure these are correct.
#. Pulling from the ``master`` branch, locally build a new copy of tedana and
`upload it to PyPi`_.
`upload it to PyPi`_.

We have set up tedana so that releases automatically mint a new DOI with Zenodo;
a guide for doing this integration is available `here`_.