
[WIP, ENH] Add ability to read surface .gii files #22

Merged: 21 commits merged into ME-ICA:master on May 10, 2018

Conversation

rmarkello (Member)

Working to add the ability to seamlessly accept gifti files in addition to nifti files.

This is still a long way from ready, but I want to use CircleCI testing to ensure I don't screw anything up in the process of making changes. As I modify the functions to accept surface files, I'm making some stylistic changes along the way and beefing up doc-strings as applicable.

Modified some code in utils.py to be more readable and to accept various input types (e.g., lists of data files, arrays, etc.). In the process, changed some doc-strings, converted the print statements to logging statements, and added a few comments identifying things that need to change to better integrate surface files.
In the process of adding gifti support, but breaking EVERYTHING. Ensuring that
all stages of `tedana` do not require spatial information about the input data.
Making minor aesthetic and stylistic updates as I go through the code.
rmarkello and others added 7 commits May 4, 2018 01:04
Everything is still very broken as I'm working through fixing all the functions to work with 3D arrays (samples x echoes x time). I found the place where things start to diverge from the previous versions, but it seems to be due to numerical instabilities? Unclear. Will figure out later!
More major overhauls in the process of supporting GIFTI files. The `niwrite` function has been killed in favor of `filewrite`. Lots of stylistic changes in the process of integrating this throughout `tedana.interfaces.tedana`, but mostly just significant new functionality in `tedana.utils`.
Patches:
* csstepdata saved as JSON rather than a text dump
* Removes unused imports in test.utils to fix linting errors

rmarkello (Member Author)

I realized my original comment vastly understated the amount of refactoring that I'm doing to integrate gifti support.

This is going to be a bit of a beast of a PR, so perhaps some reviews along the way might be helpful for ensuring that I'm on the right track? I think that my refactoring of utils.py is mostly done (though there will be a few minor tweaks to accommodate L/R hemisphere denotations for gifti files), if someone would like to start there!

emdupre and others added 3 commits May 7, 2018 15:35
Added incredibly basic doc-strings to most of the functions in tedana.py

emdupre (Member) left a comment

I made a few comments on where I think things could be clarified in utils, but overall that module looks good !

Unmasked `data` array
"""

out = np.zeros((mask.shape + data.shape[1:]))

Member

Is there a reason this is not out = np.zeros(mask.shape + data.shape[1:]) ?

Member Author

Nope! Good catch -- will fix in the next commit.
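
For anyone following along, a minimal sketch of what this `unmask` call is meant to do; the shapes below are made up, and this is not necessarily the exact implementation in the PR:

```python
import numpy as np

def unmask(data, mask):
    """Sketch: restore masked (M x E x T) data to the full (S x E x T) shape."""
    out = np.zeros(mask.shape + data.shape[1:], dtype=data.dtype)  # no extra parens needed
    out[mask] = data
    return out

# Hypothetical example: 10 samples, 6 of which survive a boolean mask
mask = np.zeros(10, dtype=bool)
mask[:6] = True
full = unmask(np.random.rand(6, 3, 5), mask)
assert full.shape == (10, 3, 5)
```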


Parameters
----------
data : (M x E x T) array_like

Member

This notation is slightly confusing. Are M and S of equal size ?

Member Author

Apologies for the confusion; I'll beef up the doc-string here. M refers to the number of True values in the mask that had been applied to the data.

Masked `data`, where `S` is samples, `E` is echoes, and `T` is time
"""

if mask is not None and not type(data) == type(mask):

Member

Should we confirm that they're of the same shape, not just the same type ?

rmarkello (Member Author), May 9, 2018

I'd actually removed almost all the calls to fmask() in the main tedana.py interface; since we're always working with data in (S x E x T) dimensionality, it became obsolete because masking with a boolean array of shape (S,) was as simple as data[mask]. Because mask in the main interface is generated from the data, I think we can remove this function entirely.
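
Concretely, the pattern that makes `fmask()` unnecessary (the shapes here are hypothetical):

```python
import numpy as np

S, E, T = 1000, 3, 200                  # samples x echoes x time
data = np.random.rand(S, E, T)
mask = data[:, 0, 0] > 0.5              # boolean array of shape (S,)

masked = data[mask]                     # (mask.sum() x E x T) -- no fmask() wrapper needed
assert masked.shape == (int(mask.sum()), E, T)
```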

# use nilearn for other files
data = check_niimg(data)
if mask is not None:
# TODO: check that this uses same order to flatten

Member

Same order as what ?

Member Author

This comment was a holdover from when load_data() and load_image() were explicitly reshaping the input data with order=F, which I've since removed. I had wanted to make sure that nilearn's apply_mask() function was doing the same flattening -- but since that's no longer applicable I'll remove this. And I'll try to be more explicit with my code-comments in the future, too.
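
For future readers, a tiny illustration of why the flattening order was worth checking in the first place (the arrays are made up):

```python
import numpy as np

vol = np.arange(24).reshape(2, 3, 4)      # stand-in for a small 3D image

flat_c = vol.reshape(-1, order='C')       # row-major flattening (numpy's default)
flat_f = vol.reshape(-1, order='F')       # column-major flattening (the old order='F' code)

# The two orderings visit the voxels in different sequences, so mixing a
# Fortran-order reshape with a C-order masking routine would misalign samples.
assert not np.array_equal(flat_c, flat_f)
```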

ndarray
Array of shape (nx, ny, nz[, Ne[, nt]])
fdata : (X x Y x M x T) np.ndarray
Z-concatenated multi-echo data array, where M is Z * number of echos

Member

If we drop zcat compatibility (as discussed in #26), will we drop this function as well ? For now, it's probably worth keeping, but thinking ahead....

Member Author

The uncat2echos function is actually no longer needed, I think -- even with retained zcat compatibility! A combination of load_data() and filewrite() should be enough to ensure we can handle everything. I'll delete this code to remove any confusion!

return fdata, ref_img


def makeadmask(data, minimum=True, getsum=False):

Member

Should we rename this? I'm still not sure what ad in makeadmask stands for.

Member Author

Definitely can be renamed. I believe it stood for make adaptive mask, but since that's clear to exactly no one, a more pythonic naming convention (make_adaptive_mask()) might be more appropriate? I'm happy to do whatever works best!
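
For readers unfamiliar with the term, a rough sketch of the idea behind an adaptive mask; the threshold and the handling of `minimum`/`getsum` below are illustrative placeholders, not tedana's actual criteria:

```python
import numpy as np

def make_adaptive_mask(data, minimum=True, getsum=False):
    """Illustrative only: count, per sample, how many echoes have usable signal."""
    echo_means = data.mean(axis=-1)                        # (S x E) mean signal per echo
    thresh = 0.1 * echo_means.max(axis=0, keepdims=True)   # arbitrary per-echo cutoff
    adaptive = (echo_means > thresh).sum(axis=-1)          # echoes retained per sample
    mask = (adaptive >= 1) if minimum else (adaptive == data.shape[1])
    return (mask, adaptive) if getsum else mask
```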

else:
nt = 1
return np.reshape(data, (nx, ny, nz, nt), order='F')
if isinstance(data, list):

Member

Should we raise an error for the else condition, here? I'm not sure how it could be triggered, but just in case....

Member Author

This is worth some discussion about how we envision people using the tedana interface. As it stands, the primary mechanism for doing that is through argparse in cli/run.py; if that's the case, data will always be a list. However, if the goal is to make this a bit more modular, then we should probably add in some other checks.

Right now even if data is a str pointing to a nifti file this function should still work (without an explicit else clause). But it might be good to enforce some requirements! I'm open to comments/suggestions and will be happy to build them in.
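
If we do decide to enforce requirements, even a small explicit check would do; something along these lines, where the function name and error text are placeholders:

```python
import numpy as np

def _check_input(data):
    """Sketch of an explicit input check for load_data-style functions."""
    if isinstance(data, list):
        return data                      # per-echo file list, e.g. from the CLI
    if isinstance(data, (str, np.ndarray)):
        return data                      # single filepath or in-memory array
    raise TypeError('Expected a list of filepaths, a single filepath, or an '
                    'array, got {!r}'.format(type(data)))
```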

Member

Yes, I agree there's a broader discussion to have about how to use the tedana interface-- I'll punt on that for now. It sounds like this can handle the two most likely cases, so we can revisit when it comes time to talk about how we expect people to run the code.

Member

It's unclear how necessary this is, but I'm fairly sure some of the current component selection criteria use spatial information. For example, in tedana.py, see the current code starting at line 652: "Step 2: Compute 3-D spatial FFT of Beta maps to detect high-spatial frequency artifacts".

The simple solution would be to just skip 3-D space-dependent criteria with gii files. I suspect that won't cause much harm, but it would require some close examination of results processed as volumes vs. surfaces.

My hunch is that the spatial transforms and interpolations across voxels needed to go from a volume to a surface might cause some undesired issues with echo-dependent estimations, though non-trivial transforms and interpolations in volume space could cause issues as well. I also don't have a good intuitive sense of how tedana classification mistakes would look in surface space, so some thought may be needed on how best to visualize results to help users identify issues.
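
One possible shape for the "skip spatial criteria" idea, purely as a sketch (the helper name and the extension check are hypothetical):

```python
def _is_surface(fname):
    """Illustrative check: treat GIFTI inputs as lacking 3-D voxel geometry."""
    return fname.endswith('.gii')

# Hypothetical use inside component selection:
# if _is_surface(input_fname):
#     skip the 3-D spatial FFT artifact metric and rely on non-spatial criteria
# else:
#     compute the spatial FFT of the beta maps as before
```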

Member Author

Good point, and thanks for raising it! Yes, there are going to have to be some differences between the pipelines for volumetric + surface data. Thankfully, having CircleCI running will ensure that none of the changes I'm making in this PR impact the volumetric results in any way, for now.

A lot of this PR is actually just refactoring to ensure that data can be seamlessly passed between functions without regard for the type of data being passed. This will hopefully be useful later in making the code more modular.

As for visualization of surface-based results: I completely agree. I think there will have to be some significant testing to ensure that tedana is actually suitable for surface data, given the limited spatial information. Hopefully this PR will be the first step in that process!

emdupre added a commit that referenced this pull request May 10, 2018
To aid in merging #22, where we've collapsed the xyz dimensions into an S dimension. Tracing back to a numpy core bug with large arrays: numpy/numpy#8869 (comment)

rmarkello (Member Author)

It passes! Which means this should be ready for review. 🎉

I haven't actually tested whether surface data can be run through yet, but it's in a much better place to incorporate those data. I think it might be beneficial to merge this in and then work out the surface bugs (and add some tests to ensure they work) afterwards.

Note: in the process of making these changes I ran into an issue wherein taking the mean of a large float32 numpy array gave varying results across Python instances, despite the underlying data being identical to an extremely conservative tolerance. Because of this, the PR includes a (very slight) breaking change; the CI was updated to reflect the new (hopefully more accurate) outputs.
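
A small illustration of the accumulation issue and the workaround used here (the array below is made up):

```python
import numpy as np

rng = np.random.RandomState(0)
catd = (rng.rand(5000, 3, 200) + 100).astype(np.float32)   # hypothetical (S x E x T) data

mu32 = catd.mean(axis=-1)                 # float32 accumulation
mu64 = catd.mean(axis=-1, dtype=float)    # explicit float64 accumulator, as in this PR

# The float32 result drifts slightly from the float64 one; the exact error depends on
# array size and memory layout, which is presumably why results could differ across runs.
print(np.abs(mu32 - mu64).max())
```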

emdupre (Member) left a comment

Looks fantastic ! Some minor questions, but I think this is ready.

x = np.column_stack([np.ones(echo), [-te for te in tes[:echo]]])
X = np.repeat(x, n_vols, axis=0)

beta, res, rank, sing = np.linalg.lstsq(X, B)

Member

Is there a reason these variables were added back in ? Previously it was
beta, _, _, _ = np.linalg.lstsq(X, B)

Member Author

No! And indeed, I think a more parsimonious solution than either of the above would probably be:

beta = np.linalg.lstsq(X, B)[0]

Not sure what my rationale was here, so I'll change it to reflect this.
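
For completeness, a self-contained version of the fit being discussed; the echo times and data are made up, and the S0/R2* reading assumes the usual log-linear monoexponential model rather than anything verified against this PR:

```python
import numpy as np

tes = [14.5, 38.5, 62.5]                              # illustrative echo times (ms)
n_vols, echo = 5, len(tes)

x = np.column_stack([np.ones(echo), [-te for te in tes[:echo]]])
X = np.repeat(x, n_vols, axis=0)                      # (echo * n_vols) x 2 design matrix
B = np.log(np.random.rand(X.shape[0], 10) + 1.0)      # log-signal for 10 hypothetical voxels

beta = np.linalg.lstsq(X, B, rcond=None)[0]           # keep only the coefficients
s0, r2s = np.exp(beta[0]), beta[1]                    # exp(intercept) -> S0, slope -> R2*
```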

@@ -63,145 +68,122 @@ def do_svm(X_train, y_train, X_test, svmtype=0):
elif svmtype == 2:
clf = svm.SVC(kernel='linear', probability=True)
else:
raise ValueError('Input svmtype not in range (3)')
raise ValueError('Input svmtype not in [1, 2, 3]')

Member

Aren't the possible types [0, 1, 2] ?

Member Author

Yes! Will fix asap.

infile = '__clin.nii.gz'
infile = filewrite(unmask(data, mask), '__clin.nii.gz', ref_img, gzip=True)

# FIXME: ideally no calls to os.system!!! (or AFNI, for that matter)

Member

We should open an issue for this 😄

Member Author

Done! See #32.

mu = catd.mean(axis=-1)
tes = np.reshape(tes, (n_echoes, 1))
fmin, fmid, fmax = getfbounds(n_echoes)
mu = catd.mean(axis=-1, dtype=float) # BUG: THIS IS THE BAD PLACE

Member

As much as I love / hate this code comment, do you think the explanation in the PR is enough to remove it for now ?

Member Author

Yes, this can definitely be removed since there is a trace of the rationale behind changing it. It might be worth following numpy development and reverting this change eventually, but I think that will be further down the line.

Minor changes to address review comments for PR #22

emdupre (Member) commented May 10, 2018

LGTM ! Merging !

@emdupre emdupre merged commit 99120cc into ME-ICA:master May 10, 2018