-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Workflow and CBMAWorkflow classes. Support pairwise CBMA workflows #809
Conversation
@jdkent We are getting an error while building the documentation. It seems related to the nimads source file:
|
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #809 +/- ##
==========================================
+ Coverage 88.37% 88.39% +0.02%
==========================================
Files 46 47 +1
Lines 5943 6024 +81
==========================================
+ Hits 5252 5325 +73
- Misses 691 699 +8
☔ View full report in Codecov by Sentry. |
Thanks for the heads up, I should have fixed the issue now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, just have a couple comments so far.
Could you add a report that showcases the output of a pairwise estimator?
you can use this existing example: 08_plot_cbma_subtraction_conjunction.py
nimare/diagnostics.py
Outdated
@@ -204,8 +202,14 @@ def transform(self, result): | |||
|
|||
contribution_tables.append(contribution_table.reset_index()) | |||
|
|||
# Concat PositiveTail and NegativeTail tables | |||
contribution_table = pd.concat(contribution_tables, ignore_index=True, sort=False) | |||
if pairwaise_estimators or len(label_maps) == 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For MKDAChi2, the z-values can be positive or negative (you can check the z-map results from running the test in this pull request: #811).
I think we would want to display both Positive and Negative Tail cluster with Pairwise estimators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we would want to display both Positive and Negative Tail cluster with Pairwise estimators.
We are displaying Positive and Negative tail cluster tables and maps. Here, I'm only excluding the Negative tail contribution table, which will be zero for studies in id1
. I think if we want to display that table we would need to plot it in a separate figure and use the id2 for estimating the values. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also in the report, we only show the summary for id1
and coordinates1
from dataset1. Should we include another section for the second group?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little more questionable, overall, I would say yes, we do want to display id2
for the coordinates. for the contribution table, this is a bit debatable. With ALESubtraction, yes, I think it is useful to know how both groups contributed to the existing clusters.
For the MKDAChi2 I think it is less useful to see a contribution table (and with jacknife it would be computationally prohibitive if one was using the entire neurosynth dataset).
But the notion that ALESubtraction is used for more similarly sized groups and MKDAChi2 is used for comparing a relatively smaller dataset to a huge dataset is only reflective of how we are currently the algorithms, there's nothing stopping you from using ALESubtraction with neurosynth.
So I think the answer would be creating an option/flag for the user to decide whether they want to show the second group or not. could be called display_second_group
or something, with a default value of False
.
nimare/workflows/__init__.py
Outdated
__all__ = ["ale_sleuth_workflow", "cbma_workflow", "macm_workflow"] | ||
__all__ = [ | ||
"ale_sleuth_workflow", | ||
"Workflow", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove the Workflow
baseclass from __all__
, this is not a public facing class (e.g., do not want users importing and trying to use it by mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all your work on this!, I have a couple comments/questions for idea on how to restructure small sections of code.
nimare/diagnostics.py
Outdated
@@ -274,9 +295,12 @@ def _transform(self, expid, label_map, result): | |||
# with one missing a study in its inputs. | |||
estimator = copy.deepcopy(result.estimator) | |||
|
|||
dset = estimator.dataset | |||
if self._is_pairwaise_estimator: | |||
all_ids = estimator.inputs_["id1"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would all ids be id1
and id2
, the test case in the test suite is an unusual scenerio where the same dataset is used for id1
and id2
.
nimare/diagnostics.py
Outdated
temp_dset = ( | ||
estimator.dataset1.slice(other_ids) | ||
if sign == "PositiveTail" | ||
else estimator.dataset2.slice(other_ids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for "NegativeTail" is the temp result fitting dataset2 (minus one analysis) and dataset2, but should be dataset1, dataset2 (minus one analysis)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I forgot to cover these cases.
nimare/diagnostics.py
Outdated
elif "stat_desc-group1MinusGroup2" in result.maps: | ||
target_value_map = "stat_desc-group1MinusGroup2" | ||
elif "z_desc-specificity" in result.maps: | ||
target_value_map = "z_desc-specificity" | ||
else: | ||
target_value_map = "z" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could be replaced by taking the union of two sets? one set being ("est", "stat", "stat_desc-group1MinusGroup2", "z_desc-specificity") and the other being the result.maps, then if the union of the sets is empty, have the target_value_map be "z".
Or taking a step back, the purpose of this is to find the uncorrected values, correct? (if the target_image was the corrected values, so one could theoretically manipulate the target_image name to create the target_value_map name, then it would just be a parameter instead of a list of if statements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could be replaced by taking the union of two sets?
Good idea!
if the target_image was the corrected values, so one could theoretically manipulate the target_image name to create the target_value_map name, then it would just be a parameter instead of a list of if statements.
That's what I initially tried to do, but I think the logic here is to use stat maps if possible, if they are not available then use z maps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "PositiveTail" and "NegativeTail" should be defined as variables and described since they are written as strings a few too many times, so it could be hard to get the context of what the string means and more difficult to change in the future if we have to change every string instead of a variable.
Great point. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, just another minor comment, I think after that it looks good to me.
@@ -129,7 +131,7 @@ | |||
) | |||
related_corrected_results = jackknife.transform(related_corrected_results) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wasn't from this pull request, but I think this variable should have a different name, like related_diagnostic_results
, so that related_corrected_results
is not overwritten. follows the convention of going from regular results to corrected results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @JulioAPeraza
Closes None.
Changes proposed in this pull request: