
2D FCMAE #71

Merged · ziw-liu merged 80 commits into main from 2d-fcmae on Jun 11, 2024
Conversation

ziw-liu (Collaborator) commented Mar 4, 2024

Add option to not perform convolution in the head of the FCMAE models.
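
A minimal sketch of what such an option could look like, assuming a hypothetical head_conv flag; the class and argument names below are illustrative, not necessarily the actual viscy API:

import torch.nn as nn


class FCMAEHead(nn.Module):
    """Illustrative FCMAE output head with an optional convolution."""

    def __init__(self, in_channels: int, out_channels: int, head_conv: bool = False) -> None:
        super().__init__()
        if head_conv:
            # learnable 1x1 convolution as the final projection
            self.proj = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        else:
            # no convolution in the head: pass features through unchanged
            self.proj = nn.Identity()

    def forward(self, x):
        return self.proj(x)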

Fixes for multi-dataset trainer.
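
For context, one common way to drive several datasets through a single Lightning trainer is CombinedLoader; a minimal sketch assuming Lightning 2.x, with toy datasets standing in for the project's real data modules:

import torch
from torch.utils.data import DataLoader, TensorDataset
from lightning.pytorch.utilities import CombinedLoader

# Toy stand-ins for the real datasets (HCS, CTMC, LiveCell) used in this project.
vs_ds = TensorDataset(torch.randn(64, 1, 32, 32))
ctmc_ds = TensorDataset(torch.randn(48, 1, 32, 32))

loaders = {
    "virtual_staining": DataLoader(vs_ds, batch_size=16),
    "ctmc": DataLoader(ctmc_ds, batch_size=16),
}

# "max_size_cycle" cycles shorter loaders until the longest one is exhausted;
# other modes (e.g. "min_size", "sequential") change how batches are combined.
# A LightningDataModule could return this from val_dataloader().
combined = CombinedLoader(loaders, mode="max_size_cycle")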

Fix default when computing metrics on images with very high instance counts.
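
Assuming this refers to torchmetrics' MeanAveragePrecision (the commit history below mentions "fix mAP default for over 100 detections" and a torchmetrics bump), the default max_detection_thresholds of [1, 10, 100] caps scored detections at 100 per image; a minimal sketch of raising that cap, with illustrative values:

from torchmetrics.detection import MeanAveragePrecision

# Raise the last detection threshold so crowded images with hundreds or
# thousands of instances are not truncated at 100 detections.
map_metric = MeanAveragePrecision(max_detection_thresholds=[1, 10, 10000])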

Base automatically changed from fcmae to main April 8, 2024 16:22
@ziw-liu mentioned this pull request Jun 3, 2024
@ziw-liu marked this pull request as ready for review June 5, 2024 12:45
self._log_samples("val_samples", self.validation_step_outputs)
self.validation_step_outputs = []
# average within each dataloader
Contributor
I know this is the end of validation, meaning there is no backprop, so maybe there is no need to detach the tensor before doing any logging. Is that the common practice? I know detaching is more relevant and important in train_step and validation_step. If we don't detach here, does it affect anything? Just curious.

ziw-liu (Collaborator, author)
To log the loss value (through the Lightning logger), the detaching is automatic. We only need to take care of it when logging manually (e.g. images).
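
For reference, a minimal sketch of the distinction being made here; the loss function and batch keys are illustrative, while validation_step_outputs mirrors the attribute shown in the snippet above:

def validation_step(self, batch, batch_idx):
    pred = self(batch["source"])
    loss = self.loss_function(pred, batch["target"])
    # Scalar logging through Lightning detaches (and reduces) automatically.
    self.log("loss/validate", loss, sync_dist=True)
    # Tensors kept around manually, e.g. images logged later in
    # on_validation_epoch_end, should be detached and moved off the GPU explicitly.
    self.validation_step_outputs.append(pred.detach().cpu())
    return loss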

@edyoshikun (Contributor) left a comment

I tested the following without issues:

  • 2D predictions using A549 2D data
  • 3D predictions using mantis A549 data

I don't expect it to break anything related to the dataloaders, since I was using hcs.py. I like the prepare_data_per_node flag for caching; I will just need to update my scripts to add this flag from now on.
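
For anyone mirroring this in their own configs: prepare_data_per_node is a standard LightningDataModule attribute; a minimal sketch of setting it, with an illustrative class name:

from lightning.pytorch import LightningDataModule

class CachedDataModule(LightningDataModule):
    def __init__(self, prepare_data_per_node: bool = True) -> None:
        super().__init__()
        # True: prepare_data() runs once per node so each node builds its own
        # local cache; False: it runs only on the global rank-0 process.
        self.prepare_data_per_node = prepare_data_per_node

    def prepare_data(self) -> None:
        ...  # download / cache datasets here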

Thank you so much for this Ziwen!

@ziw-liu merged commit a707e1b into main Jun 11, 2024
3 checks passed
@ziw-liu deleted the 2d-fcmae branch June 11, 2024 19:04
edyoshikun added a commit that referenced this pull request Jun 12, 2024
* refactor data loading into its own module

* update type annotations

* move the logging module out

* move old logging into utils

* rename tests to match module name

* bump torch

* draft fcmae encoder

* add stem to the encoder

* wip: masked stem layernorm

* wip: patchify masked features for linear

* use mlp from timm

* hack: POC training script for FCMAE

* fix mask for fitting

* remove training script

* default architecture

* fine-tuning options

* fix cli for finetuning

* draft combined data module

* fix import

* manual validation loss reduction

* update linting
new black version has different rules

* update development guide

* update type hints

* bump iohub

* draft ctmc v1 dataset

* update tests

* move test_data

* remove path conversion

* configurable normalizations (#68)

* initial commit adding the normalization.

* adding dataset_statistics to each fov to facilitate the configurable augmentations

* fix indentation

* ruff

* test preprocessing

* remove redundant field

* cleanup

---------

Co-authored-by: Ziwen Liu <[email protected]>

* fix ctmc dataloading

* add example ctmc v1 loading script

* changing the normalization and augmentations default from None to empty list.

* invert intensity transform

* concatenated data module

* subsample videos

* livecell dataset

* all sample fields are optional

* fix multi-dataloader validation

* lint

* fixing preprocessing for varying array shapes (i.e aics dataset)

* update loading scripts

* fix CombineMode

* always use untrainable head for FCMAE

* move log values to GPU before syncing
Lightning-AI/pytorch-lightning#18803

* custom head

* ddp caching fixes

* fix caching when using combined loader

* compose normalizations for predict and test stages

* black

* fix normalization in example config

* fix normalization in example config

* prefetch more in validation

* fix collate when multi-sample transform is not used

* ddp caching fixes

* fix caching when using combined loader

* typing fixes

* fix test dataset

* fix invert transform

* add ddp prepare flag for combined data module

* remove redundant operations

* filter empty detections

* pass trainer to underlying data modules in concatenated

* hack: add test dataloader for LiveCell dataset

* test datasets for livecell and ctmc

* fix merge error

* fix merge error

* fix mAP default for over 100 detections

* bump torchmetric

* fix combined loader training for virtual staining task

* fix non-combined data loader training

* add fcmae to graph script

* fix type hint

* format

* add back convolution option for fcmae head

---------

Co-authored-by: Eduardo Hirata-Miyasaki <[email protected]>
edyoshikun added a commit that referenced this pull request Jun 12, 2024
edyoshikun added a commit that referenced this pull request Jun 12, 2024
edyoshikun added a commit that referenced this pull request Jun 18, 2024