
Add MultiImageFolder dataset #1345

Closed
wants to merge 10 commits

Conversation

SebastienEske

This new kind of dataset lets the user train neural networks that take an arbitrary number of input/output images, e.g. for super-resolution, image transformation, segmentation, depth estimation, etc.

Signed-off-by: Sebastien ESKENAZI [email protected]

@SebastienEske
Author

SebastienEske commented Sep 18, 2019

Related to these pull requests #1334 #1330 and this issue #424

@SebastienEske
Author

If needed, I can add an example usage in the docs:

import os

import torchvision
from torchvision import transforms

# One transform per folder: normalize the low-res inputs, pass the high-res targets through.
image_transforms = [
    transforms.Compose([transforms.ToTensor(),
                        transforms.Normalize([0.485, 0.456, 0.406], [1, 1, 1])]),
    transforms.ToTensor(),
]

rootdir = '/data'
data_dirs = [
    os.path.join(rootdir, 'lowres'),
    os.path.join(rootdir, 'highres'),
]
dataset = torchvision.datasets.MultiImageFolder(data_dirs, image_transforms)
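
For completeness, a minimal usage sketch (assuming each sample collates into one tensor per folder, here low-res inputs and high-res targets):

from torch.utils.data import DataLoader

loader = DataLoader(dataset, batch_size=16, shuffle=True, num_workers=4)
for lowres_batch, highres_batch in loader:
    pass  # feed lowres_batch to the network, compare its output against highres_batch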

@Noiredd

Noiredd commented Sep 18, 2019

  1. How is this approach different from just using multiple instances of ImageFolder, each with its own transform? Couldn't I accomplish the same thing by constructing separate ImageFolders and passing them to an object that does something like the following?
from torch.utils.data import Dataset

class BundleDataset(Dataset):
    """Zips several datasets so that index i returns the i-th sample of each."""
    def __init__(self, *datasets):
        self.datasets = datasets
        # Here go sanity checks, e.g. asserting that all datasets have equal length.
    def __getitem__(self, index):
        return [dataset[index] for dataset in self.datasets]
    def __len__(self):
        return len(self.datasets[0])
  2. This proposal does not seem to allow using random transforms such as RandomResizedCrop. Since you have two entirely separate transforms, images would be cropped differently from labels (each random transform generates its own random parameters). Ensuring spatial correspondence between images and labels is crucial for tasks like segmentation. Do you plan to work around this?
    Providing a coupling between random transforms is a key feature of MultiCompose and SegmentationCompose [proof-of-concept] #1315.
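
For illustration, here is how the BundleDataset sketch above might be used. The paths are hypothetical, and ImageFolder assumes each root contains class subfolders and lists its files in the same order across roots:

from torchvision.datasets import ImageFolder
from torchvision import transforms

inputs = ImageFolder('/data/lowres', transform=transforms.ToTensor())
targets = ImageFolder('/data/highres', transform=transforms.ToTensor())
bundle = BundleDataset(inputs, targets)
(x, _), (y, _) = bundle[0]  # ImageFolder also returns a class index per sample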

@SebastienEske
Author

SebastienEske commented Sep 18, 2019

  1. For the bundle dataset, you would need to check the length of each dataset to ensure their consistency. It has the advantage of being more flexible (any dataset works), but then you need to construct each dataset separately.

Another big difference is the folder structure: there are no classes here.

  2. Ah!! Indeed, I did not think about random crops. I usually precompute the crops to save computation time during training, so there is no cropping in my training datasets.

Another way to go about this would be to put it in the training code:

import torch

# Stack images and labels along the channel dimension so that a single
# random crop is applied to both, then split them again afterwards.
inputs = torch.cat((images, labels), 1)
inputs = some_random_crop_function(inputs)  # any random crop of your choice
images = inputs[:, :-1, ...]
labels = inputs[:, -1, ...]

And to answer @Noiredd: I don't plan to handle transform synchronization in the dataset, as it is a transform problem rather than a dataset problem. Sorry about linking your PR here; I read it too fast.

@Noiredd

Noiredd commented Sep 18, 2019

  1. I wouldn't say that "any dataset works", because this uses the same loader for all datasets. Someone might want to load their images and labels differently, and in that case the bundle would be more flexible. But yes, for simple tasks your approach might be a little more straightforward.
  2. Even if you could get around crops (by the way, torch.cat won't work because labels are torch.long while images are usually torch.float), they are just one example from a whole array of random transforms. I've proposed a fairly general solution for that in MultiCompose and SegmentationCompose [proof-of-concept] #1315; a sketch of the underlying idea follows below.
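
For reference, a minimal sketch of that idea using torchvision's existing public API (this is not the MultiCompose implementation from #1315): sample the random parameters once, then apply the identical transform to both images.

import torchvision.transforms as T
import torchvision.transforms.functional as F

def paired_random_crop(img, label, size):
    # Sample crop parameters once, then apply the same crop to image and label.
    i, j, h, w = T.RandomCrop.get_params(img, output_size=size)
    return F.crop(img, i, j, h, w), F.crop(label, i, j, h, w)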

I'm not saying you should close this because (1) I'm not a team member but just another user, and (2) you've obviously put some work into this :) While personally I don't think this is the right way to go, maybe some ideas could be salvaged and added to some other PR to make it better. Ultimately, I think we should wait to hear the opinion from someone of the PyTorch team.
All in all, maybe none of #1330, #1334 and this will be merged, but they are all quite clear signals that there is indeed a need for a built-in solution for segmentation (or: multi-image) data.

@codecov-io

codecov-io commented Sep 18, 2019

Codecov Report

Merging #1345 into master will decrease coverage by 0.1%.
The diff coverage is 55.17%.


@@            Coverage Diff             @@
##           master    #1345      +/-   ##
==========================================
- Coverage   65.59%   65.48%   -0.11%     
==========================================
  Files          75       75              
  Lines        5819     5876      +57     
  Branches      892      913      +21     
==========================================
+ Hits         3817     3848      +31     
- Misses       1735     1755      +20     
- Partials      267      273       +6
Impacted Files                     Coverage Δ
torchvision/datasets/__init__.py   100% <100%> (ø) ⬆️
torchvision/datasets/folder.py     70.37% <54.38%> (-11.69%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@SebastienEske
Author

OK, I fixed what I could, but I don't understand why the Windows build is failing. I found this thread gipit/gippy#123 (comment), though I'm not sure how relevant it is. Any idea how to fix this last failing build?

@fmassa
Member

fmassa commented Sep 19, 2019

Hi @SebastienEske

Thanks a lot for the PR!

That said, I agree with the points made by @Noiredd (thanks for the review, by the way!).

I think that, in its current state, the functionality implemented here can be handled by a combination of ImageFolder datasets. Plus, we would need to check that there is a 1 -> 1 mapping between the images in the different folders, to make sure we are not returning wrong pairs of images (the only check currently done is assuming that corresponding images have the same name, which might not be enough); a sketch of such a check follows below.
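
For illustration, one way such a check could look. This is a sketch, not code from this PR; it assumes correspondence is defined by identical file names across folders:

import os

def check_one_to_one(dirs):
    # Assert that every folder contains exactly the same set of file names.
    name_sets = [set(os.listdir(d)) for d in dirs]
    for d, names in zip(dirs[1:], name_sets[1:]):
        unmatched = name_sets[0] ^ names  # symmetric difference
        if unmatched:
            raise ValueError('Unmatched files in %s: %s' % (d, sorted(unmatched)[:5]))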

> All in all, maybe none of #1330, #1334 and this will be merged, but they are all quite clear signals that there is indeed a need for a built-in solution for segmentation (or: multi-image) data.

Right to the point. Semantic segmentation and object detection are very important areas, and we need better support for them in torchvision. We will spend some more time thinking through the possible solutions, and I'll open an issue to discuss an approach.

Given the points above, I believe we will not be going forward with this PR, but the discussion here has been very valuable!

PS: the Windows test failures have been fixed in master; rebasing your PR on top of master should fix them.

@fmassa fmassa closed this Sep 19, 2019