OpenCV transforms with tests #34

szagoruyko · 2017-01-20T04:41:12Z

Have been training on ImageNet and CIFAR with these, faster and more transparent than PIL, but such a pain to install. A few differences with PIL-based:

only np.ndarray on input-output
no torch dependency
bicubic interpolation by default everywhere

alykhantejani · 2017-09-05T13:29:54Z

@fmassa @soumith was there a decision on this, did we want cv2 as a dependency?

szagoruyko · 2017-09-05T13:32:26Z

@alykhantejani an option would be to try to import it and fill back to PIL, as for example done in chainercv https://github.com/chainer/chainercv/blob/master/chainercv/transforms/image/resize.py

soumith · 2017-09-05T14:05:41Z

making this explicit such as CVResize, CVNormalize etc. is so much better.
Importing and falling back is a bad idea because the resize and other functions are not bitwise compatible between both libraries (CV and PIL). So the same user script might converge to different accuracies depending on what is installed on the system (which is not good).

fmassa · 2017-09-05T14:18:34Z

We already have two possible backends for vision (PIL and accimage), and I'm not sure if both are bitwise compatible either (even though they are only used for loading images at the moment)

I'd rather avoid using OpenCV for loading images (or at least would convert them to BGR), but we could eventually add another backend support in vision, and make use of those backends in the transforms. With a proper refactoring (which is going on in #240), I think it can have minimal code duplication.

What do you think?

vfdev-5 · 2018-02-11T21:42:49Z

@fmassa @alykhantejani Any updates on integration of opencv as a new backend ? I could work on it if you think it will be integrated one day. For me PIL is rather restrictive when working with multiband images other than uint8. What do you think ?

hjweide · 2018-03-03T22:29:06Z

@vfdev-5 I also want OpenCV as a backend, but I'm not sure how to add that without duplicating functional.py and transforms.py, one for PIL and one for OpenCV. Here is a snippet from my current workaround:

def input_target_transform(image, target, height, width, thickness):
    image = cv2.resize(image, (width, height), interpolation=cv2.INTER_AREA)
    target = cv2.resize(target, (width, height), interpolation=cv2.INTER_AREA)
    # degrees, translations, scale range, shear range, image size
    angle, translate, scale, shear = RandomAffine.get_params(
        (-15, 15), (0.03125, 0.03125), (1. / 1.1, 1.1), (0., 0.),
        (width, height))

    center = (width * 0.5 + 0.5, height * 0.5 + 0.5)
    matrix = F._get_inverse_affine_matrix(
        center, angle, translate, scale, shear)
    M = np.array(matrix).reshape(2, 3)

    image = cv2.warpAffine(
        image, M, (width, height),
        flags=cv2.WARP_INVERSE_MAP | cv2.INTER_LINEAR)
    target = cv2.warpAffine(
        target, M, (width, height),
        flags=cv2.WARP_INVERSE_MAP | cv2.INTER_LINEAR)

Any ideas on how to integrate OpenCV as a proper backend? I'd be willing to help out.

fmassa · 2018-03-05T10:56:47Z

Hi,

So, one of the things I don't like that much about OpenCV compared to PIL or accimage is that OpenCV returns images in a different format than PIL (BGR instead of RGB), and we'd like for consistency to have all our operations return 0-1 RGB images.

We could patch in some sense ToTensor to take that into account (using the image_backend option), but given that OpenCV returns numpy arrays, I'm not sure how robust such solution would be, given that we already handle numpy arrays in ToTensor, and the lingua-franca that we use for the input/output of most of our transforms is PIL images.

I'm afraid having to backends with different data representations might lead to hard to spot bugs in the user code if we don't do things carefully.
I'm open to suggestions.

edgarriba · 2018-03-05T11:57:01Z

@fmassa a workaround for this could be wrapping either PIL images and numpy images in a data structure so that the color space can be tracked down. An alternative could also be to add the color space conversion from BGR to RGB at loading time when the opencv backend is used.

fmassa · 2018-03-05T12:29:43Z

@edgarriba yes. that's true.

This would involve a few more changes though, as we don't currently provide an imread functionality in torchvision, and we let the users use whatever they want to open the images. But that could be added.

So this means we would maintain our wrapper class Image, which would hold the data of the image and we could have our backends perform operations in there directly.

This might work, but I'm unsure if there are downsides I'm not seeing now that could pop up down the road.

vfdev-5 · 2018-03-05T13:18:16Z

@fmassa, as you said, as torchvision does not provide imread functionality, the problem with RGB or BGR can be passed on the user side. In torchvision we can just specify the working format as RGB.

Today, transforms.py contains the most of transformations that works on PIL input, some on Tensors.

If we integrate OpenCV as backend, then all classes transforming images, like Resize should work on any image input: np.ndarray or PIL.Image.Image (accimage.Image)?

We can, probably, create a new file functional_cv2.py with equivalent methods. Or maybe, we can have a single entry with functional.py and backend dependent internal modules functional_pil.py, functional_cv2.py, functional_tensor.py?

What is the purpose of _image_backend?

fmassa · 2018-03-05T13:36:51Z

The purpose of image_backend is to support accimage, which is a faster alternative to PIL but with limited functionality.

I just think that letting to the user the responsability to chose between RGB-BGR might be error-prone.
And we would definitely want to keep the images that we pass in RGB format for compatibility with our modelzoo.

I don't think that np.ndarray (output of opencv) has enough information to represent the images without ambiguity, so as @edgarriba mentioned we could extend it with a colorspace field to at least know if we are in RGB or HSV spaces, for example.

hjweide · 2018-03-06T00:42:12Z

@fmassa In my case, I want to combine existing functions from transforms.py like Resize with functions like distanceTransform from OpenCV. Thus, if the image is wrapped in an Image class, I'd have to unwrap it anyway, which is what I'm trying to avoid by using an OpenCV backend.

Would it be possible to make functional.py completely agnostic of the backend? If so, we could handle all the parameters, checks, and equations there, and then implement two versions of transforms.py, one for OpenCV and one for PIL (like @vfdev-5 suggested). There would be minimal code duplication, because parameters would be handled and the necessary transformations generated in functional.py, and then the actual transformation applied in the transforms.py corresponding to PIL or OpenCV.

Finally, for OpenCV, we could have a cv2ToTensor that converts the numpy array to RGB before converting it to a Tensor to ensure consistency.

fmassa · 2018-03-07T10:18:00Z

@hjweide that sounds reasonable, and might be the way to go.

But the fact that you have a dedicated function for converting from cv2 to Tensor makes your code not agnostic wrt the backend. For example, what if the user by mistake uses ToTensor? This will be a subtle bug, because there will be no error messages, just the color channels will be inverted.

Also, if you want to use opencv functions for transforms, a simple way of doing it is to have a LambdaTransform that converts to numpy, calls the opencv function and back.

Also, one other reason why I am a bit worried about supporting OpenCV torchvision by default is that it doesnt' play well with multiprocessing in fork mode, so only works for Python 3+, see pytorch/pytorch#1838 and opencv/opencv#5150

edgarriba · 2018-03-07T14:32:09Z

@fmassa I would go for the Image representation model. However, that's true we should find a stable workaround for the multprocessing issue. /cc @alalek @mshabunin any insights ?

vfdev-5 · 2018-03-08T10:13:59Z

I do not quite understand why Image representation model can save from RGB/BGR error. It still remains on user responsibilty to properly define an instanse of new Image specifying proper colorspace etc if data comes from "somewhere". Similar like F.to_pil_image accepts any np.ndarray (yes, we can hint a mode). Anyway, colorspace problem is relevant to color augmentations and for geometric ones I think it is OK for any colorspace.

hjweide · 2018-03-09T01:27:33Z

@fmassa I did not intend for the user code to be agnostic of the backend, only functional.py. The user would be responsible for composing [cv2Resize, ..., cv2ToTensor]if they choose to use the OpenCV backend. Thanks for the LambdaTransform suggestion.

I understand your concern over opencv/opencv#5150. @edgarriba It's possible to avoid that issue by building OpenCV with "-D BUILD_TBB=ON -D WITH_TBB=ON", but requiring users to build OpenCV from source is probably not an ideal solution.

Perhaps the right thing to do is to develop the OpenCV backend as a small, separate project. It will be clear that it's not officially supported and so the user is responsible for correctly building OpenCV and rearranging the color channels before using the modelzoo. That's effectively what I'm doing at the moment.

fmassa · 2018-03-14T20:45:40Z

@hjweide yeah, I'd rather not to duplicate the number of our transforms to support opencv, ideally they should have the same interface and be easily exchangeable, but as discussed, this might bring a few complications.

varunagrawal · 2019-03-12T19:51:00Z

Apologies for re-opening this thread, but I think OpenCV as a dependency is a bad idea. TBH I would rather have the user deal with the burden of using OpenCV and making sure it is converted to the right format as ndarray and then pass that to Transforms etc.

Right now, OpenCV is in a bit of a bad state in terms of CUDA support and pytorch and torchision are large enough projects to deal with already without the additional dependency.

A better idea is perhaps to have a note about OpenCV in the README?

edgarriba · 2019-03-12T21:07:42Z

FYI, there's a proposed project in the GSoC at OpenCV to implement a data augmentation module.
https://github.com/opencv/opencv/wiki/GSoC_2019#idea-computer-vision-data-augmentation-module

Besides, from torchgeometry there's also a plan to cover part of the current torchvision transforms by using pure pytorch operators to avoid using any image backend.

vfdev-5 · 2019-03-12T21:15:07Z

@varunagrawal there is albumentations which provides transforms using OpenCV as backend similarly to torchvision and which is intended to work and works with PyTorch.

sisrfeng · 2022-03-26T04:13:48Z

Any update?

datumbox · 2022-03-26T10:35:02Z

Closing this PR as it is over 5 years old. Since then TorchVision offers GPU accelerated Torch-based transforms, so adding another backend is not currently in our plans.

sisrfeng · 2022-03-26T13:21:52Z

Closing this PR as it is over 5 years old. Since then TorchVision offers GPU accelerated Torch-based transforms, so adding another backend is not currently in our plans.

You mean this?

datumbox · 2022-03-26T18:09:58Z

Pretty much yes. Since v0.8 TorchVision offers 2 backends. PIL and Tensor. The latter offers GPU acceleration, JIT-scriptability and other goodies.

szagoruyko added 2 commits April 1, 2017 20:47

add cvtransforms

f4b17fd

comments

ffe0afc

szagoruyko force-pushed the opencv branch from c7fa1ec to ffe0afc Compare April 15, 2017 14:40

emericit mentioned this pull request Jan 27, 2020

Illegal instruction (core dumped) with some pretrained models (but not all) #1782

Closed

facebook-github-bot added the cla signed label Oct 30, 2020

datumbox closed this Mar 26, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

OpenCV transforms with tests #34

OpenCV transforms with tests #34

szagoruyko commented Jan 20, 2017

alykhantejani commented Sep 5, 2017

szagoruyko commented Sep 5, 2017

soumith commented Sep 5, 2017

fmassa commented Sep 5, 2017 •

edited

Loading

vfdev-5 commented Feb 11, 2018

hjweide commented Mar 3, 2018

fmassa commented Mar 5, 2018

edgarriba commented Mar 5, 2018

fmassa commented Mar 5, 2018

vfdev-5 commented Mar 5, 2018 •

edited

Loading

fmassa commented Mar 5, 2018

hjweide commented Mar 6, 2018

fmassa commented Mar 7, 2018

edgarriba commented Mar 7, 2018

vfdev-5 commented Mar 8, 2018

hjweide commented Mar 9, 2018

fmassa commented Mar 14, 2018

varunagrawal commented Mar 12, 2019

edgarriba commented Mar 12, 2019

vfdev-5 commented Mar 12, 2019

sisrfeng commented Mar 26, 2022

datumbox commented Mar 26, 2022

sisrfeng commented Mar 26, 2022

datumbox commented Mar 26, 2022

OpenCV transforms with tests #34

OpenCV transforms with tests #34

Conversation

szagoruyko commented Jan 20, 2017

alykhantejani commented Sep 5, 2017

szagoruyko commented Sep 5, 2017

soumith commented Sep 5, 2017

fmassa commented Sep 5, 2017 • edited Loading

vfdev-5 commented Feb 11, 2018

hjweide commented Mar 3, 2018

fmassa commented Mar 5, 2018

edgarriba commented Mar 5, 2018

fmassa commented Mar 5, 2018

vfdev-5 commented Mar 5, 2018 • edited Loading

fmassa commented Mar 5, 2018

hjweide commented Mar 6, 2018

fmassa commented Mar 7, 2018

edgarriba commented Mar 7, 2018

vfdev-5 commented Mar 8, 2018

hjweide commented Mar 9, 2018

fmassa commented Mar 14, 2018

varunagrawal commented Mar 12, 2019

edgarriba commented Mar 12, 2019

vfdev-5 commented Mar 12, 2019

sisrfeng commented Mar 26, 2022

datumbox commented Mar 26, 2022

sisrfeng commented Mar 26, 2022

datumbox commented Mar 26, 2022

fmassa commented Sep 5, 2017 •

edited

Loading

vfdev-5 commented Mar 5, 2018 •

edited

Loading