OpenCV transforms with tests #34
Conversation
@alykhantejani an option would be to try to import it and fall back to PIL, as for example done in chainercv: https://github.com/chainer/chainercv/blob/master/chainercv/transforms/image/resize.py
Making this explicit, such as CVResize, CVNormalize, etc., would be so much better.
We already have two possible backends for loading images. I'd rather avoid using OpenCV for loading images (or at least we would convert them to RGB), but we could eventually add support for another backend. What do you think?
@fmassa @alykhantejani Any updates on the integration of OpenCV as a new backend? I could work on it if you think it will be integrated one day. For me, PIL is rather restrictive when working with multiband images other than uint8. What do you think?
@vfdev-5 I also want OpenCV as a backend, but I'm not sure how to add that without duplicating functional.py and transforms.py, one for PIL and one for OpenCV. Here is a snippet from my current workaround:

```python
import cv2
import numpy as np
from torchvision.transforms import RandomAffine
import torchvision.transforms.functional as F


def input_target_transform(image, target, height, width, thickness):
    # Resize both the input image and its target with OpenCV.
    image = cv2.resize(image, (width, height), interpolation=cv2.INTER_AREA)
    target = cv2.resize(target, (width, height), interpolation=cv2.INTER_AREA)
    # degrees, translations, scale range, shear range, image size
    angle, translate, scale, shear = RandomAffine.get_params(
        (-15, 15), (0.03125, 0.03125), (1. / 1.1, 1.1), (0., 0.),
        (width, height))
    center = (width * 0.5 + 0.5, height * 0.5 + 0.5)
    matrix = F._get_inverse_affine_matrix(
        center, angle, translate, scale, shear)
    M = np.array(matrix).reshape(2, 3)
    # Apply the same sampled affine transform to image and target.
    image = cv2.warpAffine(
        image, M, (width, height),
        flags=cv2.WARP_INVERSE_MAP | cv2.INTER_LINEAR)
    target = cv2.warpAffine(
        target, M, (width, height),
        flags=cv2.WARP_INVERSE_MAP | cv2.INTER_LINEAR)
    return image, target
```

Any ideas on how to integrate OpenCV as a proper backend? I'd be willing to help out.
Hi. So, one of the things I don't like that much about OpenCV compared to PIL or accimage is that OpenCV returns images in a different format than PIL (BGR instead of RGB), and for consistency we'd like all our operations to return 0-1 RGB images. We could patch that in some sense, but I'm afraid that having two backends with different data representations might lead to hard-to-spot bugs in user code if we don't do things carefully.
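The BGR-vs-RGB mismatch described above is mechanical to fix. A minimal sketch of the conversion, using a NumPy channel reversal instead of cv2.cvtColor so it carries no OpenCV dependency (the function name is hypothetical):

```python
import numpy as np


def bgr_to_rgb(img: np.ndarray) -> np.ndarray:
    """Reverse the channel axis of an HxWx3 BGR image to get RGB.

    For 3-channel images this matches cv2.cvtColor(img, cv2.COLOR_BGR2RGB),
    but is implemented as a NumPy slice so cv2 is not required.
    """
    if img.ndim != 3 or img.shape[2] != 3:
        raise ValueError("expected an HxWx3 image")
    # ::-1 on the last axis swaps B and R; .copy() returns a contiguous array
    return img[:, :, ::-1].copy()
```

In a loader this would run right after the read, e.g. `rgb = bgr_to_rgb(cv2.imread(path))`, so everything downstream only ever sees RGB.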
@fmassa a workaround for this could be wrapping both PIL images and numpy images in a data structure so that the color space can be tracked. An alternative could be to add the color space conversion from BGR to RGB at loading time when the OpenCV backend is used.
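A minimal sketch of the wrapper idea @edgarriba describes: pixel data tagged with its color space, so a backend mix-up is caught at conversion time rather than silently producing swapped channels. The class and field names are hypothetical, not torchvision API:

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class TrackedImage:
    """Pixel data plus the color space it is stored in ("RGB" or "BGR")."""
    data: np.ndarray
    color_space: str

    def to_rgb(self) -> "TrackedImage":
        """Return an RGB view of this image, converting only if needed."""
        if self.color_space == "RGB":
            return self
        if self.color_space != "BGR":
            raise ValueError(f"unknown color space: {self.color_space}")
        # BGR -> RGB is a channel reversal on the last axis
        return TrackedImage(self.data[:, :, ::-1].copy(), "RGB")
```

Transforms would then accept and return `TrackedImage`, calling `to_rgb()` at the boundary where a fixed channel order is required.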
@edgarriba yes, that's true. This would involve a few more changes though, as we don't currently provide such an abstraction, so it means we would maintain our own wrapper class. This might work, but I'm unsure if there are downsides I'm not seeing now that could pop up down the road.
@fmassa, as you said, torchvision does not provide image loading itself today. If we integrate OpenCV as a backend, then all classes transforming images would need an OpenCV counterpart. We could probably create a new file for the OpenCV implementations. What is the purpose of the existing design here?
The purpose was to keep things consistent. I just think that leaving the user the responsibility to choose between RGB and BGR might be error-prone, and I don't think that risk is worth it.
@fmassa In my case, I want to combine existing functions from torchvision with my own OpenCV-based ones. Would it be possible to make the functional interface backend-agnostic? Finally, for OpenCV, we could have a dedicated function for converting from cv2 images to Tensors.
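A sketch of what such a dedicated cv2-to-Tensor conversion could look like, mirroring what torchvision's ToTensor does for PIL images (HxWxC uint8 in, CxHxW float32 in [0, 1] out, with the BGR-to-RGB swap folded in). It is written in pure NumPy here; the final `torch.from_numpy` hand-off is only indicated in a comment, and the function name is an assumption:

```python
import numpy as np


def cv2_to_tensor_array(img: np.ndarray) -> np.ndarray:
    """Convert an OpenCV image (HxWxC uint8, BGR) to a CxHxW float32
    array with RGB channels and values scaled to [0, 1].

    In a real pipeline the result would be passed to torch.from_numpy().
    """
    if img.dtype != np.uint8:
        raise TypeError("expected a uint8 image as returned by cv2.imread")
    if img.ndim != 3 or img.shape[2] != 3:
        raise ValueError("expected an HxWx3 image")
    rgb = img[:, :, ::-1]                # BGR -> RGB
    chw = np.transpose(rgb, (2, 0, 1))   # HWC -> CHW
    return np.ascontiguousarray(chw, dtype=np.float32) / 255.0
```

Keeping the dtype and range identical to torchvision's ToTensor is what lets downstream normalization and the model zoo stay backend-agnostic.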
@hjweide that sounds reasonable, and might be the way to go. But the fact that you have a dedicated function for converting from cv2 to Tensor makes your code not agnostic with respect to the backend. For example, what if the user by mistake uses the wrong conversion for their backend? Also, if you want to use OpenCV functions for transforms, a simple way of doing it is to have a Lambda transform wrapping them. Also, one other reason why I am a bit worried about supporting OpenCV in torchvision by default is that it doesn't play well with multiprocessing in the DataLoader.
@fmassa I would go for the
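One way to avoid duplicating transforms.py and functional.py per backend, as discussed above, is a per-type dispatch registry: a single public function routes to whichever backend implementation matches the input type. This is a hypothetical sketch, not torchvision's actual mechanism; only a NumPy backend is registered here, and a PIL one would be added the same way:

```python
import numpy as np

# Hypothetical registry mapping image type -> backend implementation.
_HFLIP_IMPLS = {}


def register_hflip(img_type, fn):
    """Register a horizontal-flip implementation for a given image type."""
    _HFLIP_IMPLS[img_type] = fn


def hflip(img):
    """Backend-agnostic horizontal flip: dispatch on the image's type."""
    for img_type, fn in _HFLIP_IMPLS.items():
        if isinstance(img, img_type):
            return fn(img)
    raise TypeError(f"no hflip backend registered for {type(img)}")


# NumPy/OpenCV-style backend: images are HxWxC ndarrays.
register_hflip(np.ndarray, lambda a: a[:, ::-1].copy())
# A PIL backend would register similarly, e.g.:
# register_hflip(Image.Image, lambda im: im.transpose(Image.FLIP_LEFT_RIGHT))
```

With this shape, user code calls `hflip(img)` regardless of backend, and adding OpenCV support means registering implementations rather than duplicating the public API.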
I do not quite understand why
@fmassa I did not intend for the user code to be agnostic of the backend, only the transforms. I understand your concern over opencv/opencv#5150. @edgarriba It's possible to avoid that issue by building OpenCV with multithreading disabled. Perhaps the right thing to do is to develop the OpenCV backend as a small, separate project. It would be clear that it's not officially supported, and so the user is responsible for correctly building OpenCV and rearranging the color channels before using the model zoo. That's effectively what I'm doing at the moment.
@hjweide yeah, I'd rather not duplicate all of our transforms to support OpenCV; ideally they should have the same interface and be easily exchangeable, but as discussed, this might bring a few complications.
Apologies for re-opening this thread, but I think OpenCV as a dependency is a bad idea. TBH I would rather have the user deal with the burden of using OpenCV and making sure the data is converted to the right format. Right now, OpenCV is in a bit of a bad state in terms of CUDA support and packaging. A better idea is perhaps to have a note about OpenCV in the README?
FYI, there's a proposed project in the GSoC at OpenCV to implement a data augmentation module.
@varunagrawal there is albumentations, which provides transforms using OpenCV as the backend, similarly to torchvision's transforms.
Any update?
Closing this PR as it is over 5 years old. Since then, TorchVision has added GPU-accelerated, Tensor-based transforms, so adding another backend is not currently in our plans.
Pretty much, yes. Since v0.8, TorchVision offers two backends: PIL and Tensor. The latter offers GPU acceleration, JIT scriptability, and other goodies.
Have been training on ImageNet and CIFAR with these; faster and more transparent than PIL, but such a pain to install. A few differences with the PIL-based transforms: np.ndarray on input and output.