
Current way to use torchvision.prototype.transforms #7168

Closed
austinmw opened this issue Feb 2, 2023 · 2 comments

Comments

austinmw commented Feb 2, 2023

📚 The doc issue

I tried to run the end-to-end example in this recent blog post:

import PIL
from torchvision import io, utils
from torchvision.prototype import features, transforms as T
from torchvision.prototype.transforms import functional as F
# Defining and wrapping input to appropriate Tensor Subclasses
path = "COCO_val2014_000000418825.jpg"
img = features.Image(io.read_image(path), color_space=features.ColorSpace.RGB)
# img = PIL.Image.open(path)
bboxes = features.BoundingBox(
    [[2, 0, 206, 253], [396, 92, 479, 241], [328, 253, 417, 332],
     [148, 68, 256, 182], [93, 158, 170, 260], [432, 0, 438, 26],
     [422, 0, 480, 25], [419, 39, 424, 52], [448, 37, 456, 62],
     [435, 43, 437, 50], [461, 36, 469, 63], [461, 75, 469, 94],
     [469, 36, 480, 64], [440, 37, 446, 56], [398, 233, 480, 304],
     [452, 39, 463, 63], [424, 38, 429, 50]],
    format=features.BoundingBoxFormat.XYXY,
    spatial_size=F.get_spatial_size(img),
)
labels = features.Label([59, 58, 50, 64, 76, 74, 74, 74, 74, 74, 74, 74, 74, 74, 50, 74, 74])
# Defining and applying Transforms V2
trans = T.Compose(
    [
        T.ColorJitter(contrast=0.5),
        T.RandomRotation(30),
        T.CenterCrop(480),
    ]
)
img, bboxes, labels = trans(img, bboxes, labels)
# Visualizing results
viz = utils.draw_bounding_boxes(F.to_image_tensor(img), boxes=bboxes)
F.to_pil_image(viz).show()

but found that torchvision.prototype.features is now gone. What's the current way to run this? I attempted to simply pass the images, bboxes, and labels with the following types: torchvision.prototype.datasets.utils._encoded.EncodedImage, torchvision.prototype.datapoints._bounding_box.BoundingBox, and torchvision.prototype.datapoints._label.Label. However, this didn't seem to apply the transforms, as everything remained the same shape.

Edit: I've found that features seems to have been renamed to datapoints. I tried applying this, but the EncodedImage in a COCO sample['image'] seems to be 1D, and prototype.transforms requires 2D images. What's the proper way to get this as 2D so I can apply the transforms? Is there a decode method I'm missing?

Suggest a potential alternative/fix

No response

cc @vfdev-5 @bjuncek @pmeier

pmeier (Collaborator) commented Feb 2, 2023

I've found that features seems to be renamed to datapoints.

Correct. That happened in #7002.

I tried applying this, but EncodedImage in a coco sample['image'] seems to be 1D and prototype.transforms requires 2D images.

All development on torchvision.prototype.datasets is on hold and thus there might be some incompatibilities. You can find our proposal on how to use the datasets v1 with the transforms v2 in #6662. We have a PoC implementation in #6663 that I'm actively working on. Happy to get your feedback there regarding this link between the two.

What's the proper way to get this as 2D so I can apply transforms? Is there a decode method I'm missing?

Our idea was for the prototype datasets to just return the raw bytes so decoding can happen however the user likes. In #6944 we made a cut and separated datasets from transforms to focus on the latter. In that PR we also removed the decoding transforms that linked the two. Here is the relevant part from the state right before the PR was merged:

@torch.jit.unused
def decode_image_with_pil(encoded_image: torch.Tensor) -> features.Image:
    image = torch.as_tensor(np.array(PIL.Image.open(ReadOnlyTensorBuffer(encoded_image)), copy=True))
    if image.ndim == 2:
        image = image.unsqueeze(2)
    return features.Image(image.permute(2, 0, 1))


class DecodeImage(Transform):
    _transformed_types = (features.EncodedImage,)

    def _transform(self, inpt: torch.Tensor, params: Dict[str, Any]) -> features.Image:
        return F.decode_image_with_pil(inpt)  # type: ignore[no-any-return]
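
In the meantime, if you just want a decoded image today, something along these lines should work as a minimal sketch (assuming the EncodedImage carries the raw, still-encoded file bytes as a 1D uint8 tensor, and that datapoints.Image wraps a plain tensor the same way features.Image did; not tested):

from torchvision import io
from torchvision.prototype import datapoints

# read_file returns the raw, still-encoded file contents as a 1D uint8 tensor,
# which is the kind of payload an EncodedImage holds.
encoded = io.read_file("COCO_val2014_000000418825.jpg")

# decode_image turns those bytes into a CHW uint8 tensor; wrapping it in
# datapoints.Image lets the v2 transforms dispatch on it.
img = datapoints.Image(io.decode_image(encoded, mode=io.ImageReadMode.RGB))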


Substituting datapoints for features in your code and removing the color_space parameter from the datapoints.Image instantiation (this happened in #7120) should be sufficient to get the example working again; see the sketch below. As such, I'm closing this. If you have general questions or feedback, the thread in #6753 might also be of interest.
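
For illustration, the adapted example might look roughly like this (a sketch with the box list trimmed to two entries; I haven't run it):

from torchvision import io, utils
from torchvision.prototype import datapoints, transforms as T
from torchvision.prototype.transforms import functional as F

# Defining and wrapping input to appropriate Tensor subclasses
path = "COCO_val2014_000000418825.jpg"
img = datapoints.Image(io.read_image(path))  # no color_space argument anymore
bboxes = datapoints.BoundingBox(
    [[2, 0, 206, 253], [396, 92, 479, 241]],  # trimmed for brevity
    format=datapoints.BoundingBoxFormat.XYXY,
    spatial_size=F.get_spatial_size(img),
)
labels = datapoints.Label([59, 58])

# Defining and applying Transforms V2
trans = T.Compose(
    [
        T.ColorJitter(contrast=0.5),
        T.RandomRotation(30),
        T.CenterCrop(480),
    ]
)
img, bboxes, labels = trans(img, bboxes, labels)

# Visualizing results
viz = utils.draw_bounding_boxes(F.to_image_tensor(img), boxes=bboxes)
F.to_pil_image(viz).show()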

austinmw (Author) commented Feb 2, 2023

Thanks, appreciate your response!
