[WIP] Add Mosaic augmentation #1147
Any updates on this? |
I tried this out together with some rotation augmentations and it seems to work, @i-aki-y. However, from time to time this error arises:
when using COCO. Any idea how to fix this, @i-aki-y? Moreover, the structure of the repo must have changed since the PR was created, as some refactoring was needed. |
Couldn't this be solved by using |
Setting the |
@mikel-brostrom Sorry for the delay, and thank you for your feedback!
This means that some bboxes have zero or negative heights.
OK. I will check and update the PR.
I missed this feature! I will look into whether I can use it for this transform. |
I have it working locally, so I could create a pull request with what I have, @i-aki-y. I also have MixUp working as you suggested:
Maybe this should go into a separate PR? |
Yes. I had multiple augmentations in my augmentation stack, but it always appeared during Mosaic. The case was that when the error emerged |
Thanks, I will investigate it.
Great!
Yes. |
Btw, I have implemented this in a slightly different manner.

    import random
    from typing import Any, Dict, List, Tuple

    from albumentations.core.transforms_interface import DualTransform

    # mosaic4 and bbox_mosaic4 are assumed to be the functional helpers from this PR.


    class Mosaic(DualTransform):
        def __init__(
            self,
            height,
            width,
            replace=True,
            fill_value=0,
            bboxes_format="coco",
            always_apply=False,
            p=0.5,
        ):
            super().__init__(always_apply=always_apply, p=p)
            self.height = height
            self.width = width
            self.replace = replace
            self.fill_value = fill_value
            self.bboxes_format = bboxes_format
            # Filled in get_params_dependent_on_targets on every call.
            self.images = []
            self.bboxes = []

        def get_transform_init_args_names(self) -> Tuple[str, ...]:
            return ("height", "width", "replace", "fill_value", "bboxes_format")

        def apply(self, image, **params):
            # Build the mosaic from the four (shuffled) source images.
            return mosaic4(self.images, self.height, self.width, self.fill_value)

        def apply_to_keypoint(self, **params):
            pass  # TODO

        def apply_to_bbox(self, bbox, image_shape, position, height, width, **params):
            rows, cols = image_shape[:2]
            return bbox_mosaic4(bbox, rows, cols, position, height, width)

        def apply_to_bboxes(self, bboxes, **params):
            # Transform the boxes of each source image according to its mosaic position.
            new_bboxes = []
            for i, (bbox, im) in enumerate(zip(self.bboxes, self.images)):
                im_shape = im.shape
                for b in bbox:
                    new_bbox = self.apply_to_bbox(b, im_shape, i, self.height, self.width)
                    new_bboxes.append(new_bbox)
            return new_bboxes

        def get_params_dependent_on_targets(self, params: Dict[str, Any]) -> Dict[str, Any]:
            self.images = [params['image'], params['image1'], params['image2'], params['image3']]
            self.bboxes = [params['bboxes'], params['bboxes1'], params['bboxes2'], params['bboxes3']]
            # Shuffle images and boxes together so each image keeps its own boxes.
            images_bboxes = list(zip(self.images, self.bboxes))
            random.shuffle(images_bboxes)
            self.images, self.bboxes = zip(*images_bboxes)
            return {}

        @property
        def targets_as_params(self) -> List[str]:
            return [
                "image", "image1", "image2", "image3",
                "bboxes", "bboxes1", "bboxes2", "bboxes3"
            ]

Trying to follow the recommended way of working with multiple images and bboxes. I however see that |
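For readers without the PR checked out, here is a rough sketch of what a box-placement helper in the spirit of `bbox_mosaic4` could look like. It is not the PR's implementation: the function name `place_bbox_in_mosaic`, the pixel `(x_min, y_min, x_max, y_max)` box format, and the layout convention (position 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right, all four images meeting at a fixed canvas center) are assumptions made for illustration.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def place_bbox_in_mosaic(
    bbox: Box, rows: int, cols: int, position: int, height: int, width: int
) -> Optional[Box]:
    """Map a box from a (rows, cols) source image into a (height, width) mosaic.

    Assumed layout: the four images meet at the canvas center, and each image
    is anchored by the corner that touches the center. Parts that fall outside
    the image's quadrant are cropped.
    """
    if position not in (0, 1, 2, 3):
        raise ValueError("position must be 0, 1, 2 or 3")
    cx, cy = width // 2, height // 2
    left, top = position % 2 == 0, position < 2

    # Offset that moves the source image so its inner corner sits on the center.
    dx = cx - cols if left else cx
    dy = cy - rows if top else cy

    # Pixel bounds of the quadrant this image is allowed to occupy.
    qx0, qx1 = (0, cx) if left else (cx, width)
    qy0, qy1 = (0, cy) if top else (cy, height)

    x0, y0, x1, y1 = bbox
    x0, x1 = (min(max(v + dx, qx0), qx1) for v in (x0, x1))
    y0, y1 = (min(max(v + dy, qy0), qy1) for v in (y0, y1))

    # Boxes cropped down to zero width or height are dropped.
    if x1 <= x0 or y1 <= y0:
        return None
    return (x0, y0, x1, y1)
```

Returning `None` for boxes that are cropped away keeps zero-size boxes out of the result, which is the kind of box the validation error discussed in this thread complains about.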
I found some coco annotations have bboxes with height == 0.0.

    import json
    import pathlib

    coco_annot_path = pathlib.Path("coco/annotations/instances_train2017.json")
    with open(coco_annot_path) as f:
        coco_annots = json.load(f)

    for item in coco_annots["annotations"]:
        x, y, w, h = item["bbox"]
        if w == 0 or h == 0:
            print(item)

    > {'segmentation': [[296.65, 388.33, 296.65, 388.33, 297.68, 388.33, 297.68, 388.33]], 'area': 0.0, 'iscrowd': 0, 'image_id': 200365, 'bbox': [296.65, 388.33, 1.03, 0.0], 'category_id': 58, 'id': 918}
    > {'segmentation': [[9.98, 188.56, 15.52, 188.56, 15.52, 188.56, 11.09, 188.56]], 'area': 0.0, 'iscrowd': 0, 'image_id': 550395, 'bbox': [9.98, 188.56, 5.54, 0.0], 'category_id': 1, 'id': 2206849} |
But this should be avoided by |
@mikel-brostrom No, the filters are applied in the post-processing, while the error occurs in pre-processing validation. I think they have different purposes. |
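Since the validation runs before any filtering, one possible workaround is to drop degenerate annotations before they reach the pipeline. The helper below is only a sketch: `drop_degenerate_annotations` and its `min_size` parameter are hypothetical names, and it assumes COCO-style `[x_min, y_min, width, height]` boxes as in the annotation dump above.

```python
from typing import Any, Dict, Iterable, List


def drop_degenerate_annotations(
    annotations: Iterable[Dict[str, Any]], min_size: float = 0.0
) -> List[Dict[str, Any]]:
    """Keep only COCO annotations whose bbox width and height exceed min_size."""
    kept = []
    for item in annotations:
        _, _, w, h = item["bbox"]  # COCO format: [x_min, y_min, width, height]
        if w > min_size and h > min_size:
            kept.append(item)
    return kept
```

Applied to `coco_annots["annotations"]` from the snippet above, this would remove the two zero-height boxes that were printed.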
About PR
In this PR, I implemented the mosaic augmentation used in YOLO [1, 2].
I would appreciate any comments and suggestions.
[1]: "YOLOv4: Optimal Speed and Accuracy of Object Detection", https://arxiv.org/pdf/2004.10934.pdf
[2]: YOLOv5 https://github.com/ultralytics/yolov5
Demo
This is a reproducible example:
Input
Some Results
Notes
Since albumentations currently does not support multiple image sources, I introduced the helper targets `image_cache` and `bboxes_cache` as additional data sources. The user needs to pass the additional images and bboxes to these helper targets, so it is up to the user to decide how to prepare and manage the multiple image data. For example, the user can put all images into `image_cache` when there is sufficient memory or the dataset is small, or read only a small number of images for each iteration.

Note that this PR version does not support a `labels_cache` target. This means that the user should embed the label information inside the bounding boxes, like `[xmin, ymin, xmax, ymax, label]` (when `bboxes_format=pascal_voc`).

Another limitation is that the Mosaic augmentation should be placed as the first transform of the pipeline, like the above example. This is because transforms placed before the mosaic would be applied only to the image set to the `image` target, while the additional images set to `image_cache` are ignored. This means the images set to `image` and `image_cache` would have different augmentation histories, which I think is not what users expect. For example, with the following pipeline, `Normalize` and `RandomResizedCrop` will be applied only to the image, not to any image in `image_cache`.

I think this is not a serious limitation because users can prepare two pipelines and apply them separately if needed.
The same strategy can be applied to other multiple-image augmentations like MixUp.
For example, I think a similar augmentation used in YOLOv5 will be given in the following way.
Implementation Notes
The `target_dependence` property is used

I used the `target_dependence` property to pass the helper targets to the `apply_xxx` functions instead of `get_params_dependent_on_targets`.
This is because the returned values of `get_params` and `get_params_dependent_on_targets` become targets of serialization, which is the mechanism used for "replay". Since I think these helper targets are not appropriate for serialization, I used the `target_dependence` property mechanism instead.

Mosaic center is fixed

The YOLOv5 implementation includes randomization of the mosaic center position.
https://github.com/ultralytics/yolov5/blob/7c6a33564a84a0e78ec19da66ea6016d51c32e0a/utils/datasets.py#L653
I excluded this feature from the PR version because the same effect can be obtained by applying `RandomResizedCrop` just after the `Mosaic`, as in the above demo example.

TODO

- `apply_to_keypoints`
- `Compose` if possible.
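On the fixed mosaic center: the reasoning that a random crop after a fixed-center mosaic moves the seam to a random position can be checked with a few lines of arithmetic. This is only a sketch of the crop offset. It ignores the rescaling that `RandomResizedCrop` also performs, and `seam_after_random_crop` is a hypothetical helper, not part of the PR.

```python
import random
from typing import Tuple


def seam_after_random_crop(
    height: int, width: int, crop_h: int, crop_w: int, rng: random.Random
) -> Tuple[int, int]:
    """Return the (x, y) seam position inside a random crop of a fixed-center mosaic."""
    if not (0 < crop_h <= height and 0 < crop_w <= width):
        raise ValueError("crop must fit inside the mosaic")
    cx, cy = width // 2, height // 2  # fixed mosaic center
    x0 = rng.randint(0, width - crop_w)  # top-left corner of the crop window
    y0 = rng.randint(0, height - crop_h)
    # The seam keeps its canvas position; only the crop origin moves.
    return cx - x0, cy - y0
```

With a 640x640 mosaic and a 320x320 crop, the seam stays inside the crop and its position varies from call to call, which is the effect a randomized mosaic center would give.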