diff --git a/configs/yolo/README.md b/configs/yolo/README.md deleted file mode 100644 index 9cb47bcc8..000000000 --- a/configs/yolo/README.md +++ /dev/null @@ -1,55 +0,0 @@ -# YOLOv3 - -> [YOLOv3: An Incremental Improvement](https://arxiv.org/abs/1804.02767) - - - -## Abstract - -We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that's pretty swell. It's a little bigger than last time but more accurate. It's still fast though, don't worry. At 320x320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 mAP@50 in 51 ms on a Titan X, compared to 57.5 mAP@50 in 198 ms by RetinaNet, similar performance but 3.8x faster. - -
- -
- -## Results and Models - -| Backbone | Scale | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download | -| :--------: | :---: | :-----: | :------: | :------------: | :----: | :---------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | -| DarkNet-53 | 320 | 273e | 2.7 | 63.9 | 27.9 | [config](./yolov3_d53_8xb8-320-273e_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_320_273e_coco/yolov3_d53_320_273e_coco-421362b6.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_320_273e_coco/yolov3_d53_320_273e_coco-20200819_172101.log.json) | -| DarkNet-53 | 416 | 273e | 3.8 | 61.2 | 30.9 | [config](./yolov3_d53_8xb8-ms-416-273e_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-416_273e_coco/yolov3_d53_mstrain-416_273e_coco-2b60fcd9.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-416_273e_coco/yolov3_d53_mstrain-416_273e_coco-20200819_173424.log.json) | -| DarkNet-53 | 608 | 273e | 7.4 | 48.1 | 33.7 | [config](./yolov3_d53_8xb8-ms-608-273e_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-608_273e_coco/yolov3_d53_mstrain-608_273e_coco_20210518_115020-a2c3acb8.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-608_273e_coco/yolov3_d53_mstrain-608_273e_coco_20210518_115020.log.json) | - -## Mixed Precision Training - -We also train YOLOv3 with mixed precision training. 
- -| Backbone | Scale | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download | -| :--------: | :---: | :-----: | :------: | :------------: | :----: | :-------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | -| DarkNet-53 | 608 | 273e | 4.7 | 48.1 | 33.8 | [config](./yolov3_d53_8xb8-amp-ms-608-273e_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_fp16_mstrain-608_273e_coco/yolov3_d53_fp16_mstrain-608_273e_coco_20210517_213542-4bc34944.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_fp16_mstrain-608_273e_coco/yolov3_d53_fp16_mstrain-608_273e_coco_20210517_213542.log.json) | - -## Lightweight models - -| Backbone | Scale | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download | -| :---------: | :---: | :-----: | :------: | :------------: | :----: | :------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | -| MobileNetV2 | 416 | 300e | 5.3 | | 23.9 | [config](./yolov3_mobilenetv2_8xb24-ms-416-300e_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_mobilenetv2_mstrain-416_300e_coco/yolov3_mobilenetv2_mstrain-416_300e_coco_20210718_010823-f68a07b3.pth) \| 
[log](https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_mobilenetv2_mstrain-416_300e_coco/yolov3_mobilenetv2_mstrain-416_300e_coco_20210718_010823.log.json) | -| MobileNetV2 | 320 | 300e | 3.2 | | 22.2 | [config](./yolov3_mobilenetv2_8xb24-320-300e_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_mobilenetv2_320_300e_coco/yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_mobilenetv2_320_300e_coco/yolov3_mobilenetv2_320_300e_coco_20210719_215349.log.json) | - -Notice: We reduce the number of channels to 96 in both head and neck. It can reduce the flops and parameters, which makes these models more suitable for edge devices. - -## Credit - -This implementation originates from the project of Haoyu Wu(@wuhy08) at Western Digital. - -## Citation - -```latex -@misc{redmon2018yolov3, - title={YOLOv3: An Incremental Improvement}, - author={Joseph Redmon and Ali Farhadi}, - year={2018}, - eprint={1804.02767}, - archivePrefix={arXiv}, - primaryClass={cs.CV} -} -``` diff --git a/configs/yolo/metafile.yml b/configs/yolo/metafile.yml deleted file mode 100644 index 627e70c4d..000000000 --- a/configs/yolo/metafile.yml +++ /dev/null @@ -1,124 +0,0 @@ -Collections: - - Name: YOLOv3 - Metadata: - Training Data: COCO - Training Techniques: - - SGD with Momentum - - Weight Decay - Training Resources: 8x V100 GPUs - Architecture: - - DarkNet - Paper: - URL: https://arxiv.org/abs/1804.02767 - Title: 'YOLOv3: An Incremental Improvement' - README: configs/yolo/README.md - Code: - URL: https://github.com/open-mmlab/mmdetection/blob/v2.4.0/mmdet/models/detectors/yolo.py#L8 - Version: v2.4.0 - -Models: - - Name: yolov3_d53_320_273e_coco - In Collection: YOLOv3 - Config: configs/yolo/yolov3_d53_8xb8-320-273e_coco.py - Metadata: - Training Memory (GB): 2.7 - inference time (ms/im): - - value: 15.65 - hardware: V100 - backend: PyTorch - batch size: 1 - mode: FP32 - 
resolution: (320, 320) - Epochs: 273 - Results: - - Task: Object Detection - Dataset: COCO - Metrics: - box AP: 27.9 - Weights: https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_320_273e_coco/yolov3_d53_320_273e_coco-421362b6.pth - - - Name: yolov3_d53_mstrain-416_273e_coco - In Collection: YOLOv3 - Config: configs/yolo/yolov3_d53_8xb8-ms-416-273e_coco.py - Metadata: - Training Memory (GB): 3.8 - inference time (ms/im): - - value: 16.34 - hardware: V100 - backend: PyTorch - batch size: 1 - mode: FP32 - resolution: (416, 416) - Epochs: 273 - Results: - - Task: Object Detection - Dataset: COCO - Metrics: - box AP: 30.9 - Weights: https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-416_273e_coco/yolov3_d53_mstrain-416_273e_coco-2b60fcd9.pth - - - Name: yolov3_d53_mstrain-608_273e_coco - In Collection: YOLOv3 - Config: configs/yolo/yolov3_d53_8xb8-ms-608-273e_coco.py - Metadata: - Training Memory (GB): 7.4 - inference time (ms/im): - - value: 20.79 - hardware: V100 - backend: PyTorch - batch size: 1 - mode: FP32 - resolution: (608, 608) - Epochs: 273 - Results: - - Task: Object Detection - Dataset: COCO - Metrics: - box AP: 33.7 - Weights: https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-608_273e_coco/yolov3_d53_mstrain-608_273e_coco_20210518_115020-a2c3acb8.pth - - - Name: yolov3_d53_fp16_mstrain-608_273e_coco - In Collection: YOLOv3 - Config: configs/yolo/yolov3_d53_8xb8-amp-ms-608-273e_coco.py - Metadata: - Training Memory (GB): 4.7 - inference time (ms/im): - - value: 20.79 - hardware: V100 - backend: PyTorch - batch size: 1 - mode: FP16 - resolution: (608, 608) - Epochs: 273 - Results: - - Task: Object Detection - Dataset: COCO - Metrics: - box AP: 33.8 - Weights: https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_fp16_mstrain-608_273e_coco/yolov3_d53_fp16_mstrain-608_273e_coco_20210517_213542-4bc34944.pth - - - Name: yolov3_mobilenetv2_8xb24-320-300e_coco - In Collection: YOLOv3 - Config: 
configs/yolo/yolov3_mobilenetv2_8xb24-320-300e_coco.py - Metadata: - Training Memory (GB): 3.2 - Epochs: 300 - Results: - - Task: Object Detection - Dataset: COCO - Metrics: - box AP: 22.2 - Weights: https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_mobilenetv2_320_300e_coco/yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth - - - Name: yolov3_mobilenetv2_8xb24-ms-416-300e_coco - In Collection: YOLOv3 - Config: configs/yolo/yolov3_mobilenetv2_8xb24-ms-416-300e_coco.py - Metadata: - Training Memory (GB): 5.3 - Epochs: 300 - Results: - - Task: Object Detection - Dataset: COCO - Metrics: - box AP: 23.9 - Weights: https://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_mobilenetv2_mstrain-416_300e_coco/yolov3_mobilenetv2_mstrain-416_300e_coco_20210718_010823-f68a07b3.pth diff --git a/configs/yolo/yolov3_d53_8xb8-320-273e_coco.py b/configs/yolo/yolov3_d53_8xb8-320-273e_coco.py deleted file mode 100644 index a3d08dd77..000000000 --- a/configs/yolo/yolov3_d53_8xb8-320-273e_coco.py +++ /dev/null @@ -1,29 +0,0 @@ -_base_ = './yolov3_d53_8xb8-ms-608-273e_coco.py' - -input_size = (320, 320) -train_pipeline = [ - dict(type='LoadImageFromFile', backend_args={{_base_.backend_args}}), - dict(type='LoadAnnotations', with_bbox=True), - # `mean` and `to_rgb` should be the same with the `preprocess_cfg` - dict(type='Expand', mean=[0, 0, 0], to_rgb=True, ratio_range=(1, 2)), - dict( - type='MinIoURandomCrop', - min_ious=(0.4, 0.5, 0.6, 0.7, 0.8, 0.9), - min_crop_size=0.3), - dict(type='Resize', scale=input_size, keep_ratio=True), - dict(type='RandomFlip', prob=0.5), - dict(type='PhotoMetricDistortion'), - dict(type='PackDetInputs') -] -test_pipeline = [ - dict(type='LoadImageFromFile', backend_args={{_base_.backend_args}}), - dict(type='Resize', scale=input_size, keep_ratio=True), - dict(type='LoadAnnotations', with_bbox=True), - dict( - type='PackDetInputs', - meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', - 'scale_factor')) -] -train_dataloader = 
dict(dataset=dict(pipeline=train_pipeline)) -val_dataloader = dict(dataset=dict(pipeline=test_pipeline)) -test_dataloader = val_dataloader diff --git a/configs/yolo/yolov3_d53_8xb8-amp-ms-608-273e_coco.py b/configs/yolo/yolov3_d53_8xb8-amp-ms-608-273e_coco.py deleted file mode 100644 index 173d8ee22..000000000 --- a/configs/yolo/yolov3_d53_8xb8-amp-ms-608-273e_coco.py +++ /dev/null @@ -1,3 +0,0 @@ -_base_ = './yolov3_d53_8xb8-ms-608-273e_coco.py' -# fp16 settings -optim_wrapper = dict(type='AmpOptimWrapper', loss_scale='dynamic') diff --git a/configs/yolo/yolov3_d53_8xb8-ms-416-273e_coco.py b/configs/yolo/yolov3_d53_8xb8-ms-416-273e_coco.py deleted file mode 100644 index ca0127e83..000000000 --- a/configs/yolo/yolov3_d53_8xb8-ms-416-273e_coco.py +++ /dev/null @@ -1,28 +0,0 @@ -_base_ = './yolov3_d53_8xb8-ms-608-273e_coco.py' - -train_pipeline = [ - dict(type='LoadImageFromFile', backend_args={{_base_.backend_args}}), - dict(type='LoadAnnotations', with_bbox=True), - # `mean` and `to_rgb` should be the same with the `preprocess_cfg` - dict(type='Expand', mean=[0, 0, 0], to_rgb=True, ratio_range=(1, 2)), - dict( - type='MinIoURandomCrop', - min_ious=(0.4, 0.5, 0.6, 0.7, 0.8, 0.9), - min_crop_size=0.3), - dict(type='RandomResize', scale=[(320, 320), (416, 416)], keep_ratio=True), - dict(type='RandomFlip', prob=0.5), - dict(type='PhotoMetricDistortion'), - dict(type='PackDetInputs') -] -test_pipeline = [ - dict(type='LoadImageFromFile', backend_args={{_base_.backend_args}}), - dict(type='Resize', scale=(416, 416), keep_ratio=True), - dict(type='LoadAnnotations', with_bbox=True), - dict( - type='PackDetInputs', - meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', - 'scale_factor')) -] -train_dataloader = dict(dataset=dict(pipeline=train_pipeline)) -val_dataloader = dict(dataset=dict(pipeline=test_pipeline)) -test_dataloader = val_dataloader diff --git a/configs/yolo/yolov3_d53_8xb8-ms-608-273e_coco.py b/configs/yolo/yolov3_d53_8xb8-ms-608-273e_coco.py deleted 
file mode 100644 index d4a36dfda..000000000 --- a/configs/yolo/yolov3_d53_8xb8-ms-608-273e_coco.py +++ /dev/null @@ -1,167 +0,0 @@ -_base_ = ['../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'] -# model settings -data_preprocessor = dict( - type='DetDataPreprocessor', - mean=[0, 0, 0], - std=[255., 255., 255.], - bgr_to_rgb=True, - pad_size_divisor=32) -model = dict( - type='YOLOV3', - data_preprocessor=data_preprocessor, - backbone=dict( - type='Darknet', - depth=53, - out_indices=(3, 4, 5), - init_cfg=dict(type='Pretrained', checkpoint='open-mmlab://darknet53')), - neck=dict( - type='YOLOV3Neck', - num_scales=3, - in_channels=[1024, 512, 256], - out_channels=[512, 256, 128]), - bbox_head=dict( - type='YOLOV3Head', - num_classes=80, - in_channels=[512, 256, 128], - out_channels=[1024, 512, 256], - anchor_generator=dict( - type='YOLOAnchorGenerator', - base_sizes=[[(116, 90), (156, 198), (373, 326)], - [(30, 61), (62, 45), (59, 119)], - [(10, 13), (16, 30), (33, 23)]], - strides=[32, 16, 8]), - bbox_coder=dict(type='YOLOBBoxCoder'), - featmap_strides=[32, 16, 8], - loss_cls=dict( - type='CrossEntropyLoss', - use_sigmoid=True, - loss_weight=1.0, - reduction='sum'), - loss_conf=dict( - type='CrossEntropyLoss', - use_sigmoid=True, - loss_weight=1.0, - reduction='sum'), - loss_xy=dict( - type='CrossEntropyLoss', - use_sigmoid=True, - loss_weight=2.0, - reduction='sum'), - loss_wh=dict(type='MSELoss', loss_weight=2.0, reduction='sum')), - # training and testing settings - train_cfg=dict( - assigner=dict( - type='GridAssigner', - pos_iou_thr=0.5, - neg_iou_thr=0.5, - min_pos_iou=0)), - test_cfg=dict( - nms_pre=1000, - min_bbox_size=0, - score_thr=0.05, - conf_thr=0.005, - nms=dict(type='nms', iou_threshold=0.45), - max_per_img=100)) -# dataset settings -dataset_type = 'CocoDataset' -data_root = 'data/coco/' - -# Example to use different file client -# Method 1: simply set the data root and let the file I/O module -# automatically infer from prefix (not 
support LMDB and Memcache yet) - -# data_root = 's3://openmmlab/datasets/detection/coco/' - -# Method 2: Use `backend_args`, `file_client_args` in versions before 3.0.0rc6 -# backend_args = dict( -# backend='petrel', -# path_mapping=dict({ -# './data/': 's3://openmmlab/datasets/detection/', -# 'data/': 's3://openmmlab/datasets/detection/' -# })) -backend_args = None - -train_pipeline = [ - dict(type='LoadImageFromFile', backend_args=backend_args), - dict(type='LoadAnnotations', with_bbox=True), - dict( - type='Expand', - mean=data_preprocessor['mean'], - to_rgb=data_preprocessor['bgr_to_rgb'], - ratio_range=(1, 2)), - dict( - type='MinIoURandomCrop', - min_ious=(0.4, 0.5, 0.6, 0.7, 0.8, 0.9), - min_crop_size=0.3), - dict(type='RandomResize', scale=[(320, 320), (608, 608)], keep_ratio=True), - dict(type='RandomFlip', prob=0.5), - dict(type='PhotoMetricDistortion'), - dict(type='PackDetInputs') -] -test_pipeline = [ - dict(type='LoadImageFromFile', backend_args=backend_args), - dict(type='Resize', scale=(608, 608), keep_ratio=True), - dict(type='LoadAnnotations', with_bbox=True), - dict( - type='PackDetInputs', - meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', - 'scale_factor')) -] - -train_dataloader = dict( - batch_size=8, - num_workers=4, - persistent_workers=True, - sampler=dict(type='DefaultSampler', shuffle=True), - batch_sampler=dict(type='AspectRatioBatchSampler'), - dataset=dict( - type=dataset_type, - data_root=data_root, - ann_file='annotations/instances_train2017.json', - data_prefix=dict(img='train2017/'), - filter_cfg=dict(filter_empty_gt=True, min_size=32), - pipeline=train_pipeline, - backend_args=backend_args)) -val_dataloader = dict( - batch_size=1, - num_workers=2, - persistent_workers=True, - drop_last=False, - sampler=dict(type='DefaultSampler', shuffle=False), - dataset=dict( - type=dataset_type, - data_root=data_root, - ann_file='annotations/instances_val2017.json', - data_prefix=dict(img='val2017/'), - test_mode=True, - 
pipeline=test_pipeline, - backend_args=backend_args)) -test_dataloader = val_dataloader - -val_evaluator = dict( - type='CocoMetric', - ann_file=data_root + 'annotations/instances_val2017.json', - metric='bbox', - backend_args=backend_args) -test_evaluator = val_evaluator - -train_cfg = dict(max_epochs=273, val_interval=7) - -# optimizer -optim_wrapper = dict( - type='OptimWrapper', - optimizer=dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0005), - clip_grad=dict(max_norm=35, norm_type=2)) - -# learning policy -param_scheduler = [ - dict(type='LinearLR', start_factor=0.1, by_epoch=False, begin=0, end=2000), - dict(type='MultiStepLR', by_epoch=True, milestones=[218, 246], gamma=0.1) -] - -default_hooks = dict(checkpoint=dict(type='CheckpointHook', interval=7)) - -# NOTE: `auto_scale_lr` is for automatically scaling LR, -# USER SHOULD NOT CHANGE ITS VALUES. -# base_batch_size = (8 GPUs) x (8 samples per GPU) -auto_scale_lr = dict(base_batch_size=64) diff --git a/configs/yolo/yolov3_mobilenetv2_8xb24-320-300e_coco.py b/configs/yolo/yolov3_mobilenetv2_8xb24-320-300e_coco.py deleted file mode 100644 index 07b393734..000000000 --- a/configs/yolo/yolov3_mobilenetv2_8xb24-320-300e_coco.py +++ /dev/null @@ -1,42 +0,0 @@ -_base_ = ['./yolov3_mobilenetv2_8xb24-ms-416-300e_coco.py'] - -# yapf:disable -model = dict( - bbox_head=dict( - anchor_generator=dict( - base_sizes=[[(220, 125), (128, 222), (264, 266)], - [(35, 87), (102, 96), (60, 170)], - [(10, 15), (24, 36), (72, 42)]]))) -# yapf:enable - -input_size = (320, 320) -train_pipeline = [ - dict(type='LoadImageFromFile', backend_args={{_base_.backend_args}}), - dict(type='LoadAnnotations', with_bbox=True), - # `mean` and `to_rgb` should be the same with the `preprocess_cfg` - dict( - type='Expand', - mean=[123.675, 116.28, 103.53], - to_rgb=True, - ratio_range=(1, 2)), - dict( - type='MinIoURandomCrop', - min_ious=(0.4, 0.5, 0.6, 0.7, 0.8, 0.9), - min_crop_size=0.3), - dict(type='Resize', scale=input_size, 
keep_ratio=True), - dict(type='RandomFlip', prob=0.5), - dict(type='PhotoMetricDistortion'), - dict(type='PackDetInputs') -] -test_pipeline = [ - dict(type='LoadImageFromFile', backend_args={{_base_.backend_args}}), - dict(type='Resize', scale=input_size, keep_ratio=True), - dict(type='LoadAnnotations', with_bbox=True), - dict( - type='PackDetInputs', - meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', - 'scale_factor')) -] -train_dataloader = dict(dataset=dict(dataset=dict(pipeline=train_pipeline))) -val_dataloader = dict(dataset=dict(pipeline=test_pipeline)) -test_dataloader = val_dataloader diff --git a/configs/yolo/yolov3_mobilenetv2_8xb24-ms-416-300e_coco.py b/configs/yolo/yolov3_mobilenetv2_8xb24-ms-416-300e_coco.py deleted file mode 100644 index 9a161b66f..000000000 --- a/configs/yolo/yolov3_mobilenetv2_8xb24-ms-416-300e_coco.py +++ /dev/null @@ -1,176 +0,0 @@ -_base_ = ['../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'] -# model settings -data_preprocessor = dict( - type='DetDataPreprocessor', - mean=[123.675, 116.28, 103.53], - std=[58.395, 57.12, 57.375], - bgr_to_rgb=True, - pad_size_divisor=32) -model = dict( - type='YOLOV3', - data_preprocessor=data_preprocessor, - backbone=dict( - type='MobileNetV2', - out_indices=(2, 4, 6), - act_cfg=dict(type='LeakyReLU', negative_slope=0.1), - init_cfg=dict( - type='Pretrained', checkpoint='open-mmlab://mmdet/mobilenet_v2')), - neck=dict( - type='YOLOV3Neck', - num_scales=3, - in_channels=[320, 96, 32], - out_channels=[96, 96, 96]), - bbox_head=dict( - type='YOLOV3Head', - num_classes=80, - in_channels=[96, 96, 96], - out_channels=[96, 96, 96], - anchor_generator=dict( - type='YOLOAnchorGenerator', - base_sizes=[[(116, 90), (156, 198), (373, 326)], - [(30, 61), (62, 45), (59, 119)], - [(10, 13), (16, 30), (33, 23)]], - strides=[32, 16, 8]), - bbox_coder=dict(type='YOLOBBoxCoder'), - featmap_strides=[32, 16, 8], - loss_cls=dict( - type='CrossEntropyLoss', - use_sigmoid=True, - 
loss_weight=1.0, - reduction='sum'), - loss_conf=dict( - type='CrossEntropyLoss', - use_sigmoid=True, - loss_weight=1.0, - reduction='sum'), - loss_xy=dict( - type='CrossEntropyLoss', - use_sigmoid=True, - loss_weight=2.0, - reduction='sum'), - loss_wh=dict(type='MSELoss', loss_weight=2.0, reduction='sum')), - # training and testing settings - train_cfg=dict( - assigner=dict( - type='GridAssigner', - pos_iou_thr=0.5, - neg_iou_thr=0.5, - min_pos_iou=0)), - test_cfg=dict( - nms_pre=1000, - min_bbox_size=0, - score_thr=0.05, - conf_thr=0.005, - nms=dict(type='nms', iou_threshold=0.45), - max_per_img=100)) -# dataset settings -dataset_type = 'CocoDataset' -data_root = 'data/coco/' - -# Example to use different file client -# Method 1: simply set the data root and let the file I/O module -# automatically infer from prefix (not support LMDB and Memcache yet) - -# data_root = 's3://openmmlab/datasets/detection/coco/' - -# Method 2: Use `backend_args`, `file_client_args` in versions before 3.0.0rc6 -# backend_args = dict( -# backend='petrel', -# path_mapping=dict({ -# './data/': 's3://openmmlab/datasets/detection/', -# 'data/': 's3://openmmlab/datasets/detection/' -# })) -backend_args = None - -train_pipeline = [ - dict(type='LoadImageFromFile', backend_args=backend_args), - dict(type='LoadAnnotations', with_bbox=True), - dict( - type='Expand', - mean=data_preprocessor['mean'], - to_rgb=data_preprocessor['bgr_to_rgb'], - ratio_range=(1, 2)), - dict( - type='MinIoURandomCrop', - min_ious=(0.4, 0.5, 0.6, 0.7, 0.8, 0.9), - min_crop_size=0.3), - dict(type='RandomResize', scale=[(320, 320), (416, 416)], keep_ratio=True), - dict(type='RandomFlip', prob=0.5), - dict(type='PhotoMetricDistortion'), - dict(type='PackDetInputs') -] -test_pipeline = [ - dict(type='LoadImageFromFile', backend_args=backend_args), - dict(type='Resize', scale=(416, 416), keep_ratio=True), - dict(type='LoadAnnotations', with_bbox=True), - dict( - type='PackDetInputs', - meta_keys=('img_id', 'img_path', 
'ori_shape', 'img_shape', - 'scale_factor')) -] - -train_dataloader = dict( - batch_size=24, - num_workers=4, - persistent_workers=True, - sampler=dict(type='DefaultSampler', shuffle=True), - batch_sampler=dict(type='AspectRatioBatchSampler'), - dataset=dict( - type='RepeatDataset', # use RepeatDataset to speed up training - times=10, - dataset=dict( - type=dataset_type, - data_root=data_root, - ann_file='annotations/instances_train2017.json', - data_prefix=dict(img='train2017/'), - filter_cfg=dict(filter_empty_gt=True, min_size=32), - pipeline=train_pipeline, - backend_args=backend_args))) -val_dataloader = dict( - batch_size=24, - num_workers=4, - persistent_workers=True, - drop_last=False, - sampler=dict(type='DefaultSampler', shuffle=False), - dataset=dict( - type=dataset_type, - data_root=data_root, - ann_file='annotations/instances_val2017.json', - data_prefix=dict(img='val2017/'), - test_mode=True, - pipeline=test_pipeline, - backend_args=backend_args)) -test_dataloader = val_dataloader - -val_evaluator = dict( - type='CocoMetric', - ann_file=data_root + 'annotations/instances_val2017.json', - metric='bbox', - backend_args=backend_args) -test_evaluator = val_evaluator - -train_cfg = dict(max_epochs=30) - -# optimizer -optim_wrapper = dict( - type='OptimWrapper', - optimizer=dict(type='SGD', lr=0.003, momentum=0.9, weight_decay=0.0005), - clip_grad=dict(max_norm=35, norm_type=2)) - -# learning policy -param_scheduler = [ - dict( - type='LinearLR', - start_factor=0.0001, - by_epoch=False, - begin=0, - end=4000), - dict(type='MultiStepLR', by_epoch=True, milestones=[24, 28], gamma=0.1) -] - -find_unused_parameters = True - -# NOTE: `auto_scale_lr` is for automatically scaling LR, -# USER SHOULD NOT CHANGE ITS VALUES. -# base_batch_size = (8 GPUs) x (24 samples per GPU) -auto_scale_lr = dict(base_batch_size=192)
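
Note on the removed MobileNetV2 schedule: the deleted config pairs `lr=0.003` and `auto_scale_lr = dict(base_batch_size=192)` with a `LinearLR` warmup over the first 4000 iterations and a `MultiStepLR` decay at epochs 24 and 28. The sketch below approximates how those values combine, assuming the linear scaling rule for `auto_scale_lr` and a simple product of the warmup and decay factors. The helper names `scaled_lr` and `lr_at` are hypothetical and are not part of MMDetection or MMEngine, whose schedulers may differ at the boundary iterations.

```python
from bisect import bisect_right


def scaled_lr(base_lr, num_gpus, samples_per_gpu, base_batch_size):
    """Linear scaling rule (assumed): LR is proportional to total batch size."""
    return base_lr * (num_gpus * samples_per_gpu) / base_batch_size


def lr_at(base_lr, iteration, epoch,
          start_factor=0.0001, warmup_iters=4000,
          milestones=(24, 28), gamma=0.1):
    """Approximate LR at a given iteration and 0-based epoch.

    Defaults mirror the values in the removed MobileNetV2 config.
    """
    # Warmup: the factor ramps linearly from start_factor to 1.0, by iteration.
    if iteration < warmup_iters:
        warmup = start_factor + (1.0 - start_factor) * iteration / warmup_iters
    else:
        warmup = 1.0
    # Step decay: multiply by gamma once for every milestone already reached.
    decay = gamma ** bisect_right(milestones, epoch)
    return base_lr * warmup * decay


if __name__ == "__main__":
    # 8 GPUs x 24 samples matches base_batch_size=192, so the LR is unchanged.
    print(scaled_lr(0.003, num_gpus=8, samples_per_gpu=24, base_batch_size=192))
    # Halving the GPU count halves the LR under the linear rule.
    print(scaled_lr(0.003, num_gpus=4, samples_per_gpu=24, base_batch_size=192))
    # LR at the start of warmup, after warmup, and after each milestone.
    for it, ep in [(0, 0), (4000, 1), (10**6, 24), (10**6, 28)]:
        print(it, ep, lr_at(0.003, it, ep))
```

The same helpers cover the removed DarkNet-53 config by passing `start_factor=0.1`, `warmup_iters=2000`, `milestones=(218, 246)`, and a base LR of `0.001` with `base_batch_size=64`.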