YOLOv5-OBB is a variant of YOLOv5 that supports oriented bounding boxes. This model is designed to yield predictions that better fit objects that are positioned at an angle.
X-AnyLabeling is an annotation tool that integrates AI models to automate data annotation. With a focus on practical applications, it aims to provide an industrial-grade, feature-rich tool that helps developers automate annotation and data processing for a wide range of complex tasks.
cd yolov5_obb
git submodule update --init --recursive
cd X-AnyLabeling
pip install -r requirements.txt
# pip install -r requirements-gpu.txt
python anylabeling/app.py
- Prepare a predefined category label file (refer to this).
- Click on the 'Format' option in the top menu bar, select 'DOTA' and import the file prepared in the previous step.
[Option-1] Basic usage
- Press the shortcut key "O" to create a rotation shape.
- Open edit mode (shortcut: "Ctrl+J") and click to select the rotation box.
- Rotate the selected box with the shortcut keys "z", "x", "c" and "v", where:
- z: Large counterclockwise rotation
- x: Small counterclockwise rotation
- c: Small clockwise rotation
- v: Large clockwise rotation
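Geometrically, each of these shortcuts turns the selected box about its center by some angle step. The following is a minimal sketch of that operation, not X-AnyLabeling's own code: the function name `rotate_box` and the step sizes in `STEP_DEG` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical step sizes in degrees; the tool's real values are not stated here.
# In image coordinates (y axis pointing down) a positive angle looks clockwise.
STEP_DEG = {"z": -10.0, "x": -1.0, "c": 1.0, "v": 10.0}


def rotate_box(points, angle_deg):
    """Rotate a polygon, given as (x, y) tuples, about its centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return rotated


if __name__ == "__main__":
    box = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
    print(rotate_box(box, STEP_DEG["c"]))
```

Rotating about the centroid keeps the box in place while only its orientation changes, which matches what a user sees when nudging a selected box.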
[Option-2] Additionally, you can use the model to batch pre-label the current dataset.
- Press the shortcut key "Ctrl+A" to open Auto-Labeling mode.
- Choose an appropriate default model or load a custom model.
- Press the shortcut key "Ctrl+M" to run the model on all images at once.
For more details, refer to this document.
- Requirements
- Python 3.7+
- PyTorch ≥ 1.7
- CUDA 9.0 or higher
- Ubuntu 16.04/18.04
- Please be aware that if you downloaded the source code from the original repository, it is advisable to make the necessary modifications to the poly_nms_cuda.cu file. Failing to do so will likely result in compilation errors.
- For Windows users, please refer to this issue if you have difficulty generating utils/nms_rotated_ext.cpython-XX-XX-XX-XX.so.
- Install
a. Create a conda virtual environment and activate it:
conda create -n yolov5_obb python=3.8 -y
source activate yolov5_obb
b. Make sure your CUDA runtime API version ≤ CUDA driver version (for example, 11.3 ≤ 11.4).
nvcc -V
c. Install PyTorch and torchvision following the official instructions for your machine environment, and make sure the cudatoolkit version matches the CUDA runtime API version, e.g.,
pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu116
nvcc -V
python
>>> import torch
>>> torch.version.cuda
>>> exit()
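The manual comparison above can also be scripted. The sketch below is only an illustration: the helper names `parse_nvcc_release` and `versions_match` are hypothetical, and it assumes the `nvcc -V` output contains a `release X.Y` fragment, as current CUDA toolkits print.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return (major, minor) found in the text printed by `nvcc -V`, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def versions_match(nvcc_output, torch_cuda):
    """Compare the nvcc release with a version string such as torch.version.cuda."""
    if torch_cuda is None:  # CPU-only PyTorch builds report no CUDA version
        return False
    major, minor = torch_cuda.split(".")[:2]
    return parse_nvcc_release(nvcc_output) == (int(major), int(minor))


if __name__ == "__main__":
    sample = "Cuda compilation tools, release 11.6, V11.6.124"
    print(versions_match(sample, "11.6"))
```

In practice the first argument would come from running `nvcc -V` (for example through `subprocess.run`) and the second from `torch.version.cuda`.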
d. Clone the modified YOLOv5_OBB repository:
git clone https://github.com/CVHub520/yolov5_obb.git
cd yolov5_obb
e. Install yolov5-obb.
pip install -r requirements.txt
cd utils/nms_rotated
python setup.py develop # or "pip install -v -e ."
- DOTA_devkit [Optional]
If you need to split the high-resolution image and evaluate the oriented bounding boxes (OBB), it is recommended to use the following tool:
cd yolov5_obb/DOTA_devkit
sudo apt-get install swig
swig -c++ -python polyiou.i
python setup.py build_ext --inplace
Prepare custom dataset files
Note: Ensure that the label format is [polygon classname difficulty]. You can set difficulty=0 unless otherwise specified.
x1 y1 x2 y2 x3 y3 x4 y4 classname difficulty
1686.0 1517.0 1695.0 1511.0 1711.0 1535.0 1700.0 1541.0 large-vehicle 1
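To make the label layout concrete, here is a small sketch that parses one such line into a polygon, a class name and a difficulty flag. The function name `parse_dota_line` is hypothetical and is not part of the repository.

```python
def parse_dota_line(line):
    """Split a label line into (polygon, classname, difficulty).

    Expected layout: eight coordinates, then the class name, then the
    difficulty flag, all separated by whitespace.
    """
    parts = line.split()
    if len(parts) != 10:
        raise ValueError(f"expected 10 fields, got {len(parts)}: {line!r}")
    coords = [float(value) for value in parts[:8]]
    polygon = list(zip(coords[0::2], coords[1::2]))  # four (x, y) corners
    return polygon, parts[8], int(parts[9])


if __name__ == "__main__":
    example = "1686.0 1517.0 1695.0 1511.0 1711.0 1535.0 1700.0 1541.0 large-vehicle 1"
    print(parse_dota_line(example))
```

Running a check like this over every label file before training is a cheap way to catch lines with missing fields.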
Then, modify the path parameters and run this script if there is no need to split the high-resolution images. Otherwise, you can follow the steps below.
cd yolov5_obb
python DOTA_devkit/ImgSplit_multi_process.py
Ensure that your dataset is organized in the directory structure as shown below:
└── dataset_demo
    ├── images
    │   └── P0032.png
    └── labelTxt
        └── P0032.txt
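A quick way to confirm this layout is to check that every image has a label file with the same stem. The sketch below assumes the `images` and `labelTxt` folder names shown above; the function name `find_unlabeled_images` is hypothetical.

```python
from pathlib import Path


def find_unlabeled_images(dataset_root):
    """Return the names of images that have no matching labelTxt/<stem>.txt."""
    root = Path(dataset_root)
    label_stems = {path.stem for path in (root / "labelTxt").glob("*.txt")}
    return sorted(
        path.name
        for path in (root / "images").iterdir()
        if path.is_file() and path.stem not in label_stems
    )


if __name__ == "__main__":
    import sys

    # Usage: python check_dataset.py /path/to/dataset_demo
    print(find_unlabeled_images(sys.argv[1]))
```

An empty list means each image is paired with a label file.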
Finally, you can create a custom data yaml file, e.g., yolov5obb_demo.yaml.
- DOTA is a high-resolution image dataset, so it needs to be split before training/testing to get better performance.
- For single-class problems, it is recommended to add a "None" class, effectively making it a 2-class task, e.g., DroneVehicle_poly.yaml
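Splitting a high-resolution image, as recommended above, amounts to cutting it into overlapping tiles. The sketch below shows one common way to compute tile origins along a single axis. It is not the DOTA_devkit implementation: the function name `tile_origins` is hypothetical, and the defaults (tile size 1024, overlap 200) are assumptions taken from the subsize1024_gap200 naming used for the benchmark split.

```python
def tile_origins(length, subsize=1024, gap=200):
    """Return start offsets of overlapping tiles covering [0, length)."""
    if not 0 <= gap < subsize:
        raise ValueError("gap must be non-negative and smaller than subsize")
    if length <= subsize:
        return [0]  # the whole axis fits into a single tile
    stride = subsize - gap
    origins = []
    pos = 0
    while pos + subsize < length:
        origins.append(pos)
        pos += stride
    origins.append(length - subsize)  # last tile is aligned with the border
    return origins


if __name__ == "__main__":
    # Tile origins for a 4000-pixel-wide image
    print(tile_origins(4000))
```

Applying the same function to the width and the height gives the top-left corner of every crop; the overlap reduces the chance that an object is cut in every tile that contains it.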
Before formally starting the training task, please note the following recommendations:
- Ensure that the input resolution is set to a multiple of 32.
- By default, set the batch size to 8. If you increase it to 16 or larger, adjust the scaling factor for the box loss to help the convergence of the
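For the first recommendation, a requested resolution can be rounded up to the next multiple of 32 before it is passed to `--img`. This is a minimal sketch; the helper name `round_up_to_multiple` is hypothetical and not part of the repository.

```python
def round_up_to_multiple(size, stride=32):
    """Round size up to the nearest multiple of stride."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    return ((size + stride - 1) // stride) * stride


if __name__ == "__main__":
    for requested in (1000, 1024, 865):
        print(requested, "->", round_up_to_multiple(requested))
```

Values that are already multiples of 32, such as 1024 or 864, are returned unchanged.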
To train on multiple GPUs with Distributed Data Parallel (DDP) mode, please refer to this shell script.
To train on the original dataset demo without splitting the dataset, use the following command:
python train.py \
--weights weights/yolov5n.pt \
--data data/task.yaml \
--hyp data/hyps/obb/hyp.finetune_dota.yaml \
--epochs 300 \
--batch-size 1 \
--img 1024 \
--device 0 \
--name /path/to/save_dir
- To detect a custom image file/folder/video, please refer to the following command:
python detect.py \
--weights /path/to/*.pt \
--source /path/to/image \
--img 1024 \
--device 0 \
--conf-thres 0.25 \
--iou-thres 0.2 \
--name /path/to/save_dir
For more details, please refer to this document.
- Export *.onnx file:
python export.py \
--weights runs/train/task/weights/best.pt \
--data data/task.yaml \
--imgsz 1024 \
--simplify \
--opset 12 \
--include onnx
- Detect with the exported onnx file using onnxruntime:
python deploy/onnxruntime/python/main.py \
--model /path/to/*.onnx \
--image /path/to/image
- Enter the directory:
cd opencv/cpp
└── cpp
    ├── CMakeLists.txt
    ├── build
    ├── image
    │   └── demo.jpg
    ├── main.cpp
    ├── model
    │   └── yolov5m_obb_csl_dotav15.onnx
    └── obb
        ├── include
        └── src
Note: OpenCV 4.6.0 or newer is recommended; v4.7.0 has been tested successfully.
- Place the images and model files in the specified directory.
- Modify the contents of the CMakeLists.txt, main.cpp, and yolo_obb.h files according to your specific requirements and use case.
- Run the demo:
mkdir build && cd build
cmake ..
The results on DOTA_subsize1024_gap200_rate1.0 test-dev set are shown in the table below. (password: yolo)
Model (download link) | Size (pixels) | TTA (multi-scale/rotate testing) | OBB mAP@0.5 test DOTAv1.0 | OBB mAP@0.5 test DOTAv1.5 | OBB mAP@0.5 test DOTAv2.0 | Speed CPU b1 (ms) | Speed 2080Ti b1 (ms) | Speed 2080Ti b16 (ms) | params (M) | FLOPs @640 (B) |
yolov5m [baidu/google] | 1024 | × | 77.3 | 73.2 | 58.0 | 328.2 | 16.9 | 11.3 | 21.6 | 50.5 |
yolov5s [baidu] | 1024 | × | 76.8 | - | - | - | 15.6 | - | 7.5 | 17.5 |
yolov5n [baidu] | 1024 | × | 73.3 | - | - | - | 15.2 | - | 2.0 | 5.0 |
Table Notes
- All checkpoints are trained to 300 epochs with COCO pre-trained checkpoints, default settings and hyperparameters.
- mAP test values on DOTA are for single-model, single-scale evaluation on the DOTA(1024,1024,200,1.0) dataset.
Reproduce Example:
python val.py --data 'data/dotav15_poly.yaml' --img 1024 --conf 0.01 --iou 0.4 --task 'test' --batch 16 --save-json --name 'dotav15_test_split'
python tools/TestJson2VocClassTxt.py --json_path 'runs/val/dotav15_test_split/best_obb_predictions.json' --save_path 'runs/val/dotav15_test_split/obb_predictions_Txt'
python DOTA_devkit/ResultMerge_multi_process.py --scrpath 'runs/val/dotav15_test_split/obb_predictions_Txt' --dstpath 'runs/val/dotav15_test_split/obb_predictions_Txt_Merged'
Zip the poly-format result files and submit them to https://captain-whu.github.io/DOTA/evaluation.html
- Speed averaged over DOTAv1.5 val_split_subsize1024_gap200 images using a 2080Ti GPU. NMS + pre-process times are included.
Reproduce by python val.py --data 'data/dotav15_poly.yaml' --img 1024 --task speed --batch 1
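The zipping step in the mAP notes above can be done with Python's standard `zipfile` module. The sketch below is one possible approach, assuming the merged results are `.txt` files sitting directly in a single folder; the function name `zip_result_files` is hypothetical.

```python
import zipfile
from pathlib import Path


def zip_result_files(result_dir, zip_path):
    """Pack every .txt file in result_dir into a flat zip archive."""
    txt_files = sorted(Path(result_dir).glob("*.txt"))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for txt in txt_files:
            # arcname drops the folder prefix so files sit at the archive root
            archive.write(txt, arcname=txt.name)
    return [txt.name for txt in txt_files]


if __name__ == "__main__":
    import sys

    # Usage: python zip_results.py <merged_results_dir> <output.zip>
    print(zip_result_files(sys.argv[1], sys.argv[2]))
```

Writing the archive outside the results folder keeps it from being mixed in with the files being packed.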
Model Name | File Size | Input Size | Configuration File |
yolov5n_obb_drone_vehicle.onnx | 8.39MB | 864 | yolov5n_obb_drone_vehicle.yaml |
yolov5s_obb_csl_dotav10.onnx | 29.8MB | 1024 | dotav1_poly.yaml |
yolov5m_obb_csl_dotav15.onnx | 83.6MB | 1024 | dotav15_poly.yaml |
yolov5m_obb_csl_dotav20.onnx | 83.6MB | 1024 | dotav2_poly.yaml |
This project relies on the following open-source projects and resources: