Vehicle Detection with RetinaNet

Vehicle and pedestrian detection and tracking play a vital role in autonomous driving. In a previous project, I implemented a vehicle detection and tracking pipeline based on traditional computer vision techniques. This project explores the application of RetinaNet to the vehicle detection task.

Dataset

The training and evaluation in this project are based on the Udacity annotated driving dataset. It includes driving in Mountain View, California and neighboring cities during daylight conditions. I combined the two datasets and retained only the bounding box annotations for car, truck, and pedestrian. The combined dataset

Here's an overview of the dataset

Model evaluation

In this project, I'm interested in the detection accuracy of the models as well as their inference speed. The goal is to find a model that can detect vehicles with good accuracy in real time.

The accuracy of the models is primarily evaluated by mean Average Precision (mAP) and mean Average Recall (mAR) at an IoU threshold of 0.5.
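To make the metric concrete, here is a minimal sketch of IoU and of a per-class average precision at an IoU threshold of 0.5. It assumes boxes in `(x1, y1, x2, y2)` form, greedy matching of detections to ground truth in order of descending score, and an all-points interpolated precision-recall curve. The function names and data layout are illustrative, and the evaluation code in this repository may differ in its details.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """AP for a single class.

    detections: list of (image_id, score, box)   (assumed layout)
    ground_truth: dict mapping image_id -> list of boxes
    """
    total_gt = sum(len(boxes) for boxes in ground_truth.values())
    if total_gt == 0:
        return 0.0
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}

    # Visit detections from most to least confident and mark true positives.
    tp_flags = []
    for image_id, _score, box in sorted(detections, key=lambda d: -d[1]):
        best_iou, best_idx = 0.0, -1
        for idx, gt_box in enumerate(ground_truth.get(image_id, [])):
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        is_tp = (best_idx >= 0 and best_iou >= iou_threshold
                 and not matched[image_id][best_idx])
        if is_tp:
            matched[image_id][best_idx] = True
        tp_flags.append(is_tp)

    # Precision and recall after each detection.
    precisions, recalls, tp = [], [], 0
    for rank, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / rank)
        recalls.append(tp / total_gt)

    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

Averaging this value over the car, truck, and pedestrian classes would give a mAP figure under these assumptions.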

The models being benchmarked are:

Sliding window method based on HOG features and a linear classifier
RetinaNet with ResNet50 backbone, pre-trained on COCO
RetinaNet with ResNet18 backbone, trained on driving dataset
RetinaNet with MobileNet backbone, trained on driving dataset
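For the sliding window baseline in the list above, the following is a minimal sketch of how candidate windows can be enumerated before each one is scored by a classifier. The window size and stride are hypothetical values chosen for illustration, not the settings used in this project, and the HOG feature extraction and linear classifier steps are omitted.

```python
def sliding_windows(image_width, image_height, window_size=64, stride=16):
    """Yield (x1, y1, x2, y2) for every square window that fits in the image.

    window_size and stride are illustrative defaults (assumptions).
    """
    for y in range(0, image_height - window_size + 1, stride):
        for x in range(0, image_width - window_size + 1, stride):
            yield (x, y, x + window_size, y + window_size)


# Example: a small 128x96 image scanned with 64-pixel windows and a 32-pixel stride.
windows = list(sliding_windows(128, 96, window_size=64, stride=32))
```

Because every window is classified separately, the number of windows grows quickly as the stride shrinks or as more window scales are added, so the per-frame cost of this approach can become large.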

Main results

Benchmark

Model	AP50 (car)	AP50 (truck)	AP50 (pedestrian)	# of parameters (millions)	CPU inference (s/frame)	GPU inference (s/frame)
HOG	24.6	-	-	-	6.9	-
RetinaNet-ResNet50 pre-trained on COCO	71.8	53.4	32.4	37.4	2.0	0.14
RetinaNet-ResNet18-64	66.7	54.1	27.2	12.0	1.4	0.1
RetinaNet-ResNet18-48	66.1	51.0	18.8	7.0	1.2	0.09
RetinaNet-ResNet18-32	71.9	55.2	34.7	3.4	0.97	0.09
RetinaNet-MobileNet-1	73.3	54.6	42.4	4.4	1.1	0.1
RetinaNet-MobileNet-0.75	67.6	57.2	29.6	2.8	1.0	0.07
RetinaNet-MobileNet-0.5	65.3	55.2	36.3	1.6	0.77	0.055
RetinaNet-MobileNet-0.25	67.6	54.1	38.2	0.84	0.54	0.05
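Since the goal is real-time detection, it can help to read the inference times as frame rates. The short sketch below converts a few of the GPU timings from the table into frames per second. The dictionary and function names are illustrative, and only a subset of rows is copied.

```python
# GPU inference times in seconds per frame, copied from the benchmark table.
gpu_seconds_per_frame = {
    "RetinaNet-ResNet50 pre-trained on COCO": 0.14,
    "RetinaNet-ResNet18-32": 0.09,
    "RetinaNet-MobileNet-1": 0.1,
    "RetinaNet-MobileNet-0.25": 0.05,
}


def frames_per_second(seconds_per_frame):
    """Convert a per-frame latency into a frame rate."""
    return 1.0 / seconds_per_frame


fps = {name: frames_per_second(t) for name, t in gpu_seconds_per_frame.items()}
for name, value in fps.items():
    print(f"{name}: {value:.1f} FPS")
```

Read this way, RetinaNet-MobileNet-0.25 at 0.05 s/frame runs at about 20 frames per second on the GPU, while the COCO pre-trained ResNet50 model at 0.14 s/frame runs at about 7.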

Example detection result

Vehicle tracking on video

Here's the result of running RetinaNet-ResNet50-COCO on a dash camera video

Here's the result of running RetinaNet-MobileNet-0.25 on a dash camera video

Appendix

The following graph shows the structure of the feature pyramid network (FPN) built on top of the ResNet backbone.
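As a rough aid to reading the FPN structure, the sketch below computes the spatial size of each pyramid level for a given input size. It assumes the level layout described in the RetinaNet paper, where levels P3 to P7 have strides of 2^3 to 2^7 relative to the input, and it rounds sizes up, which depends on the padding used. Both are assumptions and may not match the exact configuration of the models in this project. The function name is illustrative.

```python
def pyramid_shapes(image_height, image_width, levels=(3, 4, 5, 6, 7)):
    """Return the (height, width) of each pyramid level's feature map.

    Assumes level P_l has stride 2**l and that sizes round up (ceil division).
    """
    shapes = {}
    for level in levels:
        stride = 2 ** level
        height = -(-image_height // stride)  # ceil division
        width = -(-image_width // stride)
        shapes[f"P{level}"] = (height, width)
    return shapes


# Example with a hypothetical 512x640 input.
shapes = pyramid_shapes(512, 640)
```

Each level halves the resolution of the one below it, so the coarse levels cover large objects cheaply while the finer levels keep enough resolution for small ones.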

The following graph shows the structure of the regression and classification subnets.
