Deniz Kruppe
Optimizing the Energy Efficiency of YOLO-based Object Detection on FPGAs

Abstract
YOLO (You Only Look Once) is a family of state-of-the-art object detection systems that reach real-time frame rates (above 30 FPS). Over the past decade, there has been increasing interest in using FPGAs (Field Programmable Gate Arrays) for deep learning applications. For sufficiently complex algorithms such as those used in object detection, FPGAs are typically much faster than CPUs, consume less power than GPUs, and allow shorter development times and lower development costs than ASICs (Application-Specific Integrated Circuits) when flexible reconfiguration is desired. In this thesis, different implementations of the Xilinx DPU (Deep Learning Processor Unit), a hardware accelerator for deep learning, were created with the Vitis-AI tool chain as part of a design space exploration over its configuration parameters. Additionally, YOLOv4 was trained on a custom image data set and deployed to each of the hardware implementations. The resulting designs were benchmarked for power consumption, hardware resource usage, detection accuracy and processing speed.