Academy & Industry Research Collaboration Center (AIRCC)

Volume 9, Number 17, December 2019

Optimizing the Performance of Convolutional Neural Networks on Raspberry PI for
Real-time Object Detection

  Authors

Hyun Woo Jung, Hankuk Academy of Foreign Studies, Republic of Korea

  Abstract

Deep learning has facilitated major advancements in various fields, including image detection. This paper is an exploratory study on improving the performance of Convolutional Neural Network (CNN) models in environments with limited computing resources, such as the Raspberry Pi. A pretrained, state-of-the-art model for near-real-time object detection in videos, the YOLO (“You-Only-Look-Once”) CNN, was selected for evaluating strategies to optimize runtime performance. Performance analysis tools provided by the Linux kernel were used to measure CPU time and memory footprint. Our results show that loop parallelization, static compilation of weights, and flattening of convolution layers reduce the total runtime by 85% and the memory footprint by 53% on a Raspberry Pi 3 device. These findings suggest that the methodological improvements proposed in this work can reduce the computational load of running CNN models on devices with limited computing resources.
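
As an illustration of the first strategy named in the abstract, loop parallelization, the following is a minimal sketch and not the paper's implementation. It assumes a single-channel, stride-1, unpadded convolution, and the function names (`conv_row`, `conv2d_serial`, `conv2d_parallel`) are hypothetical. Each output row depends only on the input and the kernel, so the outer loop over rows can be split across workers. In CPython the thread pool shows the decomposition rather than a real speedup; a compiled implementation would distribute the same loop across the CPU cores.

```python
from concurrent.futures import ThreadPoolExecutor


def conv_row(image, kernel, r):
    """Compute output row r of a stride-1, unpadded 2D convolution.

    As is usual in CNNs, the kernel is applied without flipping
    (strictly speaking, a cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_w = len(image[0]) - kw + 1
    return [
        sum(image[r + i][c + j] * kernel[i][j]
            for i in range(kh) for j in range(kw))
        for c in range(out_w)
    ]


def conv2d_serial(image, kernel):
    """Baseline: iterate over the output rows one after another."""
    out_h = len(image) - len(kernel) + 1
    return [conv_row(image, kernel, r) for r in range(out_h)]


def conv2d_parallel(image, kernel, workers=4):
    """Same computation, with the outer row loop split across workers.

    Rows are independent, so no synchronization is needed;
    pool.map returns the rows in their original order.
    """
    out_h = len(image) - len(kernel) + 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: conv_row(image, kernel, r),
                             range(out_h)))


# Hypothetical 5x5 input and 2x2 kernel, chosen only for illustration.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[1, 0], [0, -1]]

serial = conv2d_serial(image, kernel)
parallel = conv2d_parallel(image, kernel)
```

Both functions return the same 4x4 output, which is the property that makes this loop safe to parallelize: no iteration reads a value written by another.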

  Keywords

Deep Learning, Convolutional Neural Networks, Raspberry Pi, real-time object detection