A Multimodal Driver-Safety System to Support Teen Drivers Using Computer Vision and Sensor Fusion

Authors

Max Liu 1, Andrew Park 2

1 USA, 2 University of California, USA

Abstract

Teen drivers experience disproportionately high crash rates, largely due to inexperience and inconsistent attention to basic traffic rules. To address this issue, we developed a multimodal driver-safety coaching system deployed on a low-cost Raspberry Pi platform. The system uses three trained YOLO-based computer vision models to detect traffic lights, illuminated bulbs within traffic lights, and traffic signs. An Optical Character Recognition (OCR) module extracts numerical speed limit values, which are combined with GPS data to identify speeding behavior, while a trained audio classification model and IMU data are used to determine whether turn signals are activated during turning maneuvers. Post-detection processing techniques are applied to smooth noisy detections over time and trigger prioritized voice alerts via text-to-speech (TTS). Key challenges include achieving sufficient model accuracy under diverse environmental conditions, maintaining real-time performance on resource-constrained hardware, and coordinating multiple hardware devices through sensor fusion. Experimental results demonstrate both the strengths and limitations of the system, guiding future improvements. Overall, the proposed system illustrates a practical, low-cost approach to helping novice drivers develop safer driving behaviors in real-world settings.
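The post-detection smoothing and speeding check described above can be illustrated with a minimal sketch. The window size, speed tolerance, and class names below are illustrative assumptions, not the system's actual parameters: per-frame labels are stabilized by a sliding-window majority vote before any alert is issued, and GPS speed is compared against the OCR-extracted limit.

```python
from collections import deque, Counter

class DetectionSmoother:
    """Majority-vote smoothing over a sliding window of per-frame labels.

    Hypothetical helper: suppresses one-off misdetections by only
    reporting a label once it holds a strict majority of recent frames.
    """
    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, label):
        self.history.append(label)
        top_label, count = Counter(self.history).most_common(1)[0]
        # Report only when the label is a strict majority of the window.
        return top_label if count > len(self.history) // 2 else None

def speeding_alert(gps_speed_mph, ocr_limit_mph, tolerance=2.0):
    """Compare GPS speed with the OCR-extracted limit (tolerance is illustrative)."""
    if ocr_limit_mph is not None and gps_speed_mph > ocr_limit_mph + tolerance:
        return f"Slow down: limit {ocr_limit_mph:.0f} mph"
    return None
```

In the full system, the string returned by a check like `speeding_alert` would be queued by priority and spoken through the TTS module rather than printed.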

Keywords

Computer vision, Multimodal sensor fusion, Driver behavior monitoring