Xingtong Zou¹ and Jonathan Sahagun², ¹USA, ²California State University, USA
Remote-controlled robotic vehicles increasingly require augmented reality (AR) capabilities to improve operator situational awareness, yet affordable platforms that combine real-time video streaming with AR overlay remain scarce [10]. This project presents an integrated system that pairs an ESP32-CAM robot car with a Unity iOS application featuring OpenCV-based marker detection. The ESP32 firmware provides MJPEG video streaming and PWM motor control through a Wi-Fi access point architecture, while the mobile application implements ORB feature matching for rotation- and scale-invariant marker detection with 3D object overlay. Three key challenges are addressed: video streaming latency, reduced through multi-threaded decoding; computational cost, reduced through frame skipping and descriptor caching; and hardware resource conflicts, resolved through careful LEDC timer allocation. Experimental evaluation showed a mean video latency of 142-251 ms, depending on distance, and a detection accuracy of 98.2% under optimal conditions. Comparison with ORB-SLAM, ArUco, and IoT streaming research highlights this system's combination of accessibility, flexibility, and integrated functionality [1]. The solution enables sophisticated AR robotics using components costing under $50.
Augmented Reality Robotics, Real-Time Video Streaming, Computer Vision Marker Detection, Embedded IoT Systems
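The frame skipping and descriptor caching mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' Unity implementation: the class `SkippingDetector`, its parameters, and the callbacks `compute_reference` and `detect` are hypothetical names introduced here, and the interval of 3 is an assumed value. The idea shown is that the marker's reference descriptors are computed once and reused, and the expensive detection step runs only on every `interval`-th frame, with the previous result returned for the frames in between.

```python
from typing import Any, Callable


class SkippingDetector:
    """Run an expensive detector on every `interval`-th frame only.

    Hypothetical illustration; names and defaults are assumptions,
    not taken from the paper.
    """

    def __init__(
        self,
        compute_reference: Callable[[], Any],
        detect: Callable[[Any, Any], Any],
        interval: int = 3,  # assumed value, for illustration only
    ) -> None:
        if interval < 1:
            raise ValueError("interval must be >= 1")
        self._compute_reference = compute_reference
        self._detect = detect
        self._interval = interval
        self._reference: Any = None      # cached marker descriptors
        self._has_reference = False
        self._last_result: Any = None    # result reused on skipped frames
        self._frame_index = 0
        self.detect_calls = 0            # counts how often detect() really ran

    def _reference_descriptors(self) -> Any:
        # Descriptor caching: compute the marker descriptors once, then reuse.
        if not self._has_reference:
            self._reference = self._compute_reference()
            self._has_reference = True
        return self._reference

    def process(self, frame: Any) -> Any:
        # Frame skipping: only frames 0, interval, 2*interval, ... are analysed.
        run_detection = self._frame_index % self._interval == 0
        self._frame_index += 1
        if run_detection:
            self._last_result = self._detect(frame, self._reference_descriptors())
            self.detect_calls += 1
        return self._last_result
```

In a real pipeline, `compute_reference` would extract ORB descriptors from the marker image and `detect` would match them against the current frame; here both are left as callbacks so the skipping and caching logic can be read in isolation. A larger interval lowers the processing load but lets the overlay lag behind fast marker motion.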