Accelerating Experience Replay for Deep Q-Networks with Reduced Target Computation

doi:10.5121/csit.2023.130101

Volume 13, Number 01, January 2023

Authors

Bob Zigon¹ and Fengguang Song², ¹Beckman Coulter, USA, ²Indiana University-Purdue University, USA

Abstract

The seminal deep reinforcement learning paper by Mnih et al., which applied a Deep Q-Network (DQN) to Atari video games, demonstrated the importance of a replay buffer and a target network. Although both were required for convergence, the replay buffer came at a significant computational cost: with each new sample generated by the system, the targets in the mini-batch buffer were recomputed. We propose an alternative, TAO-DQN (Target Accelerated Optimization-DQN), that eliminates this target recomputation. Our approach centers on a new replay buffer algorithm that lowers the computational burden. We evaluated the approach in three experiments using environments from OpenAI Gym. It converged to better policies in fewer episodes and in less time. Furthermore, we offer a mathematical justification for the improved convergence rate.
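
For context, the per-step target recomputation that the abstract refers to can be sketched as below. This is a minimal illustration of the standard DQN baseline, not of TAO-DQN, whose algorithm is not described in this abstract. The names `q_target` and `dqn_targets`, the toy two-action Q-function, the scalar states, and the discount value are all assumptions made for the example.

```python
import random
from collections import deque

GAMMA = 0.99  # discount factor (illustrative value, not taken from the paper)


def q_target(state):
    """Hypothetical stand-in for a frozen target network.

    Returns made-up Q-values for two actions given a scalar state.
    """
    return [0.5 * state, 1.0 - 0.5 * state]


def dqn_targets(batch, gamma=GAMMA):
    """Compute y = r + gamma * max_a' Q_target(s', a') for each transition.

    In baseline DQN this runs on every training step, so the targets of
    the sampled mini-batch are recomputed again and again.
    """
    targets = []
    for state, action, reward, next_state, done in batch:
        # Terminal transitions do not bootstrap from the next state.
        bootstrap = 0.0 if done else gamma * max(q_target(next_state))
        targets.append(reward + bootstrap)
    return targets


# Replay buffer of (state, action, reward, next_state, done) tuples.
buffer = deque(maxlen=1000)
buffer.append((0.0, 1, 1.0, 1.0, False))
buffer.append((1.0, 0, 0.0, 0.0, True))

# One training step: sample a mini-batch and recompute its targets.
batch = random.sample(list(buffer), k=2)
targets = dqn_targets(batch)
```

In this baseline the call to `dqn_targets` sits inside the training loop, which is the cost the abstract says a new replay buffer algorithm is meant to remove.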

Keywords

DQN, Experience Replay, Replay Buffer, Target Network.
