Ryan Shen1 and Andrew Park2, 1USA, 2California State Polytechnic University, USA
This paper presents a Unity-based reinforcement learning system for simulating rocket descent and landing. Built on the Unity ML-Agents framework, our approach combines Proximal Policy Optimization (PPO) with imitation learning to balance exploration with guided behavior [8]. Unlike prior work, our system introduces vertical dynamics, randomized initial conditions to reduce overfitting, and variable environmental factors such as gravity, drag, rocket mass, and thruster power. We further refine the reward structure with precision- and time-based incentives, including a “bullseye bonus” for accuracy and a time bonus for efficiency. Experimental results show that our rocket agents achieve success rates competitive with existing implementations, even under these more complex conditions. By extending Unity’s simulation environment with both technical rigor and user-oriented design, this work contributes to reinforcement learning applications in aerospace while promoting accessibility and engagement for broader audiences interested in space exploration technologies [9].
Unity, Machine Learning, Rockets, Landing
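To make the reward structure described in the abstract more concrete, the following is a minimal Python sketch of how a terminal reward with a bullseye bonus and a time bonus might be composed. It is not the authors' implementation: the names `RewardConfig` and `terminal_reward`, the linear falloff of both bonuses, and every numeric value are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardConfig:
    # All values below are illustrative assumptions, not taken from the paper.
    landing_reward: float = 1.0   # base reward for a successful landing
    crash_penalty: float = -1.0   # reward when the episode ends without landing
    bullseye_bonus: float = 0.5   # maximum extra reward for landing at pad center
    pad_radius: float = 5.0       # distance at which the bullseye bonus reaches zero
    time_bonus: float = 0.25      # maximum extra reward for landing quickly
    max_steps: int = 1000         # episode step budget


def terminal_reward(landed: bool, distance_to_center: float, steps_used: int,
                    cfg: RewardConfig = RewardConfig()) -> float:
    """Hypothetical end-of-episode reward with precision and time incentives."""
    if not landed:
        return cfg.crash_penalty

    reward = cfg.landing_reward

    # Precision term: full bonus at the pad center, falling linearly to zero
    # at the pad edge (assumed shape).
    precision = max(0.0, 1.0 - distance_to_center / cfg.pad_radius)
    reward += cfg.bullseye_bonus * precision

    # Efficiency term: full bonus for an instantaneous landing, falling
    # linearly to zero when the whole step budget is used (assumed shape).
    efficiency = max(0.0, 1.0 - steps_used / cfg.max_steps)
    reward += cfg.time_bonus * efficiency

    return reward


if __name__ == "__main__":
    # A landing halfway to the pad edge, using half of the step budget.
    print(terminal_reward(True, distance_to_center=2.5, steps_used=500))
```

Keeping both bonuses smaller than the base landing reward, as assumed here, means a safe landing always outweighs precision or speed alone; the weighting actually used in the system may differ.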