Hierarchical Deep Reinforcement Learning with Spatiotemporal Encoding for Resilient, Decentralized Multi-Agent Tasking in 2D Environments

Authors

Hubert Kyeremateng-Boateng, Anthony Herron, Oluwabukunmi David Jaiyeola, Seonho Choi, and Darsana Josyula, Bowie State University, Maryland, USA

Abstract

Multi-agent task allocation under strict temporal constraints poses fundamental challenges. We consider a fully decentralized setting: agents operate on a shared 2D grid, each observing only its own state and assigned task set, and acting in real time without centralized coordination. Hard deadlines impose zero-yield penalties on late harvests, making deadline satisfaction a hard constraint rather than a soft incentive. We present a hierarchical deep reinforcement learning (HDRL) framework that integrates explicit spatiotemporal encoding with a decentralized neural retasking mechanism. Specialized neural encoders capture distance-aware spatial relationships and deadline-sensitive temporal dependencies as first-class architectural components, enabling agents to trade off immediate task value against time criticality and user-defined priorities. A lightweight retasking network performs real-time reassignment of orphaned tasks upon agent failure without centralized coordination. We evaluate the framework on a multi-agent flower-harvesting benchmark featuring soft and hard deadlines, dynamic priority constraints, and a 30% stochastic agent-failure rate, comparing Double Deep Q-Networks (DDQN) and REINFORCE Policy Gradient (PG) against a Mixed-Integer Linear Programming (MILP) optimum. Results reveal a horizon-dependent crossover: DDQN leads at short time budgets ( timesteps), while PG surpasses DDQN and exceeds the MILP optimum from onward, reaching 117.8% of the MILP objective at by recovering from agent failures that the static MILP cannot anticipate. These results establish hierarchical spatiotemporal encoding as a viable approach for real-time multi-agent coordination in robotics, logistics, and autonomous systems.
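To make the abstract's trade-off between task value, time criticality, and user-defined priorities concrete, the sketch below shows one plausible deadline-aware scoring rule a decentralized agent could apply when ranking its assigned tasks. This is an illustrative assumption, not the paper's actual encoder or network: the `Task` fields, the `task_score` function, and the `urgency_weight` parameter are all hypothetical, standing in for what the learned spatial and temporal encoders would represent. It does capture the stated hard-deadline semantics, where a harvest that cannot arrive in time yields zero.

```python
from dataclasses import dataclass

@dataclass
class Task:
    value: float     # reward for completing the task
    deadline: int    # timestep by which the task must be harvested
    priority: float  # user-defined priority weight
    hard: bool       # hard deadline: zero yield if missed

def task_score(task: Task, eta: int, now: int, urgency_weight: float = 1.0) -> float:
    """Hypothetical greedy score for a candidate task.

    eta: distance-aware estimated travel time to the task (in timesteps).
    A hard-deadline task that cannot be reached in time scores 0 (zero yield),
    so deadline satisfaction acts as a constraint, not a soft incentive.
    """
    arrival = now + eta
    slack = task.deadline - arrival
    if task.hard and slack < 0:
        return 0.0  # late hard-deadline harvest yields nothing
    # Less slack -> higher urgency, so time-critical tasks rank earlier.
    urgency = urgency_weight / (1 + max(slack, 0))
    return task.priority * task.value * (1 + urgency)
```

Under this rule, two tasks of equal value and priority are ordered by slack: the one whose deadline is nearer (but still reachable) scores higher, which is the qualitative behavior the learned temporal encoding is meant to provide.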

Keywords

multi-agent reinforcement learning, hierarchical reinforcement learning, spatiotemporal encoding, task allocation, fault tolerance, DDQN