Optimizing Dynamic Routing in 6G Networks using a Multi-Agent Multi-Step Deep Q-Learning Algorithm

Volume 18, Number 3

Authors

Nachimuthu Senthil and Sumathiarumugam, KPR College of Arts, India

Abstract

The proliferation of 6G networks poses new challenges to traditional methods of network management because of the massive volumes of data generated and the wide variety of devices these networks link. The shortcomings of traditional methods make a paradigm shift towards AI frameworks built on Machine Learning (ML) and Deep Learning (DL) essential. A Speed-optimized Attention-based Hybrid Graph Convolutional Network-Long Short-Term Memory (SPAH-GCN-LSTM) model and a Reinforcement Learning (RL) framework utilizing Q-Learning (QL) were developed to forecast network congestion and enhance data transmission routes, respectively. Nevertheless, in a dynamic network, a single agent switching between policies can introduce uncertainty into routing decisions. Moreover, the time needed to train a single agent becomes prohibitive as the network grows. Although multi-agent RL has been used to alleviate this problem, classical QL can suffer from non-stationarity, which arises because the other agents learn jointly in a multi-agent stochastic-game environment. This manuscript therefore presents a Multi-Agent Multi-Step Deep QL (MAMS-DQL) system aimed at optimizing dynamic routes in 6G networks. The main goal is a decentralized mechanism in which every agent chooses its best routing strategy independently. The method adopts a multi-agent dueling deep Q-network architecture to optimize routing decisions and identify the most efficient route through the network. It also uses a multi-step experience-replay strategy, which allows agents to adjust their routing strategy by exploiting multi-step experiences across consecutive time steps of training. Finally, the simulation results show that MAMS-DQL achieves higher routing efficiency than traditional reinforcement-learning approaches.
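The two building blocks named in the abstract, multi-step experience replay and the dueling Q-network, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `n_step_transitions` and `dueling_q`, the tuple layout, and the use of a scalar reward per hop (for example, a negative link delay) are assumptions made only for illustration. The first function folds consecutive one-step transitions into n-step returns suitable for a replay buffer; the second applies the standard dueling aggregation Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)).

```python
def n_step_transitions(trajectory, n, gamma):
    """Fold 1-step transitions into n-step transitions for a replay buffer.

    trajectory: list of (state, action, reward, next_state, done) tuples
                collected by one agent over consecutive time steps.
    Returns a list of (state, action, n_step_return, bootstrap_state, done, steps),
    where n_step_return = r_t + gamma * r_{t+1} + ... over at most n steps.
    """
    out = []
    for start in range(len(trajectory)):
        state, action = trajectory[start][0], trajectory[start][1]
        ret, discount, steps = 0.0, 1.0, 0
        for _, _, reward, next_state, done in trajectory[start:start + n]:
            ret += discount * reward   # accumulate the discounted reward
            discount *= gamma
            steps += 1
            if done:                   # stop at the end of the episode
                break
        # next_state is the state to bootstrap from with gamma**steps * max_a Q
        out.append((state, action, ret, next_state, done, steps))
    return out


def dueling_q(value, advantages):
    """Combine a state value and per-action advantages into Q-values."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]
```

In a full agent, each returned tuple would be stored in the replay buffer, and the training target for a sampled tuple would be the n-step return plus gamma**steps times the maximum Q-value at the bootstrap state when the episode has not ended.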

Keywords

6G networks, SPAH-GCN-LSTM, Multi-agent RL, Deep Q-network, Multi-step experience replay strategy
