Maryam Solaiman1, Theodore Mui1, Qi Wang2 and Phil Mui3, 1Aspiring Scholars Directed Research Program, Fremont, USA, 2University of Texas at Austin, Texas, USA, 3Salesforce, San Francisco, USA
We model unlearning by simulating a Q-agent (trained with the Q-learning reinforcement learning algorithm), representing a real-world learner, that plays the game of Nim against different adversarial agents in order to learn the optimal Nim strategy. When the Q-agent plays against sub-optimal agents, its percentage of optimal moves decreases, analogous to a person forgetting ("unlearning") what they have previously learned. To mitigate this "unlearning", we experimented with modulating the Q-learning update so that minimal learning occurs against untrusted opponents. This trust-based modulation is driven by observing opponent moves that differ from those the Q-agent has already learned, paralleling human trust, which tends to increase toward those with whom one agrees. With this modulated learning, we observe that a Q-agent starting from an optimal baseline strategy robustly retains its previously learned strategy, in some cases achieving a 0.3 difference in accuracy compared with the unmodulated ("unlearning") model. We then ran a three-phase simulation in which the Q-agent played against optimal agents in the first phase, sub-optimal agents in the second ("unlearning") phase, and optimal or random agents in the third phase. We found that even after unlearning, the Q-agent was able to quickly relearn most of its knowledge of the optimal strategy for Nim.
Reinforcement learning, Q-learning, Nim Game, Unlearning, Learned Memory, Misinformation
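As a rough illustration of the trust-modulated Q-learning described in the abstract, the sketch below scales the Q-learning rate by an estimate of trust in the current opponent, raising trust when the opponent's move agrees with the agent's learned best move and lowering it otherwise. All names, parameter values, and the exact trust-update rule here are assumptions for illustration, not the paper's implementation.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9      # assumed base learning rate and discount factor
Q = defaultdict(float)       # Q[(state, action)] -> estimated action value
trust = 0.5                  # assumed trust in the current opponent, kept in [0, 1]

def best_action(state, actions):
    """Greedy action under the current Q-table."""
    return max(actions, key=lambda a: Q[(state, a)])

def update_trust(state, opponent_action, actions, step=0.05):
    """Hypothetical trust update: raise trust when the opponent's move matches
    the agent's learned best move, lower it otherwise."""
    global trust
    agrees = opponent_action == best_action(state, actions)
    trust = min(1.0, trust + step) if agrees else max(0.0, trust - step)

def q_update(state, action, reward, next_state, next_actions):
    """Standard Q-learning update with the learning rate scaled by trust,
    so that little is learned from untrusted opponents."""
    target = reward + GAMMA * max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += trust * ALPHA * (target - Q[(state, action)])
```

In this sketch, a fully untrusted opponent (trust near 0) effectively freezes the Q-table, which is one way to realize the "minimal learning occurs against untrusted opponents" behavior described above.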