Miguel Angel Medina-Ramírez, Cayetano Guerra-Artal and Mario Hernández-Tejera, University of Las Palmas de Gran Canaria, Spain
Task-oriented dialogue systems have become crucial for users to interact with machines and computers using natural language. One of their key components is the dialogue manager, which guides the conversation towards a satisfactory outcome for the user by providing the best possible response. Previous works have proposed rule-based systems, reinforcement learning, and supervised learning as solutions for dialogue management, that is, selecting the best response given the user's input. This work explores the impact of dataset quality on the performance of dialogue managers. We examine potential errors in popular datasets, such as MultiWOZ 2.1 and SGD. For our investigation, we developed a synthetic dialogue generator that controls the type and magnitude of the errors introduced. Our findings suggest that dataset inaccuracies, such as mislabeling, might play a significant role in the challenges faced in dialogue management.
Dialogue systems, dialogue management, dataset quality, supervised learning