Enhancing Designer Knowledge to Dialogue Management: A Comparison between Supervised and Reinforcement Learning Approaches

  • Bruno Eidi Nishimoto, USP / Itaú Unibanco
  • Rogers Silva Cristo, Itaú Unibanco
  • Alex Fernandes Mansano, Itaú Unibanco
  • Eduardo Raul Hruschka, USP
  • Vinicius Fernandes Caridá, Itaú Unibanco
  • Anna Helena Reali Costa, USP


Task-oriented dialogue systems are complex natural language applications employed in various fields such as health care, sales assistance, and digital customer service. Although the literature suggests several approaches to managing this type of dialogue system, only a few studies compare the performance of different techniques. In this paper, we present a comparison between supervised learning, using the transformer architecture, and reinforcement learning, using two flavors of the Deep Q-Learning (DQN) algorithm. Our experiments use the MultiWOZ dataset and a real-world digital customer service dataset, and show that integrating expert pre-defined rules with DQN outperforms the supervised approaches. We also propose a method that makes better use of the designer’s knowledge by improving how interactions collected during warm-up are used in the training phase. Our results indicate that keeping the designer’s knowledge, expressed as pre-defined rules, in memory during the initial steps of the DQN training procedure reduces training time.
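The idea of keeping warm-up interactions in memory during early DQN training can be illustrated with a small replay buffer. This is only a minimal sketch under stated assumptions, not the authors' implementation: the class name `WarmupPreservingReplayBuffer`, the `protect_steps` parameter, and the two-queue layout are all hypothetical choices made for illustration. Transitions generated by the pre-defined rules are stored separately and are not evicted until a chosen number of training steps has passed.

```python
import random
from collections import deque


class WarmupPreservingReplayBuffer:
    """Replay memory that shields rule-based warm-up transitions early on.

    Hypothetical sketch: `protect_steps` and the two-queue layout are
    illustrative assumptions, not details taken from the paper.
    """

    def __init__(self, capacity, protect_steps):
        self.capacity = capacity            # maximum number of stored transitions
        self.protect_steps = protect_steps  # training steps during which warm-up data is kept
        self.warmup = deque()               # transitions produced by the pre-defined rules
        self.agent = deque()                # transitions produced by the learning agent
        self.train_steps = 0

    def __len__(self):
        return len(self.warmup) + len(self.agent)

    def add_warmup(self, transition):
        """Store a transition collected by the rule-based policy during warm-up."""
        self.warmup.append(transition)
        self._evict()

    def add_agent(self, transition):
        """Store a transition collected by the DQN agent during training."""
        self.agent.append(transition)
        self._evict()

    def step(self):
        """Call once per training step so the protection window can expire."""
        self.train_steps += 1

    def sample(self, batch_size):
        """Draw a uniform random batch from warm-up and agent transitions."""
        pool = list(self.warmup) + list(self.agent)
        return random.sample(pool, min(batch_size, len(pool)))

    def _evict(self):
        # Drop old transitions until the buffer fits its capacity again.
        while len(self) > self.capacity:
            protected = self.train_steps < self.protect_steps
            if self.warmup and not protected:
                self.warmup.popleft()   # warm-up data is the oldest, so it goes first
            elif self.agent:
                self.agent.popleft()    # while protected, only agent data is evicted
            else:
                break                   # only protected warm-up data is left


if __name__ == "__main__":
    buffer = WarmupPreservingReplayBuffer(capacity=4, protect_steps=2)
    for i in range(2):
        buffer.add_warmup(("rule", i))
    for i in range(5):
        buffer.add_agent(("agent", i))
    print(len(buffer.warmup), len(buffer.agent))
```

In this sketch, the protection window is a fixed step count. Other criteria, such as a success-rate threshold, would fit the same interface.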


NISHIMOTO, Bruno Eidi; CRISTO, Rogers Silva; MANSANO, Alex Fernandes; HRUSCHKA, Eduardo Raul; CARIDÁ, Vinicius Fernandes; COSTA, Anna Helena Reali. Enhancing Designer Knowledge to Dialogue Management: A Comparison between Supervised and Reinforcement Learning Approaches. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 19., 2022, Campinas/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 364-376. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2022.227625.

