Distilling Gaming Strategy through Explainability in Tetris
Abstract
As artificial intelligence (AI) systems become increasingly integrated into game design, the demand for transparent and adaptive decision-making grows. While Explainable AI (XAI) has illuminated the internal reasoning of AI agents, most explanation-based training methods prioritize alignment with a teacher model over the exploration of strategic diversity. In this paper, we introduce a novel framework that leverages explanation-based knowledge distillation to modulate agents’ internal reasoning, yielding both convergent and divergent behavioral strategies. We evaluate this approach in a Tetris environment, comparing baseline agents trained with standard reinforcement learning against agents whose training incorporates explainability losses. The framework is dynamic: a feedback mechanism adjusts the influence of the explainability term based on performance and strategic utility. This work demonstrates the potential of explainability not only as an interpretive tool but also as a means to actively diversify and refine strategies in complex, dynamic environments.
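The core mechanism can be sketched as follows. This is a minimal illustration only, assuming a PyTorch DQN-style agent and teacher-provided attributions (e.g., SHAP-like saliency maps over the board state); the function names, the gradient-based saliency proxy, and the threshold-style weight update are hypothetical stand-ins, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def distillation_step(q_net, teacher_saliency, batch, lambda_xai):
    """One training step combining a TD loss with an explanation-alignment loss.
    All names here are illustrative (hypothetical), not the paper's API."""
    states, actions, td_targets = batch
    states = states.clone().requires_grad_(True)

    # Standard DQN-style temporal-difference loss on the chosen actions.
    q_values = q_net(states)
    chosen_q = q_values.gather(1, actions.unsqueeze(1)).squeeze(1)
    task_loss = F.mse_loss(chosen_q, td_targets)

    # Saliency proxy for the student's reasoning: gradient of the chosen
    # Q-values with respect to the input board state.
    student_saliency = torch.autograd.grad(
        chosen_q.sum(), states, create_graph=True
    )[0]

    # Explainability term: distance between student and teacher attributions.
    # Minimizing it pulls the student's reasoning toward the teacher
    # (convergent strategies); a negative lambda_xai would instead reward
    # divergence from the teacher's reasoning.
    explanation_loss = F.mse_loss(student_saliency, teacher_saliency)

    return task_loss + lambda_xai * explanation_loss

def update_lambda(lambda_xai, recent_score, baseline_score, step=0.01):
    """Hypothetical feedback rule: strengthen the explainability term while
    it helps performance, relax it otherwise."""
    if recent_score >= baseline_score:
        return min(1.0, lambda_xai + step)
    return max(0.0, lambda_xai - step)

In a training loop, update_lambda would be invoked periodically with recent episode scores, so the weight on the explanation term tracks its measured strategic utility rather than remaining fixed.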