A Monte Carlo Algorithm for Time-Constrained General Game Playing
Abstract
General Game Playing (GGP) is a challenging domain for AI agents, as it requires them to play diverse games without prior knowledge. In this paper, we develop a strategy to improve move suggestions in time-constrained GGP settings. The strategy is a hybrid version of UCT that combines Sequential Halving and UCB, favoring information acquisition at the root node rather than overspending time on the most rewarding actions. Empirical evaluation using a GGP competition scheme from the Ludii framework shows that our strategy improves the average payoff over the entire competition set of games. Moreover, our agent makes better use of extended time budgets when they are available.
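To make the two ingredients concrete, the following is a minimal sketch of generic Sequential Halving and UCB1, not the implementation evaluated in this paper. It assumes that Sequential Halving allocates the simulation budget among root actions while UCB1 guides selection deeper in the tree; the abstract does not spell out this division. The names `ucb1`, `sequential_halving`, `simulate`, and `budget` are hypothetical, and `simulate(action)` stands in for one search iteration below the given root action that returns a payoff.

```python
import math


def ucb1(mean, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score for a child node; unvisited children are tried first."""
    if visits == 0:
        return math.inf
    return mean + c * math.sqrt(math.log(parent_visits) / visits)


def sequential_halving(actions, simulate, budget):
    """Spend `budget` simulations over about log2(k) rounds, keeping the
    better half of the candidate root actions after each round.

    Assumes `actions` is non-empty and `simulate(action)` returns a payoff.
    """
    candidates = list(actions)
    totals = {a: 0.0 for a in candidates}
    counts = {a: 0 for a in candidates}
    rounds = max(1, math.ceil(math.log2(len(candidates))))
    while len(candidates) > 1:
        # Equal share of this round's budget for every surviving action.
        per_action = max(1, budget // (len(candidates) * rounds))
        for a in candidates:
            for _ in range(per_action):
                totals[a] += simulate(a)
                counts[a] += 1
        # Rank by mean payoff so far and keep the top half (rounded up).
        candidates.sort(key=lambda a: totals[a] / counts[a], reverse=True)
        candidates = candidates[: math.ceil(len(candidates) / 2)]
    return candidates[0]
```

In this sketch every surviving root action receives the same number of simulations within a round, so weaker actions are still sampled before being discarded. UCB1 applied at the root would instead concentrate visits on the actions that currently look best.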