Arbitrary trade-offs between cost and probability to goal and heuristic search algorithm in probabilistic planning under the GUBS criterion

  • Gabriel Nunes Crispino USP
  • Valdinei Freire USP
  • Karina Valdivia Delgado USP

Abstract


Stochastic Shortest Path Markov Decision Processes (SSP-MDPs) are used to model probabilistic sequential decision problems where the objective is to minimize the expected accumulated cost to goal. However, in the presence of dead-ends, this criterion can become ill-defined. GUBS, a criterion based on Expected Utility Theory that makes trade-offs between costs and probability-to-goal, was proposed to address these problems. The dissertation contains a comparison between GUBS and other related criteria for solving SSP-MDPs in the presence of dead-end states, and introduces theoretical definitions and results that show that GUBS, unlike these other criteria, allows for a trade-off between accumulated costs and probability-to-goal, and also guarantees arbitrary trade-offs that can be tuned from its parameters without previous knowledge of the problem. Also, this dissertation introduces eGUBS-AO*, an optimal heuristic search algorithm for solving the eGUBS criterion. As subproducts of this masters dissertation, α-MCMP, a new criterion, and UCT-GUBS, an algorithm for solving SSP-MDPs under the GUBS criterion, were also proposed.

References

Bellman, R., Corporation, R., and Collection, K. M. R. (1957). Dynamic Programming. Rand Corporation research study. Princeton University Press.

Bertsekas, D. (1995). Dynamic Programming and Optimal Control. Athena Scientific, Belmont, Mass.

Crispino, G., Freire, V., and Delgado, K. V. (2020). Algoritmo de Monte Carlo Tree Search para SSPs sob o critério GUBS. In Anais do XVII Encontro Nacional de Inteligência Artificial e Computacional, pages 402–413, Porto Alegre, RS, Brasil. SBC.

Crispino, G. N., Freire, V., and Delgado, K. V. (2023). α-MCMP: Trade-offs between probability and cost in SSPs with the MCMP criterion. In Brazilian Conference on Intelligent Systems, pages 112–127. Springer.

Freire, V. and Delgado, K. V. (2017). GUBS: a utility-based semantic for goal-directed Markov decision processes. In Proceedings of the 16th Conference on Autonomous Agents and Multiagent Systems, pages 741–749.

Freire, V., Delgado, K. V., and Reis, W. A. S. (2019). An exact algorithm to make a trade-off between cost and probability in SSPs. In Proceedings of the International Conference on Automated Planning and Scheduling, volume 29, pages 146–154.

Kolobov, A., Weld, D., et al. (2012). A theory of goal-oriented MDPs with dead ends. In Uncertainty in artificial intelligence : proceedings of the Twenty-eighth Conference [on uncertainty in artificial intelligence] (2012), pages 438–447.

Kolobov, A., Weld, D. S., and Geffner, H. (2011). Heuristic search for generalized stochastic shortest path MDPs. In Proceedings of the Twenty-First International Conference on International Conference on Automated Planning and Scheduling, pages 130–137.

Little, I., Thiebaux, S., et al. (2007). Probabilistic planning vs. replanning. In ICAPS Workshop on IPC: Past, Present and Future, pages 1–10.

Patek, S. D. (2001). On terminating Markov decision processes with a risk-averse objective function. Automatica, 37(9):1379–1386.

Sanner, S. and Yoon, S. (2011). IPPC results presentation. In International Conference on Automated Planning and Scheduling. Acessado em março de 2024.

Teichteil-Königsbuch, F. (2012). Stochastic safest and shortest path problems. In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, pages 1825–1831.

Teichteil-Königsbuch, F., Vidal, V., and Infantes, G. (2011). Extending classical planning heuristics to probabilistic planning with dead-ends. In Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, pages 1017–1022.

Trevizan, F. W., Teichteil-Königsbuch, F., and Thiébaux, S. (2017). Efficient solutions for stochastic shortest path problems with dead ends. In Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence (UAI) (2017), pages 1–10.
Published
2024-07-21
CRISPINO, Gabriel Nunes; FREIRE, Valdinei; DELGADO, Karina Valdivia. Arbitrary trade-offs between cost and probability to goal and heuristic search algorithm in probabilistic planning under the GUBS criterion. In: THESIS AND DISSERTATION CONTEST (CTD), 37. , 2024, Brasília/DF. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024 . p. 88-97. ISSN 2763-8820. DOI: https://doi.org/10.5753/ctd.2024.2211.