Regret Minimisation and System-Efficiency in Route Choice

Gabriel O. Ramos; Ana L. C. Bazzan; Bruno C. da Silva

doi:10.5753/ctd.2019.6332

Gabriel O. Ramos UFRGS
Ana L. C. Bazzan UFRGS
Bruno C. da Silva UFRGS

DOI: https://doi.org/10.5753/ctd.2019.6332

Resumo

Traffic congestions present a major challenge in large cities. Consid- ering the distributed, self-interested nature oftraffic we tackle congestions using multiagent reinforcement learning (MARL). In this thesis, we advance the state- of-the-art by delivering the first MARL convergence guarantees in congestion- like problems. We introduce an algorithm through which drivers can learn opti- mal routes by locally estimating the regret associated with their decisions, which we prove to converge to an equilibrium. In order to mitigate the effects ofselfish- ness, we also devise a decentralised tolling scheme, which we prove to minimise traffic congestion levels. Our theoretical results are supported by an extensive empirical evaluation on realistic traffic networks. 1.

Palavras-chave: Aprendizagem por reforço multiagente, escolha de rotas, equilíbrio dos usuários, ótimo do sistema, minimização de regret, regret da ação, informação de viagem, pedágio de custo marginal.

Referências

CEBR, Centre for Economics and Business Research. (2014). The future economic and environmental costs of gridlock in 2030. Technical report, CEBR, London.

Cintra, M. (2014). Os custos dos congestionamentos na cidade de são paulo. Technical report, Fundacao Getulio Vargas, Sao Paulo.

Grunitzki, R., Ramos, G. d. O., and Bazzan, A. L. C. (2014). Uma ferramenta para alocacao de trafego e aprendizagem de rotas em redes viarias. In Anais do XXVIII Congresso de Pequisa e Ensino em Transportes (ANPET 2014).

Koutsoupias, E. and Papadimitriou, C. (1999). Worst-case equilibria. In Annual Sympo- sium on Theoretical Aspects ofComputer Science, pages 404–413. Springer.

Lee, M., Barbosa, H., Youn, H., Holme, P., and Ghoshal, G. (2017). Morphology of travel routes and the organization of cities. Nature communications, 8(1):2229.

Ramos, G. de. O. (2017). Minimising regret in route choice (doctoral consortium). In Proc. ofthe 16th International Conference on Autonomous Agents and Multiagent Sys- tems (AAMAS 2017), pages 1855–1856, Sao Paulo. IFAAMAS.

Ramos, G. de. O. (2018). Regret Minimisation and System-Efficiency in Route Choice. PhD thesis, Universidade Federal do Rio Grande do Sul, Porto Alegre.

Ramos, G. de. O. and Bazzan, A. L. C. (2015). Towards the user equilibrium in traffic assignment using GRASP with path relinking. In Proceedings ofthe 2015 on Genetic and Evolutionary Computation Conference, GECCO ’15, pages 473–480. ACM.

Ramos, G. de. O. and Bazzan, A. L. C. (2016a). Efficient local search in traffic assign- ment. In Congress on Evolutionary Computation (CEC), pages 1493–1500. IEEE.

Ramos, G. de. O. and Bazzan, A. L. C. (2016b). On estimating action regret and learning from it in route choice. In Proceedings ofthe Ninth Workshop on Agents in Traffic and Transportation (ATT-2016), pages 1–8, New York. CEUR-WS.org.

Ramos, G. de. O., Bazzan, A. L. C., and da Silva, B. C. (2018a). Analysing the impact of travel information for minimising the regret of route choice. Transportation Research Part C: Emerging Technologies, 88:257–271.

Ramos, G. de. O., da Silva, B. C., and Bazzan, A. L. C. (2017a). Learning to minimise regret in route choice. In Proc. of the 16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2017), pages 846–855. IFAAMAS.

Ramos, G. de. O., da Silva, B. C., R˘adulescu, R., and Bazzan, A. L. C. (2018b). Learning system-efficient equilibria in route choice using tolls. In Proceedings of the Adaptive Learning Agents Workshop 2018 (ALA-18), Stockholm.

Ramos, G. de. O., da Silva, B. C., R˘adulescu, R., Bazzan, A. L. C., and Now´e, A. (2019a). Toll-based reinforcement learning for efficient equilibria in route choice. Knowledge Engineering Review. Conditionally accepted.

Ramos, G. de. O., Lemos, L. L., and Bazzan, A. L. C. (2017b). Developing a python reinforcement learning library for traffic simulation. In Proceedings of the Adaptive Learning Agents Workshop 2017 (ALA2017), ALA2017, S˜ao Paulo.

Ramos, G. de. O., R˘adulescu, R., and Now´e, A. (2019b). A budged-balanced tolling scheme for efficient equilibria under heterogeneous preferences. In Proceedings ofthe Adaptive Learning Agents Workshop 2019 (ALA-19), Montreal.

Tuyls, K. and Weiss, G. (2012). Multiagent learning: Basics, challenges, and prospects. AI Magazine, 33(3):41–52.

Wardrop, J. G. (1952). Some theoretical aspects of road traffic research. Proceedings of the Institution ofCivil Engineers, Part II, 1(36):325–362.

Youn, H., Gastner, M. T., and Jeong, H. (2008). Price of anarchy in transportation net- works: efficiency and optimality control. Physical review letters, 101(12):128701.