Differentiable Planning with Indefinite Horizon

  • Daniel B. Dias, Universidade de São Paulo
  • Leliane N. de Barros, Universidade de São Paulo
  • Karina V. Delgado, Universidade de São Paulo
  • Denis D. Mauá, Universidade de São Paulo


With recent advances in automated planning based on deep-learning techniques, Deep Reactive Policies (DRPs) have been shown to be a powerful framework for solving complex Markov Decision Processes (MDPs), such as MDPs with continuous state and action spaces and exogenous events. Some differentiable planning algorithms can learn these policies through policy-gradient techniques, assuming a finite-horizon MDP. However, for certain domains, we do not know the horizon length needed to find an optimal solution, even when we have a planning goal description, whether a simple reachability goal or a complex goal involving path optimization. This work aims to solve a continuous MDP through differentiable planning, treating the problem horizon as a hyperparameter that can be adjusted during DRP training. This preliminary investigation shows that it is possible to find better policies by choosing a horizon that encompasses the planning goal.

Keywords: continuous state and action planning, Markov decision processes, machine learning, differentiable planning


