α-MCMP: Trade-Offs Between Probability and Cost in SSPs with the MCMP Criterion
Resumo
In Stochastic Shortest Path (SSP) problems, not always the requirement of having at least one policy with a probability of reaching goals (probability-to-goal) equal to 1 can be met. This is the case when dead ends, states from which the probability-to-goal is equal to 0, are unavoidable for any policy, which demands the definition of alternate methods to handle such cases. The α-strong probability-to-goal priority is a property that is maintained by a criterion if a necessary condition to optimality is that the ratio between the probability-to-goal values of the optimal policy and any other policy is bound by a value of 0 ≤ α ≤ 1. This definition is helpful when evaluating the preference of different criteria for SSPs with dead ends. The Min-Cost given Max-Prob (MCMP) criterion is a method that prefers policies that minimize a well-defined cost function in the presence of unavoidable dead ends given policies that maximize probability-to-goal. However, it only guarantees α-strong priority for α = 1. In this paper, we define α-MCMP, a criterion based on MCMP with the addition of the guarantee of α-strong priority for any value 0 ≤ α ≤ 1. We also perform experiments comparing α-MCMP and GUBS, the only other criteria known to have α-strong priority for 0 ≤ α ≤ 1, to analyze the difference between the probability-to-goal of policies generated by each criterion.