Verified Algorithms for Solving Markov Decision Processes
Title: Verified Algorithms for Solving Markov Decision Processes
Authors: Maximilian Schäffeler (schaeffm /at/ in /dot/ tum /dot/ de) and Mohammad Abdulaziz
Submission date: 2021-12-16
Abstract: We present a formalization of algorithms for solving Markov Decision Processes (MDPs) with formal guarantees on the optimality of their solutions. In particular, we build on our analysis of the Bellman operator for discounted infinite-horizon MDPs. From the iterator rule on the Bellman operator we directly derive executable value iteration and policy iteration algorithms that iteratively solve finite MDPs. We also prove correct optimized versions of value iteration that use matrix splittings to improve the convergence rate; specifically, we formally verify Gauss-Seidel value iteration and modified policy iteration. The algorithms are evaluated on two standard examples from the literature, namely inventory management and gridworld. Our formalization covers most of Chapter 6 of Puterman's book "Markov Decision Processes: Discrete Stochastic Dynamic Programming".
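To illustrate the kind of algorithm the entry verifies, here is an unverified Python sketch of plain value iteration on a finite discounted MDP (the formalization itself is in Isabelle/HOL; the data layout and function name below are illustrative assumptions, not part of the AFP entry). It iterates the Bellman operator and uses the standard stopping criterion that guarantees the returned value function is within eps of optimal:

```python
def value_iteration(states, actions, P, r, gamma, eps):
    """Iterate the Bellman operator on a finite MDP until the classic
    eps-optimality stopping criterion is met.

    P[s][a] is a dict mapping successor states to probabilities,
    r[s][a] is the immediate reward, 0 < gamma < 1 is the discount factor.
    (This representation is an assumption made for this sketch.)
    """
    V = {s: 0.0 for s in states}  # start from the zero value function
    while True:
        # One application of the Bellman (optimality) operator:
        # V'(s) = max_a [ r(s,a) + gamma * sum_s' P(s'|s,a) * V(s') ]
        V_new = {
            s: max(r[s][a] + gamma * sum(p * V[t] for t, p in P[s][a].items())
                   for a in actions)
            for s in states
        }
        # Sup-norm distance between successive iterates
        delta = max(abs(V_new[s] - V[s]) for s in states)
        V = V_new
        # Stopping rule: the result is then within eps of the optimal
        # value function in the sup-norm (cf. Puterman, Chapter 6)
        if delta < eps * (1 - gamma) / (2 * gamma):
            return V
```

For example, a one-state, one-action MDP with reward 1 and gamma = 0.9 has optimal value 1 / (1 - 0.9) = 10, and the sketch converges to that within the requested tolerance. The verified AFP versions additionally cover matrix-splitting variants such as Gauss-Seidel value iteration, which update states in place to speed up convergence.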
BibTeX:
@article{MDP-Algorithms-AFP,
  author  = {Maximilian Schäffeler and Mohammad Abdulaziz},
  title   = {Verified Algorithms for Solving Markov Decision Processes},
  journal = {Archive of Formal Proofs},
  month   = dec,
  year    = 2021,
  note    = {\url{https://isa-afp.org/entries/MDP-Algorithms.html},
            Formal proof development},
  ISSN    = {2150-914x},
}
License: BSD License
Depends on: Gauss_Jordan, MDP-Rewards
Status: [ok] This is a development version of this entry. It might change over time and is not stable. Please refer to release versions for citations.