Stochastic Local Search for POMDP Controllers
M.Sc. thesis, 2003

Darius Braziunas
Department of Computer Science
University of Toronto
Toronto, ON M5S 3H5


Gradient-based search in the space of policies representable as stochastic finite state controllers is one of the few tractable solution methods for non-trivial partially observable Markov decision processes (POMDPs). In this thesis, we illustrate a basic problem with standard gradient ascent applied to POMDPs, where the sequential nature of the decision problem is at issue, and propose a new stochastic local search method as an alternative. Our method employs certain heuristics that mimic some of the sequential reasoning inherent in dynamic programming approaches; while more computationally demanding, it can find good, even optimal, controllers where gradient-based methods commonly converge to poor local suboptima.

  author =  {Darius Braziunas},
  title =   {Stochastic Local Search for POMDP Controllers},
  school =  {University of Toronto},
  year =    {2003}

