Darius Braziunas: Stochastic Local Search for POMDP Controllers (M.Sc. Thesis)

Stochastic Local Search for POMDP Controllers
M.Sc. thesis, 2003

Darius Braziunas
Department of Computer Science
University of Toronto
Toronto, ON M5S 3H5

Abstract

Gradient-based search in the space of policies representable as stochastic finite state controllers is one of the few tractable solution methods for non-trivial partially observable Markov decision processes (POMDPs). In this thesis, we illustrate a basic problem with standard gradient ascent applied to POMDPs, where the sequential nature of the decision problem is at issue, and propose a new stochastic local search method as an alternative. Our method employs certain heuristics that mimic some of the sequential reasoning inherent in dynamic programming approaches; while more computationally demanding, it can find good, even optimal, controllers where gradient-based methods commonly converge to poor local suboptima.

@MScThesis{Braziunas-MScThesis,
  author =  {Darius Braziunas},
  title =   {Stochastic Local Search for POMDP Controllers},
  school =  {University of Toronto},
  year =    {2003}
}
