Regularized Fitted Q-iteration: Application to Bounded Resource Planning

Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvari and Shie Mannor.

What: Talk
When: 2008-06-30, from 10:00 to 10:25

We consider bounded resource planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment and a limit on the computational resources. We propose to use the fitted Q-iteration algorithm with penalized least-squares regression as the regression subroutine to address the problem of selecting an appropriate function approximator in each iteration. The algorithm is presented in detail for the case when the function space is a reproducing kernel Hilbert space induced by a user-chosen kernel function. We derive bounds on the quality of the solutions found and argue how data-dependent penalties can lead to almost optimal performance. A simple example is used to illustrate the benefits of using a penalized procedure.
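The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a Gaussian kernel, one regressor per discrete action, and a fixed penalty `lam` rather than the data-dependent penalty the abstract refers to. The toy generative model `sample_chain`, the helper names, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def fit_penalized_ls(X, y, lam, bandwidth):
    """Kernel ridge regression: minimize (1/n)||y - K a||^2 + lam * a^T K a."""
    n = X.shape[0]
    K = gaussian_kernel(X, X, bandwidth)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def regularized_fqi(S, A, R, S_next, n_actions, gamma=0.8, lam=1e-3,
                    n_iters=25, bandwidth=0.2):
    """Fitted Q-iteration with a penalized least-squares fit per action."""
    masks = [A == a for a in range(n_actions)]
    # Q_0 is identically zero: all expansion coefficients start at zero.
    alphas = [np.zeros(int(m.sum())) for m in masks]
    # Cross-kernels used to evaluate the current Q at the next states.
    K_next = [gaussian_kernel(S_next, S[m], bandwidth) for m in masks]
    for _ in range(n_iters):
        q_next = np.stack([K_next[a] @ alphas[a] for a in range(n_actions)],
                          axis=1)
        # Regression targets: r + gamma * max_a' Q_k(s', a').
        targets = R + gamma * q_next.max(axis=1)
        alphas = [fit_penalized_ls(S[m], targets[m], lam, bandwidth)
                  for m in masks]

    def q_values(states):
        """Return an array of shape (len(states), n_actions)."""
        return np.stack([gaussian_kernel(states, S[m], bandwidth) @ alphas[a]
                         for a, m in enumerate(masks)], axis=1)
    return q_values

def sample_chain(n, rng):
    """Hypothetical generative model: a noisy walk on [0, 1].

    Action 1 moves right, action 0 moves left; the reward is the next state.
    """
    S = rng.uniform(0.0, 1.0, size=(n, 1))
    A = rng.integers(0, 2, size=n)
    step = np.where(A == 1, 0.1, -0.1)[:, None]
    noise = 0.01 * rng.standard_normal((n, 1))
    S_next = np.clip(S + step + noise, 0.0, 1.0)
    R = S_next[:, 0]
    return S, A, R, S_next

rng = np.random.default_rng(0)
S, A, R, S_next = sample_chain(200, rng)
q_values = regularized_fqi(S, A, R, S_next, n_actions=2)
test_states = np.array([[0.3], [0.5], [0.7]])
greedy_actions = q_values(test_states).argmax(axis=1)
print(greedy_actions)
```

In this toy problem moving right is always the better choice, so the greedy policy read off the fitted Q-function should prefer action 1 at interior states. The penalty `lam` controls the effective complexity of each fit, which is the role the abstract assigns to penalization in place of choosing a function approximator by hand at each iteration.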

