Regularized Fitted Q-iteration: Application to Bounded Resource Planning
Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvari, and Shie Mannor.
| What | Talk |
|---|---|
| When | 2008-06-30 from 10:00 to 10:25 |
We consider bounded resource planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment and a limit on the computational resources. We propose to use the fitted Q-iteration algorithm with penalized least-squares regression as the regression subroutine, to address the problem of selecting an appropriate function approximator in each iteration. The algorithm is presented in detail for the case when the function space is a reproducing kernel Hilbert space underlying a user-chosen kernel function. We derive bounds on the quality of the solutions found and argue how data-dependent penalties can lead to almost optimal performance. A simple example is used to illustrate the benefits of using a penalized procedure.
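
The following is a minimal sketch of the kind of procedure the abstract describes, not the authors' implementation. It runs fitted Q-iteration in which each regression step is a penalized least-squares fit in a reproducing kernel Hilbert space (kernel ridge regression). Everything specific here is an assumption made for illustration: the toy environment `generative_model`, the Gaussian kernel and its bandwidth, the use of one regressor per action, and a fixed penalty `lam` rather than the data-dependent penalties discussed in the talk.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=0.2):
    """Gaussian kernel matrix between 1-D state arrays a and b (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def generative_model(state, action, rng):
    """Hypothetical toy environment on [0, 1]: action 0 moves left, 1 moves right."""
    step = 0.1 if action == 1 else -0.1
    next_state = np.clip(state + step + 0.01 * rng.standard_normal(), 0.0, 1.0)
    reward = 1.0 if next_state >= 0.9 else 0.0
    return next_state, reward

def regularized_fqi(n_samples=200, n_iterations=30, gamma=0.9, lam=1e-3, seed=0):
    """Fitted Q-iteration with kernel ridge regression as the regression step."""
    rng = np.random.default_rng(seed)
    n_actions = 2

    # Draw one batch of transitions from the generative model.
    states = rng.uniform(0.0, 1.0, size=n_samples)
    actions = rng.integers(0, n_actions, size=n_samples)
    next_states = np.empty(n_samples)
    rewards = np.empty(n_samples)
    for i in range(n_samples):
        next_states[i], rewards[i] = generative_model(states[i], actions[i], rng)

    # Split the batch by action and precompute the kernel matrices.
    idx = [np.flatnonzero(actions == a) for a in range(n_actions)]
    gram = [gaussian_kernel(states[i], states[i]) for i in idx]
    cross = [gaussian_kernel(next_states, states[i]) for i in idx]
    alphas = [np.zeros(len(i)) for i in idx]  # Q_0 = 0

    for _ in range(n_iterations):
        # Bellman targets built from the current Q estimate.
        q_next = np.stack([cross[a] @ alphas[a] for a in range(n_actions)], axis=1)
        targets = rewards + gamma * q_next.max(axis=1)
        # Minimizing (1/n) * sum (f(x_i) - y_i)^2 + lam * ||f||_H^2 over the RKHS
        # gives coefficients alpha = (K + n * lam * I)^{-1} y.
        alphas = [
            np.linalg.solve(
                gram[a] + lam * len(idx[a]) * np.eye(len(idx[a])),
                targets[idx[a]],
            )
            for a in range(n_actions)
        ]

    def q_function(query_states):
        """Return an array of shape (n_queries, n_actions) of Q estimates."""
        query = np.atleast_1d(np.asarray(query_states, dtype=float))
        return np.stack(
            [gaussian_kernel(query, states[idx[a]]) @ alphas[a] for a in range(n_actions)],
            axis=1,
        )

    return q_function
```

Calling `q = regularized_fqi()` and then `q([0.2, 0.5, 0.8]).argmax(axis=1)` gives the greedy action at each queried state. In this sketch the penalty `lam` is the knob that trades fit against smoothness; choosing it from the data, as the abstract suggests, would replace the fixed value used here.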




