Value Iteration

Value iteration is based on approximating an infinite-horizon problem by a finite-horizon problem with a sufficiently long horizon.

Value Iteration: Finite Horizon Case

Problem

\[
\max_{\pi_0} \max_{\pi_1} \cdots \max_{\pi_{T-1}} \mathbb{E}\left[ r_0 + r_1 + \cdots + r_{T-1} + V_T(s_T) \right]
\]
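The nested maximization above can be solved backwards in time, starting from the terminal value V_T and choosing the decision rule for stage T-1 first, then T-2, and so on down to stage 0. The following is a minimal sketch of that backward recursion, not a definitive implementation. It assumes a tabular problem with finitely many states and actions, known transition probabilities and known expected one-step rewards. The function name `finite_horizon_value_iteration` and the array layouts `P[a, s, s']` and `R[s, a]` are illustrative choices that do not come from the original notes.

```python
import numpy as np


def finite_horizon_value_iteration(P, R, V_T, T):
    """Backward induction for an undiscounted finite-horizon problem.

    Assumed (hypothetical) array layout:
      P   : shape (A, S, S), P[a, s, s2] = Pr(next state s2 | state s, action a)
      R   : shape (S, A),    R[s, a]     = expected one-step reward
      V_T : shape (S,),      terminal value of each state
      T   : number of decision stages 0, 1, ..., T-1
    Returns the stage-0 value function and one greedy decision rule per stage.
    """
    P = np.asarray(P, dtype=float)
    R = np.asarray(R, dtype=float)
    V = np.asarray(V_T, dtype=float)
    policies = []

    # Work backwards from stage T-1 down to stage 0.
    for _ in range(T):
        # Q[s, a] = R[s, a] + sum over s2 of P[a, s, s2] * V[s2]
        Q = R + np.einsum("ast,t->sa", P, V)
        policies.append(Q.argmax(axis=1))  # best action in each state
        V = Q.max(axis=1)                  # value with one more stage to go

    policies.reverse()  # policies[t] is now the decision rule for stage t
    return V, policies
```

Each pass of the loop performs one maximization from the problem statement, so after T passes the returned vector holds the optimal expected total reward from each starting state. Keeping one decision rule per stage reflects that, in the finite-horizon case, the best action can depend on how many stages remain.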

References

  1. http://www3.ul.ie/ramsey/Lectures/Operations_Research_2/or2slides3.pdf