Value Iteration in POMDPs
Value function of a policy π
Bellman equation for the optimal value function
Value iteration: recursively estimating the value function
Greedy policy: choose the action that maximizes the one-step lookahead value
To go from MDPs to POMDPs, substitute the belief b for the state s (standard forms are sketched below)
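
A sketch of the standard MDP forms behind these bullets, assuming a reward function $R(s,a)$, transition probabilities $P(s' \mid s,a)$, and discount factor $\gamma$ (none of which are named on this slide):

$$V^\pi(s) = R(s,\pi(s)) + \gamma \sum_{s'} P(s' \mid s,\pi(s))\, V^\pi(s')$$
$$V^*(s) = \max_a \Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V^*(s') \Big]$$
$$V_{k+1}(s) = \max_a \Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V_k(s') \Big]$$
$$\pi_{\text{greedy}}(s) = \arg\max_a \Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V(s') \Big]$$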
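
Substituting the belief b for the state s turns the backup into an update over belief states; a sketch, assuming observation probabilities $P(o \mid s',a)$ and the Bayes-filter belief update $b^{a,o}$:

$$V_{k+1}(b) = \max_a \Big[ \sum_s b(s)\,R(s,a) + \gamma \sum_o \Pr(o \mid b,a)\, V_k(b^{a,o}) \Big]$$
$$\Pr(o \mid b,a) = \sum_{s'} P(o \mid s',a) \sum_s P(s' \mid s,a)\, b(s), \qquad b^{a,o}(s') = \frac{P(o \mid s',a) \sum_s P(s' \mid s,a)\, b(s)}{\Pr(o \mid b,a)}$$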
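
A minimal code sketch of this backup over a discretized belief space, assuming a small hypothetical POMDP whose matrices T, Z, R are placeholders invented for illustration. Exact POMDP value iteration instead maintains the value function as a set of α-vectors; this grid approximation only shows the update itself.

import numpy as np

# Hypothetical 2-state, 2-action, 2-observation POMDP; all numbers below are
# placeholders for illustration, not taken from the slides.
n_actions, n_obs = 2, 2
gamma = 0.95
T = np.array([[[0.9, 0.1], [0.2, 0.8]],   # T[a, s, s'] = P(s' | s, a)
              [[0.5, 0.5], [0.5, 0.5]]])
Z = np.array([[[0.8, 0.2], [0.3, 0.7]],   # Z[a, s', o] = P(o | s', a)
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],                 # R[s, a] = immediate reward
              [0.0, 2.0]])

# A belief over two states is summarized by b(s0); discretize it on a grid.
grid = np.linspace(0.0, 1.0, 101)

def V_at(V, b):
    # Look up V at the grid point nearest to belief b (crude interpolation).
    return V[np.argmin(np.abs(grid - b[0]))]

V = np.zeros_like(grid)
for _ in range(200):                       # value-iteration sweeps
    V_new = np.empty_like(V)
    for i, p in enumerate(grid):
        b = np.array([p, 1.0 - p])
        q_values = []
        for a in range(n_actions):
            q = b @ R[:, a]                # expected immediate reward r(b, a)
            for o in range(n_obs):
                b_next = Z[a, :, o] * (b @ T[a])  # unnormalized belief update
                p_o = b_next.sum()                # Pr(o | b, a)
                if p_o > 1e-12:
                    q += gamma * p_o * V_at(V, b_next / p_o)
            q_values.append(q)
        V_new[i] = max(q_values)           # Bellman backup, with b in place of s
    V = V_new

print("Approximate V at the uniform belief:", V_at(V, np.array([0.5, 0.5])))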