Value Iteration in POMDPs
Value function of a policy π
Bellman equation for the optimal value function
Value iteration: recursively estimating the value function
Greedy policy: choose the action that maximizes the one-step lookahead value
To go from MDPs to POMDPs, substitute the belief b for the state s (standard forms are sketched below)
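
A sketch of the standard MDP forms behind these bullets, assuming a reward function $R(s,a)$, transition probabilities $P(s' \mid s,a)$, and discount factor $\gamma$ (none of which are named on this slide):

$$V^\pi(s) = R(s,\pi(s)) + \gamma \sum_{s'} P(s' \mid s,\pi(s))\, V^\pi(s')$$
$$V^*(s) = \max_a \Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V^*(s') \Big]$$
$$V_{k+1}(s) = \max_a \Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V_k(s') \Big]$$
$$\pi_{\text{greedy}}(s) = \arg\max_a \Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V(s') \Big]$$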
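
Substituting the belief b for the state s turns the backup into an update over belief states; a sketch, assuming observation probabilities $P(o \mid s',a)$ and the Bayes-filter belief update $b^{a,o}$:

$$V_{k+1}(b) = \max_a \Big[ \sum_s b(s)\,R(s,a) + \gamma \sum_o \Pr(o \mid b,a)\, V_k(b^{a,o}) \Big]$$
$$\Pr(o \mid b,a) = \sum_{s'} P(o \mid s',a) \sum_s P(s' \mid s,a)\, b(s), \qquad b^{a,o}(s') = \frac{P(o \mid s',a) \sum_s P(s' \mid s,a)\, b(s)}{\Pr(o \mid b,a)}$$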
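
A minimal code sketch of this backup over a discretized belief space, assuming a small hypothetical POMDP whose matrices T, Z, R are placeholders invented for illustration. Exact POMDP value iteration instead maintains the value function as a set of α-vectors; this grid approximation only shows the update itself.

import numpy as np

# Hypothetical 2-state, 2-action, 2-observation POMDP; all numbers below are
# placeholders for illustration, not taken from the slides.
n_actions, n_obs = 2, 2
gamma = 0.95
T = np.array([[[0.9, 0.1], [0.2, 0.8]],   # T[a, s, s'] = P(s' | s, a)
              [[0.5, 0.5], [0.5, 0.5]]])
Z = np.array([[[0.8, 0.2], [0.3, 0.7]],   # Z[a, s', o] = P(o | s', a)
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],                 # R[s, a] = immediate reward
              [0.0, 2.0]])

# A belief over two states is summarized by b(s0); discretize it on a grid.
grid = np.linspace(0.0, 1.0, 101)

def V_at(V, b):
    # Look up V at the grid point nearest to belief b (crude interpolation).
    return V[np.argmin(np.abs(grid - b[0]))]

V = np.zeros_like(grid)
for _ in range(200):                       # value-iteration sweeps
    V_new = np.empty_like(V)
    for i, p in enumerate(grid):
        b = np.array([p, 1.0 - p])
        q_values = []
        for a in range(n_actions):
            q = b @ R[:, a]                # expected immediate reward r(b, a)
            for o in range(n_obs):
                b_next = Z[a, :, o] * (b @ T[a])  # unnormalized belief update
                p_o = b_next.sum()                # Pr(o | b, a)
                if p_o > 1e-12:
                    q += gamma * p_o * V_at(V, b_next / p_o)
            q_values.append(q)
        V_new[i] = max(q_values)           # Bellman backup, with b in place of s
    V = V_new

print("Approximate V at the uniform belief:", V_at(V, np.array([0.5, 0.5])))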