Importance Sampling

https://www.coursera.org/learn/sample-based-learning-methods/lecture/6PRvh/course-introduction

Monte Carlo

  • Repeated random sampling

  • In RL: estimate values directly from experience, without a model of the environment

  • DP (dynamic programming), by contrast:

    • Agent knows the transition probabilities

  • Monte Carlo: estimate values by averaging over a large number of random samples
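
A minimal sketch of the averaging idea, in Python. The names `monte_carlo_mean` and `sample_return` are made up for illustration, and the "episode" is a toy assumption: the return is 1 with probability 0.3 and 0 otherwise, so the true value is 0.3.

```python
import random


def monte_carlo_mean(sample_fn, num_samples, seed=0):
    """Estimate the expected value of sample_fn by averaging random draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(num_samples):
        total += sample_fn(rng)
    return total / num_samples


def sample_return(rng):
    # Hypothetical episode (an assumption for this sketch):
    # the return is 1.0 with probability 0.3, otherwise 0.0.
    return 1.0 if rng.random() < 0.3 else 0.0


# No transition probabilities are used here, only sampled experience.
estimate = monte_carlo_mean(sample_return, 100_000)
print(estimate)
```

With more samples the average should settle closer to 0.3; with only a handful it can be far off.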

Epsilon-soft policies

  • Continuously explore

  • Assign non-zero probability (at least ε/|A(s)|) to every action in every state

  • Always stochastic: actions are chosen with some probability, never deterministically
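
One common epsilon-soft policy is epsilon-greedy. The sketch below is an illustration rather than code from the course: `epsilon_greedy_probs` and the example action values are assumptions. It gives every action a probability of at least ε/|A| and puts the remaining 1 − ε on the greedy action.

```python
def epsilon_greedy_probs(q_values, epsilon):
    """Return action probabilities for an epsilon-greedy policy in one state."""
    num_actions = len(q_values)
    # Index of the action with the highest estimated value (first one on ties).
    greedy = max(range(num_actions), key=lambda a: q_values[a])
    # Every action starts with the exploration share epsilon / |A|.
    probs = [epsilon / num_actions] * num_actions
    # The greedy action also receives the remaining 1 - epsilon.
    probs[greedy] += 1.0 - epsilon
    return probs


# Hypothetical action values for a single state with four actions.
probs = epsilon_greedy_probs([1.0, 3.0, 2.0, 0.5], epsilon=0.2)
print(probs)
```

Because no entry is ever zero, the policy keeps exploring every action for as long as ε > 0.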
