Reinforcement Learning. Instructor: Prof. Balaraman Ravindran, Department of Computer Science and Engineering, IIT Madras. Reinforcement learning is a paradigm that aims to model the trial-and-error learning process that is needed in many problem situations where explicit instructive signals are not available. It has roots in operations research, behavioral psychology and AI. The goal of the course is to introduce the basic mathematical foundations of reinforcement learning, as well as highlight some of the recent directions of research. (from nptel.ac.in)
Lecture 16 - Policy Search
In the last few lessons, we looked at different algorithms for solving bandit problems in which we maintained reward estimates for each arm. In this lesson, we start looking at a different approach to solving bandit problems, known as policy search, in which we try to identify the optimal policy by searching directly in the policy space.
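To make the contrast with value-estimate methods concrete, here is a minimal sketch of one way to search directly in policy space on a bandit. It is an illustration under stated assumptions, not necessarily the method used in the lecture: the policy is a softmax over one preference parameter per arm, and the preferences are nudged with a REINFORCE-style stochastic gradient step using a running-average baseline. The function name `policy_search_bandit`, the Gaussian reward model, and all parameter values are hypothetical choices made for this example.

```python
import math
import random


def softmax(prefs):
    """Turn arm preferences into action probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def policy_search_bandit(arm_means, steps=5000, step_size=0.1, seed=0):
    """Adjust the policy parameters directly instead of estimating arm values.

    Assumption: rewards are Gaussian with unit variance around arm_means.
    Returns the final action probabilities of the learned policy.
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(arm_means)  # policy parameters, one per arm
    baseline = 0.0                  # running average of observed rewards

    for t in range(1, steps + 1):
        probs = softmax(prefs)
        arm = rng.choices(range(len(prefs)), weights=probs)[0]
        reward = rng.gauss(arm_means[arm], 1.0)
        baseline += (reward - baseline) / t

        # Gradient of log pi(arm) w.r.t. prefs[a] is (1[a == arm] - probs[a]).
        for a in range(len(prefs)):
            indicator = 1.0 if a == arm else 0.0
            prefs[a] += step_size * (reward - baseline) * (indicator - probs[a])

    return softmax(prefs)


if __name__ == "__main__":
    print(policy_search_bandit([0.0, 1.0, 3.0]))
```

Note that the learner never stores a per-arm reward estimate. The only learned quantities are the preferences that define the policy, plus a single scalar baseline used to reduce the variance of the updates.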