Reinforcement Learning. Instructor: Prof. Balaraman Ravindran, Department of Computer Science and Engineering, IIT Madras. Reinforcement learning is a paradigm that aims to model the trial-and-error learning process that is needed in many problem situations where explicit instructive signals are not available. It has roots in operations research, behavioral psychology and AI. The goal of the course is to introduce the basic mathematical foundations of reinforcement learning, as well as highlight some of the recent directions of research. (from nptel.ac.in)
Lecture 20 - Returns, Value Functions and Markov Decision Processes (MDPs)
We start this lesson by looking at the concept of return, the quantity that we typically try to optimise when solving RL problems. We will look at different ways of formulating returns and discuss the strengths and weaknesses of each formulation. We then discuss value functions, which we have already come across in previous lessons. We finish with a discussion of Markov Decision Processes (MDPs) and their importance in RL.
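The summary above does not say which return formulations the lecture covers. As a rough illustration only, the sketch below computes three formulations that are common in the RL literature (undiscounted total reward, discounted return, and average reward) for a finite list of rewards. The function names and the sample reward sequence are hypothetical and are not taken from the lecture.

```python
from typing import Sequence


def total_return(rewards: Sequence[float]) -> float:
    """Undiscounted sum of rewards; only well defined for finite episodes."""
    return float(sum(rewards))


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**k * rewards[k], computed backwards as G = r + gamma * G."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def average_reward(rewards: Sequence[float]) -> float:
    """Mean reward per step over the observed sequence."""
    if not rewards:
        raise ValueError("rewards must be non-empty")
    return sum(rewards) / len(rewards)


# Hypothetical reward sequence, assumed purely for illustration.
sample_rewards = [1.0, 1.0, 1.0]
total = total_return(sample_rewards)
discounted = discounted_return(sample_rewards, gamma=0.5)
average = average_reward(sample_rewards)
```

One trade-off this makes visible: the undiscounted total can grow without bound on long or continuing tasks, whereas discounting with gamma below 1 keeps the sum finite at the cost of weighting later rewards less.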