InfoCoBuild

Deep Learning

Deep Learning. Instructors: Prof. Mitesh M. Khapra and Prof. Sudarshan Iyengar, Department of Computer Science and Engineering, IIT Ropar. Deep Learning has received a lot of attention over the past few years and has been employed successfully by companies such as Google, Microsoft, IBM, Facebook, and Twitter to solve a wide range of problems in Computer Vision and Natural Language Processing. In this course we will learn about the building blocks used in these Deep Learning based solutions. Specifically, we will learn about feedforward neural networks, convolutional neural networks, recurrent neural networks, and attention mechanisms. We will also look at various optimization algorithms, such as Gradient Descent, Nesterov Accelerated Gradient Descent, Adam, AdaGrad, and RMSProp, which are used to train such deep neural networks. By the end of this course, students will have knowledge of the deep architectures used to solve various Vision and NLP tasks. (from nptel.ac.in)
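The optimizers named above all refine the same basic idea of following the negative gradient. As a minimal sketch (not taken from the course materials, and minimizing a toy one-dimensional function rather than a neural network loss), plain gradient descent looks like this:

```python
# Minimal sketch of vanilla gradient descent, minimizing the toy
# function f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
# Function, learning rate, and step count here are illustrative choices.

def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step in the direction of the negative gradient."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

grad = lambda w: 2 * (w - 3)          # gradient of (w - 3)^2
w_star = gradient_descent(grad, w0=0.0)
print(round(w_star, 4))               # converges toward the minimum at w = 3
```

Momentum-based variants (Nesterov) and adaptive methods (Adam, AdaGrad, RMSProp), covered in Lectures 05 and 09, modify the update rule but keep this same loop structure.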

Lecture 09.1 - A Quick Recap of Training Deep Neural Networks
Go to the Course Home or watch other lectures:

Lecture 01 - History of Deep Learning, Deep Learning Success Stories
Lecture 02 - McCulloch Pitts Neuron, Thresholding Logic, Perceptrons, Perceptron Learning Algorithm
Lecture 03 - Sigmoid Neurons, Gradient Descent
Lecture 04 - Feedforward Neural Networks, Backpropagation
Lecture 05 - Gradient Descent (GD), Momentum Based GD, Nesterov Accelerated GD, Stochastic GD
Lecture 06 - Principal Component Analysis and its Interpretations, Singular Value Decomposition
Lecture 07 - Autoencoders and relation to PCA, Regularization in Autoencoders, Denoising Autoencoders
Lecture 08 - Regularization: Bias Variance Tradeoff, L2 Regularization, Early Stopping
Lecture 09 - Unsupervised Pre-training, Better Activation Functions, Batch Normalization
Lecture 10 - Learning Vectorial Representations of Words
Lecture 11 - Convolutional Neural Networks
Lecture 12 - Visualizing Patches which Maximally Activate a Neuron
Lecture 13 - Recurrent Neural Networks, Backpropagation through Time
Lecture 14 - Long Short Term Memory (LSTM) and Gated Recurrent Units (GRUs)
Lecture 15 - Encoder Decoder Models, Attention Mechanism, Attention over Images