6.7960 Deep Learning (Fall 2024, MIT OCW): Lecture 08

6.7960 - Deep Learning (Fall 2024, MIT OCW). Instructors: Prof. Phillip Isola, Prof. Sara Beery, and Dr. Jeremy Bernstein. This course covers the fundamentals of deep learning, including both theory and applications. Topics include neural net architectures (MLPs, CNNs, RNNs, graph nets, transformers), geometry and invariances in deep learning, backpropagation and automatic differentiation, learning theory and generalization in high dimensions, and applications to computer vision, natural language processing, and robotics. (from ocw.mit.edu)

Lecture 08 - Architectures: Transformers

This video introduces transformers, focusing on three key ideas: tokens, attention, and positional codes. It also explores how transformers relate to MLPs, GNNs, and CNNs as variations on common principles.
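As a rough illustration of how the three ideas fit together, the sketch below builds a few toy tokens, adds positional codes to them, and mixes them with one self-attention step. It is not taken from the lecture: the sinusoidal form of the positional code, the identity query/key/value maps, and the names `positional_code`, `softmax`, and `self_attention` are all assumptions chosen to keep the example small.

```python
import math

def positional_code(pos, dim):
    """Sinusoidal positional code (one common choice; an assumption here)."""
    return [
        math.sin(pos / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(pos / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity query/key/value maps.

    A real transformer layer would apply learned linear maps first; they are
    omitted here to keep the sketch short.
    """
    dim = len(tokens[0])
    out = []
    for q in tokens:
        # Similarity of this token (as a query) to every token (as a key).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights = softmax(scores)
        # Output is a weighted average of all tokens (as values).
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(dim)])
    return out

# Three toy tokens (hypothetical 4-dimensional embeddings); the first and
# third are identical on purpose.
embeddings = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]

# Adding positional codes makes identical tokens at different positions differ.
tokens = [[e + p for e, p in zip(emb, positional_code(pos, 4))]
          for pos, emb in enumerate(embeddings)]

mixed = self_attention(tokens)
```

Without the positional codes, attention treats the input as an unordered set, so the first and third tokens would produce identical outputs; with them, the two positions become distinguishable.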

Other lectures in this course:

Lecture 01 - Introduction to Deep Learning

Lecture 02 - How to Train a Neural Net

Lecture 03 - Approximation Theory

Lecture 04 - Architectures: Grids

Lecture 05 - Architectures: Graphs

Lecture 06 - Generalization Theory

Lecture 07 - Scaling Rules for Optimization

Lecture 08 - Architectures: Transformers

Lecture 09 - Hacker's Guide to Deep Learning

Lecture 10 - Architectures: Memory

Lecture 11 - Representation Learning: Reconstruction-Based