Large-Scale Machine Learning and Stochastic Algorithms

Large-Scale Machine Learning and Stochastic Algorithms by Leon Bottou - Machine Learning Summer School at Purdue, 2011. Over the last decade, data sizes have grown faster than processor speed. We now frequently face statistical machine learning problems for which datasets are virtually infinite. Computing time, rather than the amount of data, is then the bottleneck.

The first part of the lecture centers on the qualitative difference between small-scale and large-scale learning problems. Whereas small-scale learning problems are subject to the usual approximation-estimation tradeoff, large-scale learning problems are subject to a qualitatively different tradeoff that involves the computational complexity of the underlying optimization algorithms in non-trivial ways. Unlikely optimization algorithms such as stochastic gradient descent show amazing performance on large-scale machine learning problems. The second part gives a detailed overview of stochastic learning algorithms applied to both linear and nonlinear models. In particular, I would like to spend time on the use of stochastic gradient for structured learning problems and on the subtle connection between nonconvex stochastic gradient and active learning.
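As a rough illustration of the stochastic gradient idea mentioned above, the following minimal Python sketch fits a one-dimensional linear model by updating the parameters after each single example. It is not taken from the lectures: the function names (`sgd_least_squares`, `make_data`), the squared loss, the constant learning rate, and the synthetic noise-free data are all assumptions chosen for illustration.

```python
import random


def sgd_least_squares(data, lr=0.05, epochs=20, seed=0):
    """Fit y ~ w*x + b with plain stochastic gradient descent.

    Hypothetical helper for illustration: squared loss, constant
    learning rate, one parameter update per training example.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    examples = list(data)
    for _ in range(epochs):
        rng.shuffle(examples)          # visit examples in random order
        for x, y in examples:
            err = (w * x + b) - y      # prediction error on one example
            w -= lr * err * x          # gradient of 0.5*err**2 w.r.t. w
            b -= lr * err              # gradient of 0.5*err**2 w.r.t. b
    return w, b


def make_data(n=200, seed=1):
    """Synthetic noise-free data from y = 2x + 1 (an assumed toy target)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0) for x in xs]


w, b = sgd_least_squares(make_data())
print(f"w = {w:.4f}, b = {b:.4f}")
```

Each update touches one example, so the cost per step does not depend on the size of the dataset. This is one reason the method is attractive when computing time, rather than data, is the limiting resource.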

Lecture 1 - Learning with Stochastic Gradient Descent
Lecture 2 - The Tradeoffs of Large Scale Learning
Lecture 3 - Experiments with SGD
Lecture 4 - Analysis for a Simple Case, Learning with a Single Epoch
Lecture 5 - General Convergence Results
Lecture 6 - SGD for Neyman-Pearson Classification


Machine Learning Summer School at Purdue, 2011
A Machine Learning Approach for Complex Information Retrieval Applications
A Short Course on Reinforcement Learning
Classic and Modern Data Clustering
Divide and Recombine for the Analysis of Big Data
Graphical Models for the Internet
Introduction to Machine Learning
Large-Scale Machine Learning and Stochastic Algorithms
Machine Learning for a Rainy Day
Machine Learning for Discovery in Legal Cases
Machine Learning for Statistical Genetics
Mining Heterogeneous Information Networks
Modeling Complex Social Networks
Optimization for Machine Learning
Privacy Issues with Machine Learning: Fears, Facts, and Opportunities
Survey of Boosting from an Optimization Perspective
The MASH Project