Survey of Boosting from an Optimization Perspective

Survey of Boosting from an Optimization Perspective by Manfred K. Warmuth - Machine Learning Summer School at Purdue, 2011. Boosting has become a well known ensemble method. The algorithm maintains a distribution on the binary labeled examples and a new base learner is added in a greedy fashion. The goal is to obtain a small linear combination of base learners that clearly separates the examples. We focus on a recent view of Boosting where the update algorithm for distribution on the examples is characterized by a minimization problem that uses a relative entropy as a regularization.

The most well known boosting algorithms is AdaBoost. This algorithm approximately maximizes the hard margin, when the data is separable. We focus on recent algorithms that provably maximize the soft margin when the data is noisy. We will teach the new algorithms, give a unified and versatile view of Boosting in terms of relative entropy regularization, and show how to solve large scale problems based on state of the art optimization techniques. We also discuss lower and upper bounds on the number of iterations required for any greedy boosting method and propose a way to circumvent these lower bounds.

Lecture 1 - Introduction to Boosting
Lecture 2 - Squared Euclidean versus relative entropy regularization
Lecture 3 - Boosting as margin maximization with no regularization, LPBoost
Lecture 4 - LPBoost
Lecture 5 - Lower Bound and experiments
Lecture 6 - The Blessing and the Curse of the Multiplicative updates

Machine Learning Summer School at Purdue, 2011
A Machine Learning Approach for Complex Information Retrieval Applications
A Short Course on Reinforcement Learning
Classic and Modern Data Clustering
Divide and Recombine for the Analysis of Big Data
Graphical Models for the Internet
Introduction to Machine Learning
Large-Scale Machine Learning and Stochastic Algorithms
Machine Learning for a Rainy Day
Machine Learning for Discovery in Legal Cases
Machine Learning for Statistical Genetics
Mining Heterogeneous Information Networks
Modeling Complex Social Networks
Optimization for Machine Learning
Privacy Issues with Machine Learning: Fears, Facts, and Opportunities
Survey of Boosting from an Optimization Perspective
The MASH Project