Graphical Models for the Internet

Graphical Models for the Internet by Alexander Smola - Machine Learning Summer School at Purdue, 2011. Information extraction from web pages, social networks, news and user interactions relies crucially on inferring the hidden parameters of interaction between entities. For instance, in factorization models for movie recommendation we are interested in the underlying hidden properties of users and of movies, so that we can suggest new movies. Likewise, when extracting topics from web pages we want to find the hidden topics that represent documents and words. Finally, when modeling user behavior it is worthwhile to find the latent factors, cluster variables, causes, etc. that drive a user's interaction with websites.

All these problems can be described in a coherent statistical framework. While much has been published about how to deal with these problems at moderate sizes, there is little information available on how to perform efficient, scalable estimation at the scale of the internet. In this tutorial we present both the theory and the algorithms for achieving these goals. In particular, we will describe inference algorithms for collaborative filtering, recommendation, latent Dirichlet allocation, and advanced clustering models. The course is self-contained and also covers the basics of inference with graphical models.

Lecture 1 - Systems: Hardware, Storage and processing, Communication and synchronization
Lecture 2 - Communication and synchronization, Applications on the Internet, Probabilistic modeling
Lecture 3 - Probabilistic modeling: Naive Bayes, Density estimation
Lecture 4 - Directed graphical models
Lecture 5 - Directed graphical models: Clustering, Expectation Maximization, Sampling
Lecture 6 - Scalable topic models, Advanced modeling
Lecture 7 - Advanced modeling (cont.)
Lecture 8 - Undirected graphical models

Machine Learning Summer School at Purdue, 2011
A Machine Learning Approach for Complex Information Retrieval Applications
A Short Course on Reinforcement Learning
Classic and Modern Data Clustering
Divide and Recombine for the Analysis of Big Data
Graphical Models for the Internet
Introduction to Machine Learning
Large-Scale Machine Learning and Stochastic Algorithms
Machine Learning for a Rainy Day
Machine Learning for Discovery in Legal Cases
Machine Learning for Statistical Genetics
Mining Heterogeneous Information Networks
Modeling Complex Social Networks
Optimization for Machine Learning
Privacy Issues with Machine Learning: Fears, Facts, and Opportunities
Survey of Boosting from an Optimization Perspective
The MASH Project