Algorithms for Big Data

Instructor: Prof. John Augustine, Department of Computer Science and Engineering, IIT Madras. In this course, you will learn how to design and analyse algorithms in the streaming and property testing models of computation. The algorithms are analysed mathematically, so the course is intended for a mathematically mature audience with prior knowledge of algorithm design and basic probability theory.

Traditional algorithms work well when the input data fits entirely within memory. In many modern applications, however, the input is too large to fit within memory. In some cases, data is stored in large data centres or clouds, and specific parts of it can be accessed via queries. In other contexts, a very large volume of data may stream through a computer one item at a time, so the algorithm typically sees the data in a single pass and cannot store it for future reference. This course introduces computational models, algorithms and analysis techniques aimed at such big data contexts.
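
As a small illustration of the single-pass setting described above, the following is a minimal Python sketch of reservoir sampling (the topic of Lecture 21). It keeps a uniform random sample of k items from a stream of unknown length, using memory proportional to k and not to the stream. The function name, parameters and example stream are illustrative assumptions, not taken from the course materials.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, rng: random.Random) -> List[T]:
    """Return k items chosen uniformly at random from a stream.

    The stream is read exactly once and only k items are ever stored.
    (Hypothetical helper for illustration; this is the classic "Algorithm R".)
    """
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-indexed) replaces a stored item with probability k / (i + 1).
            j = rng.randint(0, i)  # randint includes both endpoints
            if j < k:
                reservoir[j] = item
    return reservoir


# A generator stands in for a data stream that can only be seen once.
sample = reservoir_sample((x * x for x in range(10_000)), k=5, rng=random.Random(0))
```

The design choice worth noting is that the algorithm never needs to know the stream length in advance, which is what makes it suitable for the one-item-at-a-time model.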

Lecture 42 - Property Testing and Random Walk Algorithms

Other lectures in this course:

Lecture 01 - Basic Definitions: Basics of Probability Theory
Lecture 02 - Conditional Probability
Lecture 03 - Examples - How to Use Probability to Solve Problems
Lecture 04 - Karger's Mincut Algorithm
Lecture 05 - Analysis of Karger's Mincut Algorithm
Lecture 06 - Random Variables
Lecture 07 - Randomized Quicksort
Lecture 08 - Problem Solving Example - The Rich Get Richer
Lecture 09 - Problem Solving Example - Monty Hall Problem
Lecture 10 - Bernoulli, Binomial, and Geometric Distributions
Lecture 11 - Tail Bounds
Lecture 12 - Application of the Chernoff Bound
Lecture 13 - Application of Chebyshev's inequality
Lecture 14 - Introduction to Big Data Algorithms
Lecture 15 - SAT Problem
Lecture 16 - Classification of States
Lecture 17 - Stationary Distribution of a Markov Chain
Lecture 18 - Celebrities Case Study
Lecture 19 - Random Walks on Undirected Graphs
Lecture 20 - Introduction to Streaming, Morris Algorithm
Lecture 21 - Reservoir Sampling
Lecture 22 - Approximate Median
Lecture 23 - Hashing and Pairwise Independence: Overview
Lecture 24 - Balls, Bins, Hashing
Lecture 25 - Chain Hashing, SUHA, Power of Two Choices
Lecture 26 - Bloom Filter
Lecture 27 - Pairwise Independence
Lecture 28 - Estimating Expectation of Continuous Function
Lecture 29 - Universal Hash Functions
Lecture 30 - Perfect Hashing
Lecture 31 - Count-Min Filter for Heavy Hitters in Data Streams
Lecture 32 - Problem Solving - Doubly Stochastic Transition Matrix
Lecture 33 - Problem Solving - Random Walks on Linear Structures
Lecture 34 - Problem Solving - Lollipop Graph
Lecture 35 - Problem Solving - Cat and Mouse
Lecture 36 - Estimating Frequency Moments
Lecture 37 - Property Testing Framework
Lecture 38 - Testing Connectivity
Lecture 39 - Property Testing: The Enforce and Test Technique
Lecture 40 - Testing if a Graph is a Biclique
Lecture 41 - Testing Bipartiteness
Lecture 42 - Property Testing and Random Walk Algorithms
Lecture 43 - Testing if a Graph is Bipartite (using Random Walks)
Lecture 44 - Graph Streaming Algorithms: Introduction
Lecture 45 - Graph Streaming Algorithms: Matching
Lecture 46 - Graph Streaming Algorithms: Graph Sparsification
Lecture 47 - MapReduce
Lecture 48 - K-Machine Model (aka Pregel Model)