Date | Schedule of Topics | Links + Notes |
9/4/2013 | Introduction Learning theory discussion Prediction with expert advice The halving algorithm Example: halving on permutations for “predictive sorting” | The Multiplicative Weights Update Method by Arora, Hazan and Kale Notes from Peter Bartlett’s Course |
9/9/2013 | Review of halving algorithm on permutations The weighted majority algorithm Proving a mistake bound on Weighted Majority | Scribe notes posted from Lecture 1 An open problem I posed regarding “learning rankings” from COLT 2010 Apparently a solution to this open problem by Elad Hazan, Satyen Kale, and Shai Shalev-Shwartz Homework to be released soon! It will be due 9/25/2013. |
9/11/2013 | WMA generalized: Exponential Weights Algorithm The “Hedge Setting” | Homework 1 is available, it’s due 9/25/2013. Notes from Yoav Freund’s course |
9/16/2013 | More the the action setting vs prediction setting A new “additive” bound Variants of weighted majority (different priors, fixed share forecaster, changing learning rates) Hannan-consistency and the no-regret property | Orig “Hedge Setting” introduced by Freund and Schapire’s original Adaboost paper I showed an “additive bound” and a “multiplicative bound” in class on 9/16. These can be found in the PLG book, Theorems 2.2 and 2.4. I needed an inequality in class (regarding log E[exp(sX)]) which I didn’t prove, but it’s pretty classical and you can find the result in Lemma A.1 in the PLG book. |
9/18/2013 | Fixed-share forecaster | This result is actually reasonably well-written up in the PLG book, section 5.2. Theorem 5.2 proves the bound on the hyperexperts. Theorem 5.1 makes the connection between the complex algorithm making hyperexpert weight updates, and the efficient version. |
9/23/2013 | Lower bounds for 2 and N experts Game Theory I Two-player games | The lower bound I proved in class, with the important details, can roughly be found in PLG section 3.7. The key lemmas are in Appendix A.1.6. |
9/25/2013 | Game Theory II Nash’s Theorem via Brouwer Fixed-point Theorem Guest Lecturer: Satyen Kale Zero-sum games von Neumann Minimax Theorem | Here are the notes proving Nash’s Theorem, which includes the proof of Sperner’s Lemma that can be used to establish the Brouwer Fixed-point Theorem. |
9/30/2013 | Game Theory III Proof of Minimax Thm using Hedge Alg Computing approximate equilibria | More on this trick can be found in the Arora/Hazan/Kale survey, where I first saw this technique. There’s a more general version proven in PLG section 7.2. |
10/2/2013 | Introduction to Boosting Boost by Majority Boosting as sequential duality optimization | There a new Boosting book out by Rob Schapire and Yoav Freund which is excellent. The relationship between online learning, the minimax theorem, and boosting, is in Chapter 6. |
10/7/2013 | More on Boosting The Perceptron Algorithm | There are a number of good writeups of the perceptron regret bound. Here is one. |
10/9/2013 | Recap for perceptron algorithm Log loss and gambling Constant Rebalanced Portfolios Universal portfolio strategies Cover’s UCRP algorithm and analysis | Tom Cover’s original paper on Universal Portfolios The source of the easier analysis of the UCRP algorithm, with discussions of transaction costs |
10/14/2013 | FALL BREAK - NO CLASS!!!!! | |
10/16/2013 | Guest Lecturer: Ambuj Tewari Introduction to Online Convex Optimization Be The Leader etc. | |
10/21/2013 | Review of Universal Portfolios Review of Online Convex Optimization model Examples applications of OCO Online Gradient Descent algorithm | Original OCO + gradient descent paper by Zinkevich Several good survey papers on OCO now, see links on course website |
10/23/2013 | Regret bound for Online Gradient Descent Guest Lecturer: Rafael Frongillo Game-theoretic Probability in Finance | Lecture based on ideas from this book |
10/28/2013 | More on algorithms for OCO Follow the Leader Application: Online Density Estimation | Earlly work by Azoury and Warmuth on density estimation General results for logarithmic regret algorithms by Hazan, Agarwal, and Kale |
10/30/2013 | Follow the Regularized Leader (FTRL) Generic regret bound for FTRL Guest visit from CRLT by Tershia | |
11/4/2013 | Online Convex Optimization III OCO Application 1: Convex Optimization OCO Application 2: risk bound in IID setting | The “online to batch conversion” idea started with this nips paper from 2004 |
11/6/2013 | Student HW2 presentations The multi-armed bandit setting Greedy algorithm | Nearly all results from the next couple of lectures can be found in Bubeck and Cesa-Bianchi’s survey of regret analysis for bandit problems |
11/11/2013 | The UCB Algorithm The adversarial bandit setting Definition of EXP3 Algorithm | Peter Auer was the first to propose the UCB algorithm for a finite-time analysis of the multi-armed bandit setting |
11/13/2013 | EXP3 Regret Bound and Proof Online linear optimization in the bandit setting | Bandit slides from class presentation |
11/18/2013 | Bandit OLO and the use of self-concordant regularization Introduction to Blackwell Approachability | My 2011 COLT paper does a reasonably good job describing the Approachability setting |
11/20/2013 | Blackwell Approachability Algorithm and Convergence Reductions between Approachability and No-Regret | The COLT paper describes the reduction using cone duality, but there has been some recent work that showed how to avoid the use of cones and instead uses an OCO reduction. But the paper doesn’t appear to be available yet, I’ll contact the authors for more. |
11/25/2013 | OFFICE HOUR - NO CLASS | |
11/27/2013 | PRE-THANKSGIVING - NO CLASS | |
12/2/2013 | Approachability, Calibrated Forecasting, and No Internal Regret | |
12/4/2013 | Reductions between learning problems Correlated Equilibria | |
12/9/2013 | STUDENT PRESENTATIONS | |
12/11/2013 | POSTER SESSION |