Machine Learning, Fall 2018



Time/Location: Tue/Thu 2:00-3:15 pm in room CAS B12

Course Number: CS 542

Instructor: Kate Saenko, saenko@bu.edu; office hours: T/Th 3:30-5pm in MCS-296

Teaching Fellows: Fred Fung (fung@bu.edu), office hours: EMA 302, T 5-6pm, W 2:25-3:25pm; Xingchao Peng (xpeng@bu.edu), office hours: EMA 302, M 5-6pm, Th 5-6pm

Graders: Sid Mysore, Ben Usman, Vitali Petsiuk

Piazza: piazza.com/bu/fall2018/cs542

Schedule*

Date | Topic | Details | Assignments |

THE BASICS | |||

Tue Sep 4 | Course Introduction | what is machine learning? types of learning; features; hypothesis; cost function; course information | |

Wed lab | Probability and Math Review | background knowledge on linear algebra and probability theory. Useful reference on matrix calculus; also see http://www.matrixcalculus.org/ | |

Thu Sep 6 | Supervised Learning I: Regression | regression, linear hypothesis, SSD cost; gradient descent; normal equations; maximum likelihood; Reading: Bishop 1.2-1.2.4,3.1-3.1.1 | ps0 out |

Tue Sep 11 | Supervised Learning II: Classification (guest lecture by Prof. Kulis) | classification; sigmoid function; logistic regression. Reading: Bishop 4.3.1-4.3.2; 4.3.4 | |

Wed lab | Multivariate Gaussian Review, Eigenvectors, ps0 | ||

Thu Sep 13 | Intro to Projects | project pitches from BU Spark! partners | ps0 due (11:55am Fri); ps1 out |

Tue Sep 18 | Supervised Learning III: Regularization | more logistic regression; regularization; bias-variance. Reading: Bishop 3.2; 3.1.4 | |

Wed lab | ps0 and Numpy Tutorial | ||

Thu Sep 20 | Unsupervised Learning I: Clustering | clustering, k-means, Gaussian mixtures. Reading: Bishop 9.1-9.2 | ps1 due (11:55am Fri); ps2 out |

Tue Sep 25 | Unsupervised Learning II: PCA | dimensionality reduction, PCA. Reading: Bishop 12.1 | |

Wed lab | ps1 Solution & ps2 Hints | ||

NEURAL NETWORKS | |||

Thu Sep 27 | Neural Networks I: Feed-forward Nets | artificial neuron, MLP, sigmoid units; neuroscience inspiration; output vs hidden layers; linear vs nonlinear networks; feed-forward neural networks. Reading: Bishop Ch 5.1-5.3 | ps2 due (11:55am Fri); ps3 out; project signup due (11:55am Fri) |

Tue Oct 2 | Neural Networks II: Learning | Learning via gradient descent; backpropagation algorithm. Reading: Bishop Ch 5.1-5.3 | teams assigned LINK |

Wed lab | TensorFlow Tutorial | Build a NN to classify iris. | |

Thu Oct 4 | Neural Networks III: Convolutional Nets | Convolutional networks. Reading: Bishop Ch 5.5 | ps3 due (11:55am Fri); ps4 out |

Tue Oct 9 | NO CLASS, LAST DAY TO DROP | ||

Wed lab | Convolutional Nets demo | ||

Thu Oct 11 | Neural Networks IV: Recurrent Nets | recurrent networks; training strategies | ps4 due (11:55am Fri) |

Tue Oct 16 | Computing cluster/TensorFlow Intro (guest lecture by Katia Oleinik) | Intro to SCC and TensorFlow; please bring laptops to class to follow along with the lecture and install software according to these instructions | |

Wed lab | Midterm Review | ||

Thu Oct 18 | Midterm | covers everything up to and including Neural Networks III; expect questions on material covered in lectures, problem sets, labs, and assigned reading | Midterm Practice Problems Solutions |

ADVANCED TOPICS | |||

Tue Oct 23 | Probabilistic Generative Models | generalized linear models; generative vs discriminative models; linear discriminant analysis; Reading: Bishop Ch 4.2 | |

Wed lab | |||

Thu Oct 25 | Bayesian Methods | priors over parameters; Bayesian linear regression; Reading: Bishop Ch 2.3 | ps5 out |

Tue Oct 30 | Support Vector Machines I | hinge loss, maximum margin method; support vector machines; Reading: Bishop Ch 7.1.1-7.1.2 | |

Wed lab | |||

Thu Nov 1 | Support Vector Machines II | Hinge loss vs. cross-entropy loss; primal SVM formulation; non-separable data; slack variables | project proposal due (in class); ps5 due (11:55am Fri); ps6 out |

Tue Nov 6 | Support Vector Machines III | Dual formulation; kernels; multiclass SVM; Reading: Bishop Ch 6.1-6.2, Ch 7.1.3 | |

Wed lab | Evaluation Metrics for ML | ||

Thu Nov 8 | Unsupervised Learning III: Anomaly Detection | Density estimation for anomaly detection; evaluating anomaly detection | ps6 due (11:55am Fri) |

Tue Nov 13 | Unsupervised Learning IV: GANs | Implicit generative models; adversarial methods; Generative Adversarial Nets (GANs); Reading: Goodfellow et al. NIPS 2014 | |

Wed lab | project help | ||

Thu Nov 15 | Reinforcement Learning I | reinforcement learning; Markov Decision Process (MDP); policies, value functions, Q-learning | project update I due (in class) |

Tue Nov 20 | Reinforcement Learning II | Q-learning cont’d; deep Q-learning (DQN) | |

Wed lab | NO CLASS- THANKSGIVING | ||

Thu Nov 22 | NO CLASS- THANKSGIVING | ||

APPLICATIONS | |||

Tue Nov 27 | Domain Adaptation for Visual Data | domain shift; domain adaptation; adversarial feature alignment | |

Wed lab | project help | ||

Thu Nov 29 | Language and Vision Applications | Image captioning, video captioning, visual question answering | project update II due (in class); self-grading due Fri Nov 30 |

Tue Dec 4 | Bias and Fairness in Machine Learning | Bias in machine learning, fairness, transparency, accountability; de-biasing image captioning models | |

Wed lab | Final Review | ||

Thu Dec 6 | Final Review | submit a course evaluation at | |

Tue Dec 11 | poster session 1:00-3:00pm in Hariri (there is another poster session starting right after so please take your posters down promptly) | project due Tue 11:55pm submission instructions Submit Here | |

Tue Dec 18 | Final exam 3:00pm-5:00pm CAS B12 | covers everything up to and including Reinforcement Learning II; expect questions on material covered in lectures, problem sets, labs, and assigned reading | Additional practice problems |

*schedule is tentative and is subject to change.

Syllabus

This course is an introduction to modern machine learning concepts, techniques, and algorithms. Topics include regression, classification, unsupervised and supervised learning, kernels, support vector machines, feature selection, clustering, sequence models, and Bayesian methods. Weekly labs and projects emphasize putting theory into practice through applications to real-world problems and data sets.

Course Pre-requisites

This is an upper-level undergraduate/intro graduate course and requires the following skills:

- Linear algebra (CAS CS 232 or MA 242 or equivalent)
- Calculus, including partial derivatives
- Probability (CAS CS 237 or MA 381 or 581 or equivalent)
- Working knowledge of programming (CAS CS 111 and 112, or equivalent)

Textbooks

The required textbook for the course is

- Bishop, C. M. Pattern Recognition and Machine Learning.

Other recommended supplemental textbooks on general machine learning:

- Duda, R.O., Hart, P.E., and Stork, D.G. Pattern Classification. Wiley-Interscience. 2nd Edition. 2001.
- Theodoridis, S. and Koutroumbas, K. Pattern Recognition. Edition 4. Academic Press, 2008.
- Russell, S. and Norvig, P. Artificial Intelligence: A Modern Approach. Prentice Hall Series in Artificial Intelligence. 2003.
- Hastie, T., Tibshirani, R. and Friedman, J. The Elements of Statistical Learning. Springer. 2001.
- Koller, D. and Friedman, N. Probabilistic Graphical Models. MIT Press. 2009.
- Ian Goodfellow, Yoshua Bengio, Aaron Courville. Deep Learning.

Recommended background reading on matrix calculus:

- Reference reading on matrix calculus and linear algebra can be found here
- Matrix derivatives cheat sheet

Recommended online courses

- https://www.coursera.org/learn/machine-learning/ : Andrew Ng’s introductory Machine Learning course on Coursera, which can be taken as a precursor to this course.

Deliverables/Graded Work

The main graded work for the course is the midterm, the final, and the project. There will also be six self-graded homework assignments, each consisting of written and programming problems, which are meant to prepare students for the two exams. The project will be done in teams of 4 students and will have several deliverables, including a proposal, progress update(s), code, a report, and a final in-class presentation. The course grade consists of the following:

- Homeworks 20%
- Midterm 25%
- Final 25%
- Project 30%
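
As a rough illustration of how these weights combine, the sketch below computes a weighted course grade in Python. Only the four weights come from the list above; the 0-100 scale, the example scores, and the function name `course_grade` are assumptions made for the example, and actual grading may differ in details not stated here.

```python
# Weights copied from the grade breakdown above.
WEIGHTS = {"homeworks": 0.20, "midterm": 0.25, "final": 0.25, "project": 0.30}


def course_grade(scores):
    """Return the weighted grade from component scores (assumed 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    # Weighted sum: each component contributes weight * score.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"homeworks": 90, "midterm": 80, "final": 70, "project": 100}
print(course_grade(example))
```

Because the weights sum to 1, the result stays on the same scale as the component scores.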

Piazza

This term we will be using Piazza for class discussion. The system is designed to get you help quickly and efficiently from classmates, the teaching fellows, and me. Rather than emailing questions to the teaching staff, I encourage you to post your questions on Piazza. If you have any problems or feedback for the developers, email team@piazza.com.

Projects

Students will apply their knowledge of machine learning to practical projects provided by local companies/nonprofits in collaboration with BU Spark! Teams will be able to choose from several projects and will interact with mentors from the partner institution as they develop their machine learning solution. The projects will culminate with a poster presentation at the end of term.

Project expectations:

- Students are expected (as part of homework) to attend the kickoff meeting with the client and to engage proactively with the client via Slack on any questions
- Students will submit an interim report
- Students will submit a final report and code to share with the partner

See the Projects page for more details.

Software/Hardware

Programming assignments will be developed in the Python programming language. You may use other languages for the projects, but note that the course staff may not be able to help answer questions specific to certain languages. If you do not already have a CS account and would like one, you should stop by the CS undergraduate lab (EMA 302) and activate one. This process takes only a few minutes, and can be done at any time during the lab's operating hours: <http://www.bu.edu/cs/resources/laboratories/undergraduate-lab/>

Late Policy

Late work will incur the following penalties:

- 20% off per day, up to 2 days

Academic Honesty Policy

The instructors take academic honesty very seriously. Cheating, plagiarism and other misconduct may be subject to grading penalties up to failing the course. Students enrolled in the course are responsible for familiarizing themselves with the detailed BU policy, available here. In particular, plagiarism is defined as follows and applies to all written materials and software, including material found online. Collaboration on homework is allowed, but should be acknowledged and you should always come up with your own solution rather than copying (which is defined as plagiarism):

Plagiarism: Representing the work of another as one’s own. Plagiarism includes but is not limited to the following: copying the answers of another student on an examination, copying or restating the work or ideas of another person or persons in any oral or written work (printed or electronic) without citing the appropriate source, and collaborating with someone else in an academic endeavor without acknowledging his or her contribution. Plagiarism can consist of acts of commission (appropriating the words or ideas of another) or omission (failing to acknowledge, document, or credit the source or creator of words or ideas) (see below for a detailed definition of plagiarism). It also includes colluding with someone else in an academic endeavor without acknowledging his or her contribution, using audio or video footage that comes from another source (including work done by another student) without permission and acknowledgement of that source.

Religious Observance

Students are permitted to be absent from class, including classes involving examinations, labs, excursions, and other special events, for purposes of religious observance. In-class, take-home and lab assignments, and other work shall be made up in consultation with the student’s instructors. More details on BU’s religious observance policy are available here.