Spring 2020: Deep Learning: Syllabus and Schedule
Time/Location: Mon/Wed 2:30-3:45pm in room MCS B33 / Zoom
Sections: CAS CS 591 S1 (will be CS 523)
Instructor: Kate Saenko, saenko@bu.edu
Teaching Assistant: Samarth Mishra, samarthm@bu.edu
Lecture Zoom: see Piazza
Instructor Office Hours: Mon (4-5:30pm), Thu (10:30am-12pm) on Zoom (see Piazza)
Samarth’s Office Hours: Tue (1:30-3:00pm), Wed (4:00-5:30pm) on Zoom (see Piazza)
Piazza: registered students can access via piazza.com/bu/spring2020/cs591s1spr20
Schedule*
Date | Topic | Details | Homework |
Jan 22 Wed | 1. Course overview | What is deep learning? DL successes; syllabus & course logistics; course prerequisites; projects | hw1 out |
Jan 27 Mon | 2. Machine Learning Review I | Cost functions, hypotheses and tasks; training data; maximum likelihood based cost, cross entropy, MSE cost; Gradient descent. Reading: Goodfellow Ch5.9-5.10 | |
Jan 29 Wed | 3. Machine Learning Review II | Probability, continuous and discrete distributions; maximum likelihood. Reading: Goodfellow Ch5.1-5.6 | |
Feb 3 Mon | 4. Intro to Neural Networks | logistic regression; feed-forward networks; perceptron; neuroscience inspiration; output vs hidden layers; Reading: Goodfellow Ch6.1-6.3, Perceptron | |
Feb 5 Wed | 5. Learning in Neural Networks | learning via gradient descent; recursive chain rule (backpropagation); Reading: backprop notes, Goodfellow Ch6.5.1-6.5.8 | hw1 due (11:59pm) hw2 out |
Feb 10 Mon | 6. Deep Learning Strategies I | Universality, architecture choices; activation functions; training, regularization, etc; Reading: Goodfellow Ch6.1-6.4 | |
Feb 12 Wed | 7. Deep Learning Strategies II | Regularization, data augmentation, dropout, batch normalization; Reading: GF Ch7.1,7.4,7.5,7.8,7.12,8.7.1; Batch normalization paper | |
Feb 18 Tue | 8. PyTorch Intro guest lecture by Samarth Mishra | bring your laptop to class to follow along | |
Feb 19 Wed | 9. CNNs I | Convolutional neural networks; filters, pooling layers; Reading: Goodfellow Ch9.1-9.3 | hw2 due (11:59pm) hw3 out |
Feb 24 Mon | 10. CNNs II guest lecture by Bryan Plummer | Convolutional neural networks cont’d. LeNet to ResNet. | |
Feb 26 Wed | 11. RNNs I | recurrent neural networks; sequence modeling; backpropagation through time; vanishing/exploding gradient problem; gradient clipping, long-short term memory (LSTM). Reading: Goodfellow Ch10 | |
Mar 2 Mon | 12. RNNs II | more intuition about RNNs, LSTMs; toy addition problem; language modeling; bi-directional RNN. Reading: Goodfellow Ch10 | Project proposal due printed, in class |
Mar 4 Wed | 13. Unsupervised deep learning I | Autoencoders. Reading: Ch14 | hw3 due (11:59pm) hw4 out |
SPRING BREAK | | | |
Mar 16 Mon | 14. Unsupervised deep learning II | Generative Adversarial Networks, Reading: Ch 20.10.4, Goodfellow et al. 2014 | |
Mar 18 Wed | 15. Unsupervised deep learning III [lecture video] guest lecture by Ben Usman | Applications of generative models; Normalizing Flow models. Reading: see slides for references | |
Mar 23 Mon | 16. Variational Autoencoders I | Reading: Ch 20.10.3 | |
Mar 25 Wed | 17. Variational Autoencoders II | Cont’d, derivation of loss | hw4 due |
Mar 30 Mon | 18. Attention and Memory | Encoder-decoder RNNs, application to machine translation, attention; Reading: Neural Machine Translation by Jointly Learning to Align and Translate paper; http://distill.pub/2016/augmented-rnns/ (optional) | |
Apr 1 Wed | 19. Neural Turing Machines | Neural Turing Machines. Reading: Neural Turing Machines paper | hw4 due (11:59pm) hw5 out |
Apr 6 Mon | 20. Deep reinforcement learning II [LevineLec2] [LevineLec4]: watch before class. [lecture video] | In class: Q&A for material in assigned videos. Video “Reading”: Supervised learning of behaviors, Imitation learning (LevineLec2slides), overview of reinforcement learning, types of RL algorithms (LevineLec4slides) | Progress report due (11:59pm) Template Submit Here |
Apr 8 Wed | 21. Deep Reinforcement Learning III [LevineLec5] [LevineLec6]: watch before class. [lecture video] | In class: Q&A for material in assigned videos. Video “Reading”: Policy Gradient (LevineLec5slides), Actor-critic, Q-learning (LevineLec6slides) | |
Apr 13 Mon | 22. Applications I: Computer Vision | Object detection; semantic segmentation; video classification. Video Reading: Stanford’s cs231n, spring 2019, Lecture 12, video | |
Apr 15 Wed | 23. Applications II: Language and Vision | Image and video captioning, visual question answering, visual dialog, phrase grounding in images, visual navigation | hw5 EXTENDED: due Fri April 17 (11:59pm) hw6 out, optional |
Apr 20 Mon | NO CLASS (Patriots’ Day) | | |
Apr 22 Wed | 24. Applications III: NLP, Speech and Audio | Natural language processing (NLP) applications; self-attention, Transformer; Reading: Attention Is All You Need paper; WaveNet paper (optional) | |
Apr 27 Mon | Project presentations I: in class (80 min) | Teams present their project results; mandatory attendance for all | slides due (12:00pm) on Piazza presentation instructions |
Apr 29 Wed | Project Presentations II: in class (80 min) | Teams present their project results; mandatory attendance for all | |
May 1 Fri | No class | | Project report due at 5:00pm |
*schedule is tentative and is subject to change.
Course Description
This course is an introduction to deep learning, a branch of machine learning concerned with the development and application of modern neural networks. Deep learning algorithms extract layered high-level representations of data in a way that maximizes performance on a given task. For example, asked to recognize faces, a deep neural network may learn to represent image pixels first with edges, followed by larger shapes, then parts of the face like eyes and ears, and, finally, individual face identities. Deep learning is behind many recent advances in AI, including Siri’s speech recognition, Facebook’s tag suggestions, and self-driving cars. We will cover a range of topics, from basic neural networks and convolutional and recurrent network structures to deep unsupervised and reinforcement learning, with applications to problem domains such as text generation and computer vision.
Course Prerequisites
This is a graduate course that is also open to undergraduates with sufficient background. All students should have the following skills:
If you lack any of the above prerequisites, you probably should not take the course, so please talk to the instructor. The first homework will test your knowledge of basic machine learning concepts; please use it as a self-assessment if you are not sure whether you have enough background. Students must be able to do well on HW1 to do well in the course.
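As a concrete self-check of the gradient-descent background that HW1 assumes, here is a minimal, illustrative sketch (not a graded exercise): fitting y = w·x by minimizing the MSE cost with plain Python.

```python
# Minimal gradient-descent self-check (illustrative sketch, not course
# material): fit y = w * x by minimizing the mean-squared-error cost.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 2.0, 4.0, 6.0]   # generated from the true weight w = 2

def mse_grad(w):
    # d/dw mean((w*x - y)^2) = mean(2 * (w*x - y) * x)
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

w = 0.0       # initial weight
lr = 0.05     # learning rate
for _ in range(200):
    w -= lr * mse_grad(w)

print(round(w, 3))  # converges to ~2.0
```

If the update rule and the gradient derivation here look unfamiliar, review the gradient-descent material (e.g., Goodfellow Ch5) before HW1.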
Textbook
The required textbook for the course is Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (MIT Press, 2016), cited in the schedule above as “Goodfellow” or “GF”.
Other recommended supplemental textbooks on general machine learning:
Recommended online courses
Deliverables/Graded Work
There will be six homework assignments, each consisting of written and/or coding problems, and a final project. The project will be done in teams of 2 students and will have several deliverables including a proposal, progress updates, final report and a final in-class presentation. The course grade consists of the following (note, HW6 is now optional):
Software/Hardware
Programming assignments and projects will be developed in the Python programming language. We will also use the PyTorch deep learning library for some homeworks and for the project. Students are expected to use the Shared Computing Cluster (SCC) and/or their own machines to complete work that does not require a GPU. For the projects, we will provide GPU resources.
If you do not already have a CS account and would like one, see here: http://www.bu.edu/cs/resources/laboratories/undergraduate-lab/
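Before starting the PyTorch homeworks, it can help to verify your environment. The following is a minimal sketch; the torch import and CUDA check assume the standard PyTorch install described above, and the check simply reports it if PyTorch is missing.

```python
# Minimal environment check for the course software stack (illustrative
# sketch; degrades gracefully if PyTorch is not installed).
import sys

def check_env():
    report = {"python_ok": sys.version_info >= (3, 6)}
    try:
        import torch
        report["torch_version"] = torch.__version__
        # True on a GPU node with CUDA drivers (e.g., SCC csgpu queue hosts)
        report["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        report["torch_version"] = None
        report["cuda_available"] = False
    return report

if __name__ == "__main__":
    print(check_env())
```

Run this once on the SCC and once on your own machine; work that needs a GPU should only be launched where `cuda_available` is True.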
Checking GPU usage on SCC
scc2 ~ % qgpus -h
Usage -- gpus_new.pl [-v|-q qname|-h]
For example:
# Give full summary for each GPU node:
scc2 ~ % qgpus -v
# Give GPU availability for a particular queue:
scc2 ~ % qgpus -q csgpu
host     gpu_type  cpu_total  cpu_in_use  gpu_total  gpu_in_use  gpu_avail  queue_list
-------  --------  ---------  ----------  ---------  ----------  ---------  ---------------
scc-c12  P100      28         16          4          4           0          csgpu,csgpu-pub
scc-k11  V100      28         8           2          2           0          csgpu,csgpu-pub
Projects
The projects are open-ended and should be done in teams of two. The deliverables include:
The page length above is based on a 2-person team, and should be scaled accordingly if for some reason your team is larger. The project should involve significant implementation or derivation effort (in terms of written lines of code or proofs if theoretical in nature). For example, downloading existing code and running it on an existing dataset without additional implementation effort is not adequate. On the other hand, re-implementing a research paper from scratch and testing the implementation to try to reproduce the results is an example of a good project.
Some useful info: sample projects, project proposal template, update template, final report template (same as update but replace preliminary results with final results and conclusions).
Late Policy
Late work will incur the following penalties:
Academic Honesty Policy
The instructors take academic honesty very seriously. Cheating, plagiarism, and other misconduct may be subject to grading penalties, up to and including failing the course. Students enrolled in the course are responsible for familiarizing themselves with the detailed BU policy, available here. In particular, plagiarism is defined as follows and applies to all written materials and software, including material found online. Collaboration on homework is allowed, but it should be acknowledged, and you should always come up with your own solution rather than copying (copying is defined as plagiarism):
Plagiarism: Representing the work of another as one’s own. Plagiarism includes but is not limited to the following: copying the answers of another student on an examination, copying or restating the work or ideas of another person or persons in any oral or written work (printed or electronic) without citing the appropriate source, and collaborating with someone else in an academic endeavor without acknowledging his or her contribution. Plagiarism can consist of acts of commission (appropriating the words or ideas of another) or acts of omission (failing to acknowledge/document/credit the source or creator of words or ideas); see below for a detailed definition of plagiarism. It also includes colluding with someone else in an academic endeavor without acknowledging his or her contribution, using audio or video footage that comes from another source (including work done by another student) without permission and acknowledgement of that source.
Religious Observance
Students are permitted to be absent from class, including classes involving examinations, labs, excursions, and other special events, for purposes of religious observance. In-class, take-home and lab assignments, and other work shall be made up in consultation with the student’s instructors. More details on BU’s religious observance policy are available here.