CS 584 Syllabus
Spring 2016
This course will cover seminal recent research papers across topics in distributed computer systems, with a focus on managing big data. Topics may include communication paradigms, process management, naming, synchronization, consistency and replication, fault tolerance, storage architectures, high-performance file systems, data provenance, and next-generation storage devices and architectures, including those at Google, Yahoo, and Amazon. Throughout the course, we will discuss the tradeoffs made between performance, reliability, scalability, robustness, and security.
Instructor
Prof. Avani Wildani (Dr. Will)
http://www.mathcs.emory.edu/~avani
Email: avani@mathcs.emory.edu
Office: MSC W412
Office Hours: T 4-6pm, Th 9-11am + by appt
Textbook
“Principles of Computer Systems Design: An Introduction” - Saltzer et al.
Exercises will be assigned out of this book, and it’s a good first resource.
Discussion
All discussion will be on Piazza at https://piazza.com/emory/spring2016/cs584/home
Grading
- Paper Summaries
- Paper Presentation
- Discussion
Summary Writing Instructions
A major component of this course will be the in-class discussion of papers on research in operating systems. Typically, you will need to read one paper per class; the reading list is available online, and all of the papers are available as links from the reading list. Read each paper carefully and, for the class meeting in which it will be discussed, prepare a short (1-2 paragraph) summary along with at least three questions or insightful comments about the material. The summary consists of brief answers to the following five questions; the three comments or questions form the sixth item:
- What is the problem the authors are trying to solve?
- Why is it interesting, relevant, and/or important?
- What other approaches or solutions existed at the time that this work was done?
- What was wrong with the other approaches or solutions?
- What is the authors' approach or solution, and how does it compare to earlier approaches or solutions?
- Three or more comments/questions about the paper.
Class Project
Students in the class must complete a research project in the general area of operating systems. Both a paper describing the project and a poster presentation will be required. This project should be the result of experimental research (strongly preferred) or a strong survey of prior art in a focused area.
Your project should take approximately 60–80 hours over the course of the semester, including time to read background material, build and run your experiments, and write up your results.
If you want to work with someone else in the class on your project, you may do so with prior approval (i.e., please see me before doing this). If you work with a partner, the expectations for the scope of your project will be adjusted accordingly.
There will be checkpoints during the semester to keep you on schedule to complete your project. The checkpoints are:
- February 11th - Project proposals (1 paragraph)
- March 15th - Bibliography
- March 31st - Research plan / Implementation outline
- April 25th - Final papers + posters due
Course Schedule
Date | Lecture | Readings | Presenter |
1/12 | Intro | Ch. 1 | None |
1/14 | Systems Background | Ch. 1 The UNIX Time Sharing System | Avani (slides) |
1/19 | Naming in Systems | Ch. 2.2, 2.3 LFS (Required), Plan9 (optional) | Sergio (slides) |
1/21 | Naming continued | Ch. 3.1, 3.3 Persistent Personal Names for Globally Connected Mobile Devices | Jason |
1/26 | DNS | Ch. 3.2 (skim), 4.4 A Survey of Naming and Routing in Information-Centric Networks (required) The Design and Implementation of an Intentional Naming System (optional) | Chris |
1/28 | Modularity in Networks (Client/Server model) | Ch. 4.1, 4.2 End-to-End Arguments in System Design | Artie |
2/2 | Modularity in memory and Virtualization | Ch. 5.1, 5.3, 5.4 Exokernel: An Operating System Architecture for Application-Level Resource Management (required), Efficient virtual memory for big memory servers (optional) | Avani |
2/4 | Virtual Machines | Ch. 5.2, 5.8 Xen | Jason |
2/9 | Performance | Ch. 6.1 SEDA: an architecture for well-conditioned, scalable internet services (required), Arrakis: The Operating System is the Control Plane (optional) | Clarissa |
2/11 | Scheduling (PROJECT PROPOSALS DUE) | Ch. 6.3 FlexSC: Flexible System Call Scheduling with Exception-Less System Calls | Artie |
2/16 | Networking Basics + SDNs | Ch. 7 overview, 7.1 OpenFlow: Enabling Innovation in Campus Networks (required) NOX: Towards an Operating System for Networks (Referred to during class) Ethernet: Distributed Packet Switching for Local Computer Networks (optional) | Sergio |
2/18 | Network Layers | Ch. 7.2-7.5 (skim 3-5) Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network (required) Condor: Better Topologies through Declarative Design (optional) | Sergio |
2/23 | Guest Speaker: Dr. Sunderam | No Paper! | None |
2/25 | Blockchains and P2P | Ch. 11.9, 11.11 Bitcoin: A Peer-to-Peer Electronic Cash System (required) Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications (optional; highly recommended) | Jinfei |
3/1 | Bitcoin Redux and Information Security | Ch. 11.1, 11.4, 11.5 | |
3/3 | | Ch. 7.6, 7.9 Scalability, Fidelity, and Containment in the Potemkin Virtual Honeyfarm TAO: Facebook’s Distributed Data Store for the Social Graph | Jinfei, Jason |
3/15 | Data Management (BIBLIOGRAPHY DUE) | Bigtable: A Distributed Storage System for Structured Data (required) f4: Facebook's Warm BLOB Storage System (optional) | Jinfei |
3/17 | File Systems | The Google File System (required for discussion) A Fast File System for UNIX (required background) | Chris |
3/22 | Atomicity / Consistency / Concurrency (The Paxos Lecture) | Ch. 9.1, 9.2, 9.4 Paxos Made Simple (required) In Search of an Understandable Consensus Algorithm (required) The Part-Time Parliament (Original Paxos paper: optional) | Avani |
3/24 | Caching | Ch. 10.2 Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility (required) ARC: A Self-Tuning, Low Overhead Replacement Cache (optional) | Chris |
3/29 | Storage Misc: DHTs, Object Storage, Tracing | IOFlow: a software-defined storage architecture | Clarissa |
3/31 | Fault Tolerance (RESEARCH PLAN DUE) | Ch. 8.1, 8.2, 9.7 Low-Overhead Byzantine Fault-Tolerant Storage | Artie |
4/5 | Fault Tolerance | Ch. 8.3, 8.5, 8.6, 8.8 (optional, but loads of fun) A case for redundant arrays of inexpensive disks (RAID) Ursa Minor: Versatile Cluster-based Storage | Jason |
4/7 | Distributed Systems | Tango: distributed data structures over a shared log Spanner: Google’s Globally-Distributed Database | Artie |
4/12 | Cloud Computing | MapReduce: Simplified Data Processing on Large Clusters | Clarissa |
4/14 | Data in the Cloud | Starfish: A Self-tuning System for Big Data Analytics | Roundtable |
4/19 | Systems in Nature | Computational principles of memory | Avani |
4/21 | Poster Presentations (FINAL PAPER DUE: 4/25) | None! | Everyone |