CS 584 Syllabus
Fall 2017
This course will cover seminal recent research papers across topics in distributed computer systems, with a focus on managing big data. Topics may include communication paradigms, process management, naming, synchronization, consistency and replication, fault tolerance, storage architectures, high-performance file systems, data provenance, and next-generation storage devices and architectures, including those at Google, Yahoo, and Amazon. Throughout the course, we will discuss the tradeoffs made between performance, reliability, scalability, robustness, and security.
Instructor
Prof. Avani Wildani (Dr. Will)
Website: http://www.mathcs.emory.edu/~avani
Email: avani@mathcs.emory.edu
Office: MSC W412
Office Hours: By appointment
Textbook
“Principles of Computer Systems Design: An Introduction” - Saltzer et al.
Exercises will be assigned out of this book, and it’s a good first resource.
Discussion
All discussion will be on Piazza at https://piazza.com/emory/fall2017/cs584/home
Grading
- Paper Summaries
- Paper Presentation
- Discussion
- Class Project (50%)
- Homework (20%)
Summary Writing Instructions
A major component of this course will be the in-class discussion of papers on research in operating systems. Typically, you will need to read one paper per class; the reading list is available online, and all of the papers are available as links from the reading list. Read each paper carefully and, for the class meeting in which it will be discussed, prepare a short (1-2 paragraph) summary along with at least three questions or insightful comments about the material. The summary consists of brief answers to the first five questions below; the sixth item is your comments or questions:
- What is the problem the authors are trying to solve?
- Why is it interesting, relevant, and/or important?
- What other approaches or solutions existed at the time that this work was done?
- What was wrong with the other approaches or solutions?
- What is the authors' approach or solution, and how does it compare to earlier approaches or solutions?
- Three or more comments/questions about the paper.
Class Project
Students in the class must complete a research project in the general area of operating systems. Both a paper describing the project and a poster presentation will be required. The project should present the results of experimental research (strongly preferred) or a strong survey of prior art in a focused area.
Your project should take approximately 60–80 hours over the course of the semester, including time to read background material, build and run your experiments, and write up your results.
If you want to work with someone else in the class on your project, you may do so with prior approval (i.e., please see me before doing this). If you work with a partner, the expectations for the scope of your project will be adjusted accordingly.
ALL PAPERS MUST BE IN LATEX
There will be checkpoints during the semester to keep you on schedule to complete your project. Checkpoints:
9/26 : Project Proposal Due
10/12: Bibliography Due
10/31: Research Plan Due
11/30: Poster Presentation of Results
12/16: Paper Due
Note: this is the LAST DAY of the term. I am literally giving you all of the time I can, so there can be no extensions.
What is a Proposal?
In your project proposal, you will put together a few sentences (no more than a paragraph) that define a problem you find interesting and propose a project that addresses some part of that problem. This could include:
- Replicating an old study with new data or methods.
- Implementing a framework proposed in a paper and comparing empirical results with theory.
- Designing a new algorithm to solve an existing problem.
What is a Bibliography?
For your annotated bibliography, I want to see a list of sources and a sentence or two about what the paper/book/thesis/article contains that is relevant to your work. Since your paper will be in LaTeX, I recommend making a BibTeX bibliography. If you do, it’s perfectly fine to turn in the raw .bib file with your annotations in a @comment{} field below each entry.
For a 1-person project, I expect ~30 sources, but this number will vary based on topic area.
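As a rough illustration of the annotated .bib format described above, a single entry might look like the sketch below. The citation key and the annotation text are made up for illustration; the paper itself is one from the course reading list. Keeping braces out of the annotation text avoids confusing tools that expect balanced braces inside @comment.

```bibtex
@inproceedings{dean2004mapreduce,
  author    = {Jeffrey Dean and Sanjay Ghemawat},
  title     = {{MapReduce}: Simplified Data Processing on Large Clusters},
  booktitle = {Symposium on Operating Systems Design and Implementation (OSDI)},
  year      = {2004}
}
@comment{Illustrative annotation only: introduces the map/reduce programming
model and its fault-tolerant runtime. Relevant to my project as the baseline
system for large-scale batch data processing.}
```

BibTeX skips @comment blocks when building the reference list, so the same file can be used unchanged in the final paper.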
What is a Research Plan?
This will be a *detailed plan* for implementing your system. At this point, you should have an outline of the paper, with your introduction and background sections written. This is, in effect, your “Methods” section. It should include the data you intend to gather, the specific experiments you intend to run, the graphs you expect these experiments to produce, and possible extensions if these experiments do not go as planned. This should be at least 2 pages.
What goes on the poster?
Your poster should be an academic research poster that presents your problem statement, core ideas (e.g., an architecture diagram, a single set of equations, or a *very* succinctly written bulleted list), relevant graphs that support your hypothesis, and a few bullet points to help viewers interpret your graphs. There will be an in-class poster session on the last day of class that will be open to the entire department, so make certain you can discuss your ideas with clever non-experts!
Sample posters are available here and here. (links TBA)
What goes in the paper?
You will have read many research papers by this point in the term, and that is precisely what you will now be writing. An academic research paper typically has the following sections:
- Abstract: 4 sentences that identify the problem, say why it hasn’t been fixed until now, identify your solution, and tell us why your solution is awesome.
- Introduction: Motivate your problem domain and make people care. Include your main contributions up front and center.
- Background/Related work: Place your solution in context with existing work in the field.
- Methods: Explain your approach.
- Experiments: Prove your methods are fantastic.
- Discussion: Talk about what your results show and, more importantly, *don’t* show. What can be correctly extrapolated? Can your method be applied to a broader scope? What edge cases are you knowingly not covering?
- Conclusion: Short and sweet summary including the key take-aways you want the reader to have and possible expansions of this work in the future.
Course Schedule
Date | Lecture | Readings | Presenter |
8/24 | Intro | Ch. 1 | None |
8/29 | Systems Background | Ch. 1 The UNIX Time Sharing System | Avani (slides) |
8/31 | Naming in Systems | Ch. 2.2, 2.3 LFS (Required), Plan9 (optional) | Eli (slides) |
9/5 | Modularity in Networks (Client/Server model) | Ch. 4.1, 4.2 END-TO-END ARGUMENTS IN SYSTEM DESIGN | Liquan |
9/7 | Modularity in memory and Virtualization | Ch. 5.1, 5.3 Exokernel: An Operating System Architecture for Application-Level Resource Management (required), Efficient virtual memory for big memory servers (optional) | Pranav (slides) |
9/12 | Virtual Machines - CANCELLED for hurricane | Ch. 5.2, 5.8 Xen | |
9/14 | Performance | Ch. 6.1 When Slower is Faster: On Heterogeneous Multicores for Reliable Systems (required) Morpheus: Towards Automated SLOs for Enterprise Clusters (optional) | Huan |
9/19 | Scheduling | Ch. 6.3 Preemptive, Low Latency Datacenter Scheduling via Lightweight Virtualization | Reza |
9/21 | Networking Basics + SDNs (Guest Speaker: Sergio Gramacho) | Ch. 7 overview, 7.1 OpenFlow: Enabling Innovation in Campus Networks (required) NOX: Towards an Operating System for Networks (referred to during class) | Pranav |
9/26 | Networking Continued (PROJECT PROPOSALS DUE) | Ch. 7.2-7.5 (skim 3-5) Taking the Edge off with Espresso: Scale, Reliability and Programmability for Global Internet Peering (required) Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network (optional) | |
9/28 | Cloud Computing 1: Infrastructure | Large-scale cluster management at Google with Borg | |
10/3 | Blockchains (Guest Speaker: Jason!) | Bitcoin: A Peer-to-Peer Electronic Cash System | Eli |
10/5 | Bitcoin Redux and Information Security (Guest Speaker: Prof. Ymir Vigfusson) | Ch. 11.1, 11.4, 11.5 | N/A |
10/10 | Fall Break | | |
10/12 | Cloud Computing 2: Data Processing, Serverless Models (BIBLIOGRAPHY DUE) | MapReduce: Simplified Data Processing on Large Clusters | Everyone |
10/17 | Project Work Day (No Class) | | |
10/19 | Data Management | Ch. 7.6, 7.9 Bigtable: A Distributed Storage System for Structured Data (required) f4: Facebook's Warm BLOB Storage System (optional) | Reza |
10/24 | File Systems | The Google File System (required for discussion) A Fast File System for UNIX (required background) | Pranav |
10/26 | Atomicity / Consistency / Concurrency (The Paxos Lecture) | Ch. 9.1, 9.2, 9.4 Paxos Made Simple (required) In Search of an Understandable Consensus Algorithm (required) The Part-Time Parliament (Original Paxos paper: optional) | Avani |
10/31 | Caching (RESEARCH PLAN DUE) | Ch. 10.2 Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility (required) ARC: A Self-Tuning, Low Overhead Replacement Cache (optional) | Huan |
11/2 | Fault Tolerance 1 | | |
11/7 | Timing | Ch. 8.1, 8.2, 9.7 Low-Overhead Byzantine Fault-Tolerant Storage Ch 9.3 Time, clocks, and the ordering of events in a distributed system | Reza / Avani |
11/9 | Fault Tolerance 2 | Ch. 8.3, 8.5, 8.6 8.8 (optional, but loads of fun) A case for redundant arrays of inexpensive disks (RAID) (required) Ursa Minor: Versatile Cluster-based Storage | Reza |
11/14 | Fault Tolerance 2 | Ch. 8.3, 8.5, 8.6 8.8 (optional, but loads of fun) A case for redundant arrays of inexpensive disks (RAID) (required) Ursa Minor: Versatile Cluster-based Storage | Reza |
11/16 | Distributed Systems | Tango: distributed data structures over a shared log Spanner: Google’s Globally-Distributed Database (optional) | Huan |
11/21 | Data in the Cloud | Ceph | Eli |
11/28 | Storage Misc: DHTs, Object Storage, Tracing | Crystal: Software-Defined Storage for Multi-Tenant Object Stores | Pranav |
11/30 | Systems in Nature | Computational principles of memory | Avani |
12/5 | Poster Presentations (FINAL PAPER DUE: 12/16) | None! | |