CS 584 Syllabus

Fall 2017

This course will cover seminal recent research papers across topics in distributed computer systems, with a focus on managing big data. Topics may include communication paradigms, process management, naming, synchronization, consistency and replication, fault tolerance, storage architectures, high-performance file systems, data provenance, and next-generation storage devices and architectures, including those at Google, Yahoo, and Amazon.  Throughout the course, we will discuss the tradeoffs made between performance, reliability, scalability, robustness, and security.  

Instructor

Prof. Avani Wildani (Dr. Will)

http://www.mathcs.emory.edu/~avani

Email: avani@mathcs.emory.edu

Office: MSC W412

Office Hours: By appointment

Textbook

“Principles of Computer Systems Design: An Introduction” - Saltzer et al.

Exercises will be assigned out of this book, and it’s a good first resource.

Discussion

All discussion will be on Piazza at https://piazza.com/emory/fall2017/cs584/home

Grading                                                

Summary Writing Instructions                                        

                                                                   

A major component of this course will be the in-class discussion of papers on research in operating systems. Typically, you will need to read one paper per class; the reading list is available online, and all of the papers are linked from it. Read each paper carefully and, for the class meeting in which it will be discussed, prepare a short (1-2 paragraph) summary along with at least three questions or insightful comments about the material. The summary consists of brief answers to the first five questions below; the sixth item is your comments or questions:

  1. What is the problem the authors are trying to solve?
  2. Why is it interesting, relevant, and/or important?
  3. What other approaches or solutions existed at the time that this work was done?
  4. What was wrong with the other approaches or solutions?
  5. What is the authors' approach or solution, and how does it compare to earlier approaches or solutions?
  6. Three or more comments/questions about the paper.                                                

Class Project

                                        

Students in the class must complete a research project in the general area of operating systems. Both a paper describing the project and a poster presentation will be required. The project should present the results of experimental research (strongly preferred) or a strong survey of prior art in a focused area.

Your project should take approximately 60–80 hours over the course of the semester, including time to read background material, build and run your experiments, and write up your results.

If you want to work with someone else in the class on your project, you may do so with prior approval (i.e., please see me before doing this). If you work with a partner, the expectations for the scope of your project will be adjusted accordingly.

ALL PAPERS MUST BE IN LATEX

There will be checkpoints during the semester to keep you on schedule to complete your project. Checkpoints:

9/26:  Project Proposal Due
10/12: Bibliography Due
10/31: Research Plan Due
11/30: Poster Presentation of Results
12/16: Paper Due

Note: this is the LAST DAY of the term.  I am literally giving you all of the time I can, so there can be no extensions.

What is a Proposal?

In your project proposal, you will put together a few sentences (no more than a paragraph) that define a problem you find interesting and propose a project that addresses some part of that problem.  This could include:

What is a Bibliography?

For your annotated bibliography, I want to see a list of sources, each with a sentence or two about what the paper/book/thesis/article contains that is relevant to your work.  Since your paper will be in LaTeX, I recommend making a BibTeX bibliography.  If you do, it’s perfectly fine to turn in the raw .bib file with your annotations in a @comment{} block below each entry.
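As a rough sketch of what one annotated entry in a .bib file could look like: the citation key and the annotation text below are placeholders I made up for illustration, not required wording. The paper itself is one from the reading list.

```bibtex
@article{lamport1978time,
  author  = {Leslie Lamport},
  title   = {Time, Clocks, and the Ordering of Events in a Distributed System},
  journal = {Communications of the ACM},
  volume  = {21},
  number  = {7},
  pages   = {558--565},
  year    = {1978}
}
@comment{Placeholder annotation: introduces logical clocks and the
happened-before relation. Relevant to my project because I need to order
events across nodes without synchronized physical clocks.}
```

BibTeX skips the contents of an @comment block, so the annotations will not show up in your compiled paper.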

For a 1-person project, I expect ~30 sources, but this number will vary based on topic area.

What is a Research Plan?

This will be a *detailed plan* for implementing your system.  At this point, you should have an outline of the paper, with your introduction and background sections written.  This is, in effect, your “Methods” section.  It should include the data you intend to gather, the specific experiments you intend to run, the graphs you expect these experiments to produce, and possible extensions if these experiments do not go as planned.  This should be at least 2 pages.

What goes on the poster?

Your poster should be an academic research poster that presents your problem statement, core ideas (e.g., an architecture diagram, a single set of equations, or a *very* succinctly written bulleted list), relevant graphs that support your hypothesis, and a few bullet points to help viewers interpret your graphs.  There will be an in-class poster session on the last day of class that will be open to the entire department, so make certain you can discuss your ideas with clever non-experts!

Sample posters are available here and here. (links TBA)

What goes in the paper?

You will have read many research papers by this point in the term, and that is precisely what you will now be writing.  An academic research paper typically has the following sections:

Course Schedule

| Date | Lecture | Readings | Presenter |
|---|---|---|---|
| 8/24 | Intro | Ch. 1 | None |
| 8/29 | Systems Background | Ch. 1; The UNIX Time Sharing System | Avani (slides) |
| 8/31 | Naming in Systems | Ch. 2.2, 2.3; LFS (required); Plan9 (optional) | Eli (slides) |
| 9/5 | Modularity in Networks (Client/Server model) | Ch. 4.1, 4.2; End-to-End Arguments in System Design | Liquan |
| 9/7 | Modularity in Memory and Virtualization | Ch. 5.1, 5.3; Exokernel: An Operating System Architecture for Application-Level Resource Management (required); Efficient virtual memory for big memory servers (optional) | Pranav (slides) |
| 9/12 | Virtual Machines - CANCELLED for hurricane | Ch. 5.2, 5.8; Xen | |
| 9/14 | Performance | Ch. 6.1; When Slower is Faster: On Heterogeneous Multicores for Reliable Systems (required); Morpheus: Towards Automated SLOs for Enterprise Clusters (optional) | Huan |
| 9/19 | Scheduling | Ch. 6.3; Preemptive, Low Latency Datacenter Scheduling via Lightweight Virtualization | Reza |
| 9/21 | Networking Basics + SDNs (Guest Speaker: Sergio Gramacho) | Ch. 7 overview, 7.1; OpenFlow: Enabling Innovation in Campus Networks (required); NOX: Towards an Operating System for Networks (referred to during class) | Pranav |
| 9/26 | PROJECT PROPOSALS DUE; Networking Continued | Ch. 7.2-7.5 (skim 3-5); Taking the Edge off with Espresso: Scale, Reliability and Programmability for Global Internet Peering (required); Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network (optional) | |
| 9/28 | Cloud Computing 1: Infrastructure | Large-scale cluster management at Google with Borg | |
| 10/3 | Blockchains (Guest Speaker: Jason!) | Bitcoin: A Peer-to-Peer Electronic Cash System | Eli |
| 10/5 | Bitcoin Redux and Information Security (Guest Speaker: Prof. Ymir Vigfusson) | Ch. 11.1, 11.4, 11.5 | N/A |
| 10/10 | Fall Break | | |
| 10/12 | Cloud Computing 2: Data Processing, Serverless Models; BIBLIOGRAPHY DUE | MapReduce: Simplified Data Processing on Large Clusters | Everyone |
| 10/17 | Project Work Day (No Class) | | |
| 10/19 | Data Management | Ch. 7.6, 7.9; Bigtable: A Distributed Storage System for Structured Data (required); f4: Facebook's Warm BLOB Storage System (optional) | Reza |
| 10/24 | File Systems | The Google File System (required for discussion); A Fast File System for UNIX (required background) | Pranav |
| 10/26 | Atomicity / Consistency / Concurrency (The Paxos Lecture) | Ch. 9.1, 9.2, 9.4; Paxos Made Simple (required); In Search of an Understandable Consensus Algorithm (required); The Part-Time Parliament (original Paxos paper, optional) | Avani |
| 10/31 | Caching; RESEARCH PLAN DUE | Ch. 10.2; Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility (required); ARC: A Self-Tuning, Low Overhead Replacement Cache (optional) | Huan |
| 11/2 | Fault Tolerance 1 | | |
| 11/7 | Timing | Ch. 8.1, 8.2, 9.7 Low-Overhead Byzantine Fault-Tolerant Storage; Ch. 9.3 Time, clocks, and the ordering of events in a distributed system | Reza / Avani |
| 11/9 | Fault Tolerance 2 | Ch. 8.3, 8.5, 8.6; Ch. 8.8 (optional, but loads of fun); A case for redundant arrays of inexpensive disks (RAID) (required); Ursa Minor: Versatile Cluster-based Storage | Reza |
| 11/14 | Fault Tolerance 2 | Ch. 8.3, 8.5, 8.6; Ch. 8.8 (optional, but loads of fun); A case for redundant arrays of inexpensive disks (RAID) (required); Ursa Minor: Versatile Cluster-based Storage | Reza |
| 11/16 | Distributed Systems | Tango: distributed data structures over a shared log; Spanner: Google’s Globally-Distributed Database (optional) | Huan |
| 11/21 | Data in the Cloud | Ceph | Eli |
| 11/28 | Storage Misc: DHTs, Object Storage, Tracing | Crystal: Software-Defined Storage for Multi-Tenant Object Stores | Pranav |
| 11/30 | Systems in Nature | Computational principles of memory | Avani |
| 12/5 | Poster Presentations; FINAL PAPER DUE: 12/16 | None! | |