CS 584 Syllabus

Spring 2016

This course will cover seminal recent research papers across topics in distributed computer systems, with a focus on managing big data. Topics may include communication paradigms, process management, naming, synchronization, consistency and replication, fault tolerance, storage architectures, high-performance file systems, data provenance, and next-generation storage devices and architectures, including those at Google, Yahoo, and Amazon.  Throughout the course, we will discuss the tradeoffs made between performance, reliability, scalability, robustness, and security.  

Instructor

Prof. Avani Wildani (Dr. Will)

http://www.mathcs.emory.edu/~avani

Email: avani@mathcs.emory.edu

Office: MSC W412

Office Hours: T 4-6pm, Th 9-11am + by appt

Textbook

“Principles of Computer Systems Design: An Introduction” - Saltzer et al.

Exercises will be assigned from this book, and it is a good first resource.

Discussion

All discussion will be on Piazza at https://piazza.com/emory/spring2016/cs584/home

Grading                                                

Summary Writing Instructions                                        

                                                                   

A major component of this course will be the in-class discussion of papers on research in operating systems. Typically, you will need to read one paper per class; the reading list is available online, and all of the papers are available as links from the reading list. Read each paper carefully and, for the class meeting in which it will be discussed, prepare a short (1-2 paragraph) summary along with at least three questions or insightful comments about the material. The summary consists of brief answers to the first five questions below; the sixth item is your three or more comments or questions about the paper:

  1. What is the problem the authors are trying to solve?
  2. Why is it interesting, relevant, and/or important?
  3. What other approaches or solutions existed at the time that this work was done?
  4. What was wrong with the other approaches or solutions?
  5. What is the authors' approach or solution, and how does it compare to earlier approaches or solutions?
  6. Three or more comments/questions about the paper.                                                

Class Project

                                        

Students in the class must complete a research project in the general area of operating systems. Both a paper describing the project and a poster presentation will be required. The project should present the results of experimental research (strongly preferred) or a strong survey of prior art in a focused area.

Your project should take approximately 60–80 hours over the course of the semester, including time to read background material, build and run your experiments, and write up your results.

If you want to work with someone else in the class on your project, you may do so with prior approval (i.e., please see me before doing this). If you work with a partner, the expectations for the scope of your project will be adjusted accordingly.

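There will be checkpoints during the semester to keep you on schedule to complete your project. The checkpoints, as listed in the course schedule, are:

  1. Project proposals due: 2/11
  2. Bibliography due: 3/15
  3. Research plan due: 3/31
  4. Poster presentations: 4/21
  5. Final paper due: 4/25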

Course Schedule

Date | Lecture | Readings | Presenter
1/12 | Intro | Ch. 1 | None
1/14 | Systems Background | Ch. 1; The UNIX Time Sharing System | Avani (slides)
1/19 | Naming in Systems | Ch. 2.2, 2.3; LFS (Required), Plan9 (optional) | Sergio (slides)
1/21 | Naming continued | Ch. 3.1, 3.3; Persistent Personal Names for Globally Connected Mobile Devices | Jason
1/26 | DNS | Ch. 3.2 (skim), 4.4; A Survey of Naming and Routing in Information-Centric Networks (required); The Design and Implementation of an Intentional Naming System (optional) | Chris
1/28 | Modularity in Networks (Client/Server model) | Ch. 4.1, 4.2; END-TO-END ARGUMENTS IN SYSTEM DESIGN | Artie
2/2 | Modularity in memory and Virtualization | Ch. 5.1, 5.3, 5.4; Exokernel: An Operating System Architecture for Application-Level Resource Management (required); Efficient virtual memory for big memory servers (optional) | Avani
2/4 | Virtual Machines | Ch. 5.2, 5.8; Xen | Jason
2/9 | Performance | Ch. 6.1; SEDA: an architecture for well-conditioned, scalable internet services (required); Arrakis: The Operating System is the Control Plane (optional) | Clarissa
2/11 | Scheduling; PROJECT PROPOSALS DUE | Ch. 6.3; FlexSC: Flexible System Call Scheduling with Exception-Less System Calls | Artie
2/16 | Networking Basics + SDNs | Ch. 7 overview, 7.1; OpenFlow: Enabling Innovation in Campus Networks (required); NOX: Towards an Operating System for Networks (Referred to during class); Ethernet: Distributed Packet Switching for Local Computer Networks (optional) | Sergio
2/18 | Network Layers | Ch. 7.2-7.5 (skim 3-5); Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network (required); Condor: Better Topologies through Declarative Design (optional) | Sergio
2/23 | Guest Speaker: Dr. Sunderam | No Paper! | None
2/25 | Blockchains and P2P | Ch. 11.9, 11.11; Bitcoin: A Peer-to-Peer Electronic Cash System (required); Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications (optional, highly recommended) | Jinfei
3/1 | Bitcoin Redux and Information Security | Ch. 11.1, 11.4, 11.5 |
3/3 |  | Ch. 7.6, 7.9; Scalability, Fidelity, and Containment in the Potemkin Virtual Honeyfarm; TAO: Facebook’s Distributed Data Store for the Social Graph | Jinfei, Jason
3/15 | Data Management; BIBLIOGRAPHY DUE | Bigtable: A Distributed Storage System for Structured Data (required); f4: Facebook's Warm BLOB Storage System (optional) | Jinfei
3/17 | File Systems | The Google File System (required for discussion); A Fast File System for UNIX (required background) | Chris
3/22 | Atomicity / Consistency / Concurrency (The Paxos Lecture) | Ch. 9.1, 9.2, 9.4; Paxos Made Simple (required); In Search of an Understandable Consensus Algorithm (required); The Part-Time Parliament (Original Paxos paper: optional) | Avani
3/24 | Caching | Ch. 10.2; Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility (required); ARC: A Self-Tuning, Low Overhead Replacement Cache (optional) | Chris
3/29 | Storage Misc: DHTs, Object Storage, Tracing | IOFlow: a software-defined storage architecture | Clarissa
3/31 | Fault Tolerance; RESEARCH PLAN DUE | Ch. 8.1, 8.2, 9.7; Low-Overhead Byzantine Fault-Tolerant Storage | Artie
4/5 | Fault Tolerance | Ch. 8.3, 8.5, 8.6, 8.8 (optional, but loads of fun); A case for redundant arrays of inexpensive disks (RAID); Ursa Minor: Versatile Cluster-based Storage | Jason
4/7 | Distributed Systems | Tango: distributed data structures over a shared log; Spanner: Google’s Globally-Distributed Database | Artie
4/12 | Cloud Computing | MapReduce: Simplified Data Processing on Large Clusters | Clarissa
4/14 | Data in the Cloud | Starfish: A Self-tuning System for Big Data Analytics | Roundtable
4/19 | Systems in Nature | Computational principles of memory | Avani
4/21 | Poster Presentations; FINAL PAPER DUE: 4/25 | None! | Everyone