Winter Semester 2014-2015
All participants should read the general reading papers.
The following topics are available:
(1) Parallel processing theory
Upper and lower bounds on the cost of a map-reduce computation
Making queries tractable on big data with preprocessing
(2) Query optimization
Opening the black boxes in data flow optimization
Continuous cloud-scale query optimization and processing
(3) Data-parallel programming
Jet: An Embedded DSL for High Performance Big Data Processing
(4) In-memory databases
HyPer: A Hybrid OLTP&OLAP Main Memory Database System Based on Virtual Memory Snapshots
OLTP through the looking glass, and what we found there
(5) Indexing
Only Aggressive Elephants are Fast Elephants
(6) Visualization and interaction
Enterprise Data Analysis and Visualization: An Interview Study
dbTouch: Analytics at your Fingertips
(7) Parallel machine learning
SystemML: Declarative machine learning on MapReduce
Distributed stochastic gradient descent
(8) Transactions
Spanner: Google’s globally-distributed database
Data-oriented transaction execution
(9) Graphs
GraphLab: A New Framework For Parallel Machine Learning
Spinning fast iterative dataflows
(10) In-database analytics
Towards a unified architecture for in-RDBMS analytics
ArrayStore: A Storage Manager for Complex Parallel Array Processing
(11) Distributed streaming
Integrating scale out and fault tolerance in stream processing using operator state management
Naiad: A Timely Dataflow System
(12) Databases on modern hardware
Hardware-Oblivious Parallelism for In-Memory Column-Stores
(13) Scheduling
Apache Hadoop YARN: Yet Another Resource Negotiator
Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center