Stanford InfoLab InfoQual 2017

InfoQual is an oral exam in which a committee of three faculty (InfoLab or otherwise) questions the student on five topics. The exam takes 1.5 hours and covers all five chosen topics. The topics from past exams are listed below; more topics can be added, subject to approval of the InfoLab faculty. Details of the oral exam (faculty involved, scheduling, and topics) are coordinated by the student and the student's advisor. The student is expected to be able to explain and critique in detail the readings for each topic, including the premises, contributions, impact, and shortcomings of the work. The outcome of the entire oral exam is a simple pass or fail: there is no score, and it is not possible to pass some topics and fail others.

If you have any questions about the InfoQual, talk to Jure Leskovec (jure@cs.stanford.edu).

How to register for the InfoQual

In the table below, please fill in the time, the exam location, and the topic and professor for each of the five oral topics. Students should check with committee members about their availability and the topics they will examine. Students also need to book a room for the oral exam.

Student, time, and room

Oral topic 1

Oral topic 2

Oral topic 3

Oral topic 4

Oral topic 5

John Emmons, 10-11am on June 5-6, rooms 498 and 459

Streaming Analytics Engines (matei)

Parallel Model Training (peter)

Model Serving and Fast Inference (peter)

Networking (keith)

Data Compression (keith)

Daniel Kang (June 13, 2PM)

Model Serving and Fast Inference (Peter)

Parallel Model Training (Peter)

Design of Parallel Dataflow Engines (Matei)

MapReduce, Pig, Hive (Hector)

Interactive Data Analysis Systems (Matei)

Kexin Rong (Jun 9, 10:30AM)

Finding similar items (Jeff)

Frequent Itemsets (Jeff)

Interactive Data Analysis Systems (Hector)

Parallel Model Training (Peter)

Model Serving and Fast Inference (Peter)

Edward Gan (Jun 9, 9:00AM)

Finding similar items (Jeff)

Interactive Data Analysis Systems (Hector)

Model Serving and Fast Inference (Peter)

Sketching and synopses (Peter)

Design of Parallel Dataflow Engines (Peter)

Sen Wu (Jun 13, 4:00PM, Gates 200)

Finding similar items (Jeff)

Frequent Itemsets (Jeff)

Web Search (Hector)

Parallel Database Systems (Hector)

Link Analysis (Jure)

EXAMPLE:

John (May 22, 9:00AM)

Similar Items (Jeff)

Power Laws and Preferential Attachment  (Jure)

Recommendation Algorithms (Ashish)

Theory of MapReduce (Jeff)

MapReduce, Pig, Hive (Hector)

Topics for the oral exam

These topics were used by previous students who took the InfoQual. Topics can also be added or customized: the student has to find a professor who will prepare and approve the list of papers and then quiz the student on it.

Web Search (Approved by Hector)

Small-world networks and Decentralized search (Approved by Jure)

Power-laws and Preferential attachment (Approved by Jure)

Link Analysis (Approved by Jure)

Community structure in networks (Approved by Jure)

Cascading Behavior in Networks (Approved by Jure)

Frequent Itemsets (Approved by Jeff)

Finding similar items (Approved by Jeff)

Distributed Graph Computation Systems -- Jennifer, Peter, Matei

MapReduce, Pig, Hive -- Hector or Jennifer

Parallel Database Systems -- Hector, Peter, Matei

Theory of MapReduce -- Jeff

Conjunctive Query Containment -- Jeff

Graph Query Languages and Data Models -- Jennifer

Visual Data Analytics -- Andreas

Interactive Data Analysis Systems -- Hector, Matei, Peter

Crowdsourcing Algorithms -- Hector

Crowdsourcing Systems -- Jennifer

Peer to Peer -- Hector, Matei, Peter

Distributed Transactions -- Peter, Matei

Streaming Analytics Engines -- Peter, Matei

Parallel Model Training -- Peter, Matei

Resource-Constrained Query Processing -- Peter, Matei

Sketching and Synopses -- Peter, Matei

Model Serving and Fast Inference -- Peter, Matei

Design of Parallel Dataflow Engines -- Peter, Matei

Replication, Consensus, Distributed Agreement -- Peter, Matei

Internet + Information Integration -- Peter, Matei

Languages for Analytics -- Peter, Matei

Materialized Views -- Peter, Matei


Additional non-core InfoLab topics

Networking

Quizzer: Keith Winstein

Quizzees: John Emmons

Data Compression

Quizzer: Keith Winstein

Quizzees: John Emmons

Inference surrounding linear regression models (approved by Lester Mackey)

Quizzer: Lester Mackey

Quizzees: Hima

Feature selection and variable importance (approved by Lester Mackey)

Quizzer: Lester Mackey

Quizzees: Hima

Sentiment Analysis (approved by Dan Jurafsky)

Quizzer: Dan Jurafsky

Quizzees: Bob, Hima

- Bing Liu. "Sentiment Analysis and Subjectivity." In the Handbook of Natural Language Processing, Second Edition. March, 2010. http://www.cs.uic.edu/~liub/FBS/NLP-handbook-sentiment-analysis.pdf

- Danescu-Niculescu-Mizil, Cristian, Gueorgi Kossinets, Jon Kleinberg, Lillian Lee. 2009. How opinions are received by online communities: A case study on Amazon.com helpfulness votes. Proceedings of WWW, 141-150.

- Pang, Bo, Lee, Lillian, and Vaithyanathan, Shivakumar. 2002. Thumbs up? Sentiment classification using machine learning techniques. EMNLP 2002.

Information Extraction (approved by Dan Jurafsky)

Quizzer: Dan Jurafsky

Quizzees: Bob

- IE chapter in Dan J’s NLP book

- Banko, M. and Etzioni, O. The tradeoffs between traditional and open relation extraction. In ACL 2008.

- Weld, D., Wu, F., Adar, E., Amershi, S., Fogarty, J., Hoffmann, R., Patel, K. and Skinner, M. Intelligence in Wikipedia. In AAAI 2008.

Social Choice Theory (Approved by Ashish Goel)

Quizzer: Ashish Goel

Quizzees: Christie, Peter

-- Y. Shoham and K. Leyton-Brown, Aggregating Preferences: Social Choice, Chapter in Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations.

-- A. Altman and M. Tennenholtz,  Ranking Systems: The PageRank Axioms (EC 2005).

-- K. Arrow, A Difficulty in the Concept of Social Welfare (Journal of Political Economy 1950). 

Network Formation (approved by Ashish Goel)

Quizzee: Christie

-- Venkatesh Bala and Sanjeev Goyal. A Noncooperative Model of Network Formation. Econometrica, Vol. 68, No. 5 (Sep., 2000), pp. 1181-1229. http://www.jstor.org/stable/2999447

-- M. Jackson and Brian W. Rogers. "Search and the strategic formation of large networks: when and why do we see power laws and small worlds." Proceedings of the Second Workshop on the Economics of Peer-to-Peer Systems, Cambridge, MA, 2004.

-- Jackson, Matthew O., and Brian W. Rogers. "Meeting strangers and friends of friends: How random are social networks?" The American Economic Review (2007): 890-915.

--Dandekar, Pranav, et al. "Strategic formation of credit networks." Proceedings of the 21st international conference on World Wide Web. ACM, 2012.

Regularization (Approved by Mohsen Bayati)

Quizzer: Mohsen Bayati

Quizzees: Chenguang

-- Review: Trevor Hastie, Robert Tibshirani, Jerome Friedman. Chapter 3 of The Elements of Statistical Learning.

-- Review 2: Friedman, J. H., Hastie, T. and Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent, Journal of Statistical Software, 33(1) (2010).

-- Review: F. Girosi, M. Jones and T. Poggio, Regularization Theory and Neural Networks Architectures, In Neural Computation, pp. 219-269, 1995

-- M. W. Mahoney and L. Orecchia, Implementing Regularization Implicitly Via Approximate Eigenvector Computation, In Proc. ICML 2011

-- P. Perry and M. Mahoney, Regularized Laplacian Estimation and Fast Eigenvector Approximation, In Proc. NIPS 2011

Social Choice Theory (Approved by Yoav)

Quizzer: Yoav Shoham

Quizzees: Ashton

-- Y. Shoham and K. Leyton-Brown, Aggregating Preferences: Social Choice, Chapter in Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations.

-- A. Altman and M. Tennenholtz,  Ranking Systems: The PageRank Axioms (EC 2005).

-- K. Arrow, A Difficulty in the Concept of Social Welfare (Journal of Political Economy 1950).

Mechanism Design (Approved by Yoav)

Quizzer: Yoav Shoham

Quizzees: Ashton

-- Y. Shoham and K. Leyton-Brown, Protocols for Strategic Agents: Mechanism Design, Chapter in Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations.

-- T. Roughgarden, Lecture Notes on Combinatorial Auctions.

Graph Layout and Network Visualization (Jeff Heer):

Survey: Herman, et al. Graph Visualization and Navigation in Information Visualization: A Survey. IEEE TVCG 2000.

Articles:

Text Visualization (Jeff Heer):

Survey: Hearst. Information Visualization for Text Analysis. Search User Interfaces (Chap. 11).

Articles:

Crowds and Human Computation (Scott):

Survey:

Articles:

Alternate:

Social Systems (Scott):

Survey: Some chapter from: Kraut & Resnick. Building Successful Online Communities: Evidence-Based Social Design. (In Press).

Articles:

Alternate:

Social Computing (Michael):

Social Systems (Michael):



InfoQual 2012

Quizzee

Topic 1

Topic 2

Topic 3

Topic 4

Topic 5

Ashton (May 31, 10:30AM-12PM) Gates 392

Decentralized search (Jure)

Power-laws (Jure)

Social Choice Theory (Yoav)

Mechanism Design (Yoav)

Link analysis (Jeff)

Bob (May 31, 2:00PM) Gates 459

Decentralized search (Jure)

Power-laws (Jure)

Sentiment analysis (Dan)

Information Extraction (Dan)

Link analysis (Jeff)

Sanjay (June 4, 3:15PM) Gates 459

Decentralized Search (Jure)

Text Viz (Jeff H)

Network Viz (Jeff H)

Crowds/Human Computation (Scott)

Social Systems (Scott)

Chenguang (June 1, 4:00-5:00PM, Gates 459)

Link analysis (Jure)

Cascading behavior (Jure)

Frequent Itemsets (Jeff)

Finding Similar Items (Jeff)

Regularization (Mohsen)

Semih (June 6, 10AM) Gates 434

Distributed Graph Computation Systems (Jennifer)

MapReduce, Pig, Hive (Jennifer)

Parallel Database Systems (Hector)

Conjunctive Query Containment (Jeff U.)

Theory of MapReduce  (Jeff U.)

Stephen (June 12, 2:30PM) Gates 434

Crowdsourcing algorithms (Hector)

Crowdsourcing systems (Jennifer)

MapReduce, Pig, Hive (Jennifer)

Peer to Peer (Hector)

Search (Hector)

InfoQual 2013

Quizzee

Topic 1

Topic 2

Topic 3

Topic 4

Topic 5

Christie (May 22 1:30PM)

Link Analysis (Jeff)

Frequent Itemsets (Jeff)

Cascading behavior in networks (Jure)

Social Choice Theory (Ashish)

Network Formation (Ashish)

Jaeho (May 24 3-4:30PM)

MapReduce, Pig, BigTable (Hector & Jennifer)

Distributed Graph Computation Systems (+GraphLab, -Pegasus) (Jennifer)

Graph Query Languages and Data Models (Jennifer)

Visual Data Analytics (Andreas)

Interactive Data Analysis Systems (Hector)

Saint (May 22 10AM)

Link Analysis (Jure)

Frequent Itemsets (Jeff)

Similar Items (Jeff)

Crowdsourcing Algorithms (Hector)

Web Search (Hector)

Manas (May 22 2:30PM)

Crowdsourcing Algorithms (Hector)

Frequent Itemsets (Jeff)

Similar Items (Jeff)

Decentralized Search in Small World Networks (Jure)

Crowdsourcing Systems (Hector)

Stephen (May 29 2:30PM)

Crowdsourcing Algorithms (Hector)

Crowdsourcing Systems (Jennifer)

Link Analysis (Jure)

Peer to Peer (Hector)

Search (Hector)

Peter (May 22, 9:00AM)

Similar Items (Jure)

Power Laws and Preferential Attachment  (Jure)

Recommendation Algorithms (Ashish)

Theory of MapReduce (Jeff)

MapReduce, Pig, Hive (Hector)

InfoQual 2014:

Quizzee, time, and room

Written exam topics

Oral topic 1

Oral topic 2

Oral topic 3

Oral topic 4

Oral topic 5

Justin (Jun 6, 2:30PM)

Social networks, Data mining

Strength of Weak Ties (Jure)

Cascading Behavior (Jure)

Crowdsourcing (Hector)

Social Computing (Michael)

Social Systems (Michael)

Vasilis (May 8, 1:30PM)

Database Systems, Network Analysis

Cascading behavior (Jure)

Decentralized Search (Jure)

Similar Items (Jeff)

Crowdsourcing Algorithms (Hector)

Peer to Peer (Hector)

Akash (May 8, 9:30AM)

Database Systems, Databases

Crowdsourcing Algorithms (Hector)

Crowdsourcing Systems (Jennifer)

Fuzzy Joins using MapReduce (Jennifer)

Decentralized Search (Jure)

Similar Items (Jure)