Reading Seminar in Human Computation (CMSC 898B)

Ben Bederson

Human-Computer Interaction Lab / Computer Science Department

University of Maryland

Fall 2010

1 credit

Tuesdays, 2-3pm

HCIL Conference Room - 2116 Hornbake Library, South Wing

The fields of natural language processing, computer vision, and artificial intelligence have an important characteristic in common: all seek to automate tasks that humans do naturally so that they can be done more quickly and in greater quantity. Whether it be recognizing the faces of missing children in airports, translating documents between languages, or summarizing the opinions of Iranian blogs, it would be immensely beneficial to society if these problems could be solved quickly, accurately, and cheaply at the push of a button. Unfortunately, neither automated nor manual solutions are simultaneously fast, accurate, and cheap. Automated solutions are fast and cheap, but typically have only modest quality. Paid human work is far more expensive and slower than computers for the same amount of work. Volunteer human work is typically even slower and not as good, but of course less expensive.

The scale of computational problems that our society regularly needs to solve continues to grow rapidly. The move to cloud computing makes sense since hardware for running automated algorithms can be centrally managed and rapidly provisioned on demand. Current cloud-based computing platforms such as MapReduce are fast and cost-effective for processing enormous quantities of data, but the quality is dependent on the underlying algorithms. Fully-automated solutions often yield quality that varies based on the problem domain, the particular dataset being analyzed, and how much training data is available to inform the automated algorithms.
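As a point of reference, here is a minimal single-machine sketch of the map/reduce pattern in Python, using word counting as the example task. The function names and single-process setup are illustrative assumptions; a real MapReduce deployment distributes these phases across many machines.

    from collections import defaultdict

    # Map phase: emit (key, value) pairs, here one count per word occurrence.
    def map_phase(document):
        for word in document.split():
            yield (word.lower(), 1)

    # Reduce phase: combine all the values emitted for each key.
    def reduce_phase(pairs):
        totals = defaultdict(int)
        for key, value in pairs:
            totals[key] += value
        return dict(totals)

    documents = ["the quick brown fox", "the lazy dog"]
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    print(reduce_phase(pairs))  # {'the': 2, 'quick': 1, 'brown': 1, ...}

The quality ceiling noted above comes from the functions themselves: the framework only moves data and schedules work, so the output is only as good as the map and reduce logic it runs.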

Human Computation (HComp), sometimes called collective intelligence or crowdsourcing, is the strategy of combining the strengths of computers and humans by assigning small, independent tasks to a large number of human contributors connected by the internet. In contrast to human-managed online cooperation, HComp is used to solve computational problems - things we would like a computer to solve - with a computer-managed process that incorporates human contributors. Current marketplace-based approaches to HComp, such as Amazon Mechanical Turk, effectively demonstrate the value of using humans as computational units, but are slower and less cost-effective than fully-automated solutions. While these approaches are scalable and efficient, complex issues affect the level of quality that can be achieved.
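To make the computer-managed process concrete, here is a hypothetical sketch in Python: the program decomposes work into small, independent tasks, collects redundant answers from several workers, and aggregates them by majority vote as a simple quality-control step. The ask_worker function and its simulated 80%-accurate worker are assumptions for illustration, not Mechanical Turk's actual API.

    import random
    from collections import Counter

    def ask_worker(task):
        # Placeholder for a real marketplace call (e.g., posting a task and
        # waiting for a worker's answer). Here we simulate a worker who
        # answers correctly 80% of the time (an illustrative assumption).
        correct_answer, wrong_answer = task
        return correct_answer if random.random() < 0.8 else wrong_answer

    def human_compute(tasks, redundancy=3):
        # The computer-managed loop: give each small, independent task to
        # several workers, then keep the most common answer (majority vote).
        results = {}
        for task in tasks:
            answers = [ask_worker(task) for _ in range(redundancy)]
            results[task] = Counter(answers).most_common(1)[0][0]
        return results

    tasks = [("cat", "dog"), ("stop sign", "yield sign")]
    print(human_compute(tasks, redundancy=5))

Majority voting is only the simplest aggregation scheme; several of the readings below (e.g., Ipeirotis et al. on quality management) examine the more complex quality issues this paragraph alludes to.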

In this seminar, we will look at current research and commercial activities in HComp with the goal of understanding what the issues are, what the potential is, and where there are open and interesting research problems. No prior experience is required to attend, and while some technology will be discussed, graduate students from any discipline are welcome (up to a maximum of 20 students).

Each week, we will discuss one or two papers, websites, or services from venues such as CHI and HCOMP. We will rotate responsibility for leading discussion, and we will collaboratively decide what to read after I introduce a few key ideas. Note that discussion leaders must not simply summarize the papers - we assume that all attendees have read them. Instead, the point is to critically analyze the work.

How to Read Research Papers

When you read a research paper, your goal should be to understand its contributions and limitations. Read it critically, remembering that it was written by a person who had a particular reason for writing it. They may be trying to push a particular approach or technology. Even if they are trying to be as balanced as possible, they are still likely to introduce biases. So, I have developed a short list of questions I try to keep in mind whenever I read a paper. We’ll try to ask these questions when we discuss the papers in class.


Schedule

The first few weeks are pre-planned.  The rest will be crowdsourced by us (and anyone else).  Please suggest papers and vote on papers that interest you.

1: Aug 31 (Ben Bederson): General introduction to each other and the topic

2: Sept 7 (Ben Bederson)

Thomas W. Malone, Robert Laubacher, Chrysanthos Dellarocas. Harnessing Crowds: Mapping the Genome of Collective Intelligence. CCI Working Paper 2009-001.

Quinn, A. and Bederson, B. (October 2009). A Taxonomy of Distributed Human Computation. Technical Report HCIL-2009-23, University of Maryland.

3: Sept 14: NO CLASS

4: Sept 21: (Ben Bederson)

von Ahn, L. and Dabbish, L. 2008. Designing games with a purpose. Commun. ACM 51, 8 (Aug. 2008), 58-67.

Play some games at www.gwap.com

Sign up and do some HITs at: Amazon Mechanical Turk

Look at these services: CrowdFlower, CloudCrowd

5: Sept 28: (Alex Quinn, Ben won’t be here)

Kittur, A., Chi, E. H., and Suh, B. 2008. Crowdsourcing user studies with Mechanical Turk. CHI 2008. http://doi.acm.org/10.1145/1357054.1357127

6: Oct 5: (Chang Hu)

Ross, J., Irani, L., Silberman, M. S., Zaldivar, A., and Tomlinson, B. 2010. Who are the crowdworkers?: shifting demographics in mechanical turk. CHI 2010. http://doi.acm.org/10.1145/1753846.1753873

What is the current state of AMT in different countries? In which countries can workers get paid in real currency?

What is the state of virtual currency?

7: Oct 12:

Little, G., Chilton, L., Goldman, M., and Miller, R. (Echo)

“Exploring Iterative and Parallel Human Computation Processes”. HComp 2010

http://glittle.org/Papers/HCOMP2010.pdf 

Bernstein, M., et al. (Ran)

“Soylent: a word processor with a crowd inside”. UIST 2010

http://doi.acm.org/10.1145/1866029.1866078

8: Oct 19:

Bigham, J. P., et al. (Tom)

VizWiz: nearly real-time answers to visual questions

UIST 2010

http://portal.acm.org/citation.cfm?doid=1866029.1866080

9: Oct 26: (Clay)

Munro, R., et al. (Clay)

Crowdsourcing and language studies: the new generation of linguistic data.

NAACL 2010 AMT workshop

http://www.stanford.edu/~cgpotts/papers/munro-etal-mturk2010.pdf 

10: Nov 2: (Tan)

Mason, W., and Watts, D. J.

Financial incentives and the "performance of crowds"

HComp 2009

http://doi.acm.org/10.1145/1600150.1600175

11: Nov 9: (Shun)

Downs, J. S., et al.

Are your participants gaming the system?: screening mechanical turk workers

CHI 2010

http://portal.acm.org/citation.cfm?doid=1753326.1753688 

12: Nov 16: (Hsueh-Chien)

Little, G., Chilton, L. B., Goldman, M., and Miller, R. C.

TurKit: Human Computation Algorithms on Mechanical Turk

UIST 2010

http://portal.acm.org/citation.cfm?doid=1866029.1866040 

13: Nov 23: (Kotaro)

Ipeirotis, P., Provost, F., and Wang, J.

Quality Management on Amazon Mechanical Turk

HComp 2010

http://pages.stern.nyu.edu/~panos/publications/hcomp2010.pdf

14: Nov 30: (Leslie)

Heer, J., and Bostock, M.

Crowdsourcing Graphical Perception: Using Mechanical Turk to Assess Visualization Design

CHI 2010

http://portal.acm.org/citation.cfm?doid=1753326.1753357 

15: Dec 7: