1 of 75

Human-Computer Interaction

saadh.info/hci

Week 9 (Thursday): Scientific Foundations of Quantitative HCI Research

1

2 of 75

Attendance and Agenda

  1. Scientific Foundations

2

3 of 75

Announcements

  • Please submit the teammate evaluation form. You will not see your grade on Canvas until we receive your evaluations for all the team members. Positive feedback is also welcome!
    • Based on the forms, I will reach out to some students to set up 15 minute meetings next week
  • Assignment 2 is extended until next Monday. The assignment should take less than an hour to complete. Please, no late submissions! I will apply the course policy of a one-point deduction per day.
  • Please go back and check if you finished all quizzes. There are some missing.
  • Only one missing assignment 1. Please submit it.
  • Please submit test 1 regrade requests by tomorrow.

3

4 of 75

Experimental Research in HCI

4

Scientific Foundation

Experiment Design

Hypothesis Testing

Demo and Assignment 3

5 of 75

What is Research?

  • “Research” means different things to different people
  • Often just a word adding weight to an assertion (“Our research shows that…”)
  • ISP television ad:¹
    “Independent research proves our Internet service is the fastest and most reliable – period.”
  • Hmm... Is this research available for public scrutiny?
  • What about the independence of the research?
  • Research has at least three definitions (next slide)

¹ Aired in Ontario, winter of 2008/2009 (ad for Rogers Communications Inc.).

6 of 75

Research – Definition #1

  • Research is…
    Careful or diligent search
  • Examples
    • Searching one’s garden for weeds
    • Searching a computer to find all files modified on a certain date

7 of 75

Research – Definition #2

  • Research is…
    Collecting information about a particular subject
  • Examples
    • Survey voters to collect information on political opinions in advance of an election
    • Observe people using printers and collect information, such as the number of times they
      • Consulted the manual
      • Clicked the wrong button
      • Retried an operation
      • Uttered an expletive

8 of 75

Research – Definition #3

  • Research is…
    Investigation or experimentation aimed at the discovery and interpretation of facts, the revision of accepted theories or laws in light of new facts.
  • What does that mean for HCI researchers?
    • Design and conduct a user study to test whether a new interaction technique improves on an existing interaction technique

9 of 75

Experimentation

  • A central activity in Quantitative HCI research
  • An experiment is sometimes called a user study
  • Formal, standardized methodology preferred
    • Brings consistency to a body of work
    • Facilitates reviews and comparisons between different user studies

9

10 of 75

Facts, Theories, Laws

  • Facts
    • Building blocks of evidence
    • Evidence is tested to confirm hypotheses (more later)
  • Theory
    • A hypothesis assumed for the sake of argument
    • A scientifically accepted body of principles that explain phenomena
  • Law
    • More constraining, more formal, more binding
    • A relationship that is invariable under given conditions
    • HCI involves humans, so laws are of questionable value

10

11 of 75

11

Let’s look at some other characteristics of research that are not encompassed in the definitions…

12 of 75

Research Must Be Published

  • Publication is the final step
  • Also an essential step
  • Publish or perish!
    • Edict for researchers in all fields, and particularly in academia
  • Until it is published, research cannot achieve its critical goal:
    • Extend, refine, or revise the existing body of knowledge in the field

12

13 of 75

Peer Review

  • Research submitted for publication is reviewed by peers – other researchers doing similar research
  • Only research meeting a high standard of scrutiny is accepted for publication
    • Are the results novel and useful?
    • Does the evidence support the conclusions?
    • Does the methodology meet the expected standards for the field?
  • Accepted research is published and archived
  • The final step is complete

13

14 of 75

Peer Review

14

15 of 75

Patents

  • Some research develops into bona fide inventions
  • A researcher/company may wish to maintain ownership of (profit from) the invention
  • Patenting is an option
  • The patent application describes
    • Previous related work
    • How the invention addresses a need
    • The best mode of implementation
  • If the application is granted, the patent is issued
  • Note: A patent is a publication; thus patenting meets the must-publish criterion for research

15

16 of 75

Citations, References, Impact

  • Citations, like hyperlinks, connect research to other research
  • Through citations, a body of research takes shape
  • The number of citations to a research paper is an indication of the paper’s impact
  • Can you spot the high-impact paper below? (arrows are citations)
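A minimal sketch of how citation counts fall out of citation links. The paper labels and links are hypothetical, invented only for illustration; each pair reads "citing paper, cited paper", so every pair is one arrow.

```python
from collections import Counter

# Hypothetical citation links (citing paper, cited paper); not real data.
links = [
    ("B", "A"), ("C", "A"), ("D", "A"),
    ("D", "C"), ("E", "A"), ("E", "D"),
]

def citation_counts(citation_links):
    """Count incoming citations (arrows pointing at each paper)."""
    return Counter(cited for _citing, cited in citation_links)

counts = citation_counts(links)
most_cited, n_citations = counts.most_common(1)[0]
print(most_cited, n_citations)
```

The paper with the most incoming arrows is the candidate "high-impact" paper in this toy graph.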

16

17 of 75

Google Scholar – H-index¹

  • Since the arrival of Google Scholar, citation counts are easy to gather
  • Can be gathered for papers, journals, etc.
  • Can also be gathered for researchers
  • H-index is a measure of the impact of a researcher
  • If a researcher’s publications are ranked by the number of citations, the H-index is the point where the rank equals the number of citations; i.e., a researcher with H-index = n has n publications each with n or more citations
  • A respectable H-index (although debatable) in HCI is roughly the number of years since the start of one’s PhD
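The H-index definition above can be sketched in a few lines. The function name and the citation counts are assumptions made for illustration; they are not taken from Google Scholar.

```python
def h_index(citations):
    """Return the largest n such that n papers have n or more citations each."""
    ranked = sorted(citations, reverse=True)  # rank papers by citation count
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the paper at this rank still has at least `rank` citations
        else:
            break      # all later papers have fewer citations than their rank
    return h

# Hypothetical citation counts for one researcher's six papers.
print(h_index([25, 8, 5, 3, 3, 1]))  # 3
```

Here the third-ranked paper has 5 citations (at least 3), but the fourth has only 3 (fewer than 4), so the H-index is 3.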

17

1 Hirsch, J. E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences, 102, 16568-16572.

18 of 75

Research Must Be Reproducible

  • Research that cannot be replicated is useless
  • A high standard of reproducibility is essential
  • The research write-up must be sufficiently detailed to allow a skilled researcher to replicate the research if he/she desired
  • The easiest way to ensure reproducibility is to follow a standardized methodology
  • Many great advances in science pertain to methodology (e.g., Louis Pasteur’s detailed disclosure of the methodology used in his research in microbiology)
  • The most cited research paper is a “method paper”¹ (see Google Scholar for the latest citation count)

18

1 Lowry, O. H., Rosebrough, N. J., Farr, A. L., & Randall, R. J. (1951). Protein measurement with the Folin phenol reagent. Journal of Biological Chemistry, 193, 265-275.

19 of 75

Research vs. Engineering vs. Design

  • Researchers often work closely with engineers and designers, but the skills each brings are different
  • Engineers and designers are in the business of building things, bringing together the best in form (design emphasis) and function (engineering emphasis)
  • One can imagine that there is a certain tension, even trade-off, between form and function
  • Sometimes, things don’t go quite as planned 🡪

19

20 of 75

Function Trumping Form?

20

21 of 75

Form Trumping Function

  • The photo below shows part of a laptop computer
  • The form is elegant – smooth, shiny, metallic
  • The touchpad design (or is it engineering?) has a problem
  • No tactile sense at the sides of the touchpad

21

The fix

22 of 75

Duct Tape To The Rescue

22

A true story!

23 of 75

Research Milieu

  • Engineering and design are about products
  • (Quantitative) Research is not about products
    • Research is narrowly focused
    • Research questions are small in scope
    • Research is incremental, not monumental
    • Research ideas build on previous research ideas
  • Good ideas are refined, advanced (into new ideas)
  • Bad ideas are discarded, modified
  • Products come later, much later

23

24 of 75

Schematic

24

25 of 75

Example: Apple iPhone (2007)

25

iPhone Gestures:

  • Tilt
  • Multitouch
  • Flick

26 of 75

Tilt

  • Research on tilt as an interaction primitive dates at least to 1998¹

26

1 Harrison, B., Fishkin, K. P., Gujar, A., Mochon, C., & Want, R. (1998). Squeeze me, hold me, tilt me! An exploration of manipulative user interfaces. Proc CHI '98, 17-24, New York: ACM.

27 of 75

Multitouch

  • Research on multitouch as an interaction primitive dates at least to 1978¹

27

1 Herot, C. F., & Weinzapfel, G. (1978). One-point touch input of vector information for computer displays. Proceedings of SIGGRAPH 1978, 210-216, New York: ACM.

28 of 75

Flick

  • Research on flick as an interaction primitive dates at least to 1963¹

28

1 Sutherland, I. E. (1963). Sketchpad: A man-machine graphical communication system. Proceedings of the AFIPS Spring Joint Computer Conference, 329-346, New York: ACM.

29 of 75

So…

29

Timeline: 1963 (flick) 🡪 1978 (multitouch) 🡪 1998 (tilt) 🡪 2007 (iPhone)

[Schematic labels: Research; Engineering; Design; Materials & Processes; Products; time]

30 of 75

Design as Research

  • Gaver and Bowers opine on the struggle for designers to also be researchers¹ (paraphrased):

30

    • Do we need to add research questions or methodological rigour to design practice for it to count as research?
    • Do we have to change design practices to make our contributions to HCI look more like research?
    • Is the result still design, or have we lost something in the process?
    • These questions have been vexing the HCI design community – and us – for some time. The problem is that novel products alone do not seem sufficient to count as research.

Photostroller

1 Gaver, B., & Bowers, J. (2012, July/August). Annotated portfolios. interactions, 40-49.

31 of 75

On Materials and Processes

  • Gaver and Bowers also comment on the materials and processes they use in design:

31

    • It was by looking at specific examples of practice that we found guidance for our work.

[Schematic labels: Research; Engineering; Design; Materials & Processes; Products; “Examples of practice”; time]

32 of 75

Research Contributions in HCI

  1. Empirical
  2. Artifact
  3. Theoretical
  4. Methodological
  5. Dataset
  6. Survey
  7. Opinion

32

33 of 75

“Empirical” Research

  • Empirical:
    • Originating in or based on observation or experience
    • Relying on experience or observation alone without due regard for system or theory (i.e., don’t be blinded by pre-conceptions)

33

34 of 75

“Empirical” Research (2)

  • Empirical: (by another definition)
    • Capable of being verified or disproved by observation or experiment
  • HCI research
    • Framed by hypotheses
    • Methodology to test hypotheses
    • Experiments (aka user studies) are the vehicle
    • Hypotheses must be sufficiently narrow and clear to allow for verification or disproval (more on this later; see Research Questions)

34

35 of 75

Research Methods

  1. Observational method (Module 2)
  2. Experimental method
  3. Correlational method

35

36 of 75

Observational Method (Module 2)

  • Example methods:
    • Interviews, field investigations, contextual inquiries, case studies, field studies, focus groups, think aloud protocols, story telling, walkthroughs, cultural probes, etc.
  • Focus on qualitative assessments (cf. quantitative)
  • Relevance vs. precision
    • High in relevance (behaviours studied in a natural setting)
    • Low in precision (lacks control available in a laboratory)
  • Goal: discover and explain reasons underlying human behaviour (why or how, as opposed to what, where, or when)

36

37 of 75

Experimental Method

  • Aka scientific method
  • Controlled experiments conducted in lab setting
  • Relevance vs. precision
    • Low in relevance (artificial environment)
    • High in precision (extraneous behaviours easy to control)
  • At least two variables:
    • Manipulated variable (aka independent variable)
    • Response variable (aka dependent variable)
  • Cause-and-effect conclusions possible (changes in the manipulated variable caused changes in the response variable)

37

38 of 75

Correlational Method

  • Look for relationships between variables
  • Observations made, data collected
    • Example: Are users’ privacy settings while social networking related to their age, gender, level of education, employment status, income, etc.?
  • Non-experimental
    • Interviews, online surveys, questionnaires, etc.
  • Balance between relevance and precision (some quantification, observations not in lab)
  • Cause-and-effect conclusions not possible
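One common way to quantify a relationship between two numeric variables is Pearson's correlation coefficient r. The sketch below computes it by hand; the variable names and the survey data (participant age vs. number of restricted privacy settings) are hypothetical, and a strong r would still not support a cause-and-effect conclusion.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return cross / (spread_x * spread_y)

# Hypothetical survey data, invented for illustration only.
ages = [19, 23, 31, 45, 52, 60]
restricted_settings = [9, 8, 6, 5, 3, 2]
print(round(pearson_r(ages, restricted_settings), 2))
```

Values of r near +1 or -1 indicate a strong linear relationship; values near 0 indicate little or none.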

38

39 of 75

Relationships: Circumstantial & Causal

  • As noted above…
    • Correlational methods 🡪 circumstantial relationships
    • Experimental methods 🡪 causal relationships
  • Cause-and-effect conclusions not possible if the independent variable is a naturally occurring attribute of participants (e.g., gender, personality type, handedness, first language, political viewpoint)
  • These attributes are legitimate independent variables
  • But, they cannot be assigned to participants; hence causal relationships not valid
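By contrast, a manipulated variable such as the input device can be assigned to participants by the experimenter. A minimal sketch of random assignment follows; the function name, participant codes, and device levels are assumptions chosen for illustration.

```python
import random

def assign_conditions(participants, conditions, seed=None):
    """Randomly assign participants to conditions in near-equal group sizes."""
    rng = random.Random(seed)    # a seed makes the assignment repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)        # random order removes experimenter choice
    # Deal the shuffled participants out to the conditions in turn.
    return {p: conditions[i % len(conditions)] for i, p in enumerate(shuffled)}

# Hypothetical participant codes and levels of the independent variable.
participants = [f"P{i}" for i in range(1, 9)]
assignment = assign_conditions(participants, ["mouse", "touchpad"], seed=1)
for participant in participants:
    print(participant, assignment[participant])
```

No such function can exist for handedness or first language: participants arrive with those attributes already set.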

39

40 of 75

Observe and Measure

  • Foundation of empirical research
  • Observation is the starting point; observations are made…
    • By the apparatus
    • By a human observer
  • Manual observation
    • Log sheet, notebooks
    • Screen capture, photographs, videos, etc.
  • Measurement
    • With measurement, anecdotes (April showers bring May flowers) turn to empirical evidence
    • “When you cannot measure, your knowledge is of a meager and unsatisfactory kind” (Kelvin)

40

41 of 75

Classical Statistical Idea of Data

  • There are four classical types of data

41

Data Types
  • Categorical
    • Nominal
    • Ordinal
  • Numerical
    • Interval
    • Ratio

42 of 75

Scales of Measurement

42

43 of 75

Nominal Data

  • Nominal data (aka categorical data) are arbitrary codes assigned to attributes; e.g.,
    • 1 = male, 2 = female
    • 1 = mouse, 2 = touchpad, 3 = pointing stick
  • The code needn’t be a number; i.e.,
    • M = male, F = female
  • Obviously, the statistical mean cannot be computed on nominal data
  • Usually it is the count that is important
    • “Are females or males more likely to…”
    • “Do left handers or right handers have more difficulty with…”
    • Note: The count itself is a ratio-scale measurement
  • (example on next slide)

43

44 of 75

Nominal Data – HCI Example

  • Task: Observe students “on the move” on university campus
  • Code and count students by…
    • Gender (male, female)
    • Mobile phone usage (not using, using)
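A minimal sketch of the code-and-count step, assuming each observed student is logged as one (gender, phone usage) pair. The observations are hypothetical, not data from an actual campus study.

```python
from collections import Counter

# Hypothetical log: one (gender, phone usage) pair per observed student.
observations = [
    ("male", "using"), ("female", "not using"), ("female", "using"),
    ("male", "not using"), ("female", "using"), ("male", "not using"),
]

def cross_tabulate(pairs):
    """Count how often each combination of nominal codes was observed."""
    return Counter(pairs)

table = cross_tabulate(observations)
for (gender, usage), count in sorted(table.items()):
    print(f"{gender:<6} {usage:<9} {count}")
```

The codes themselves are nominal, but each count is a ratio-scale number that can be compared across categories.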

44

45 of 75

Ordinal Data

  • Ordinal data associate an order or rank to an attribute
  • The attribute is any characteristic or circumstance of interest; e.g.,
    • Users try three GPS systems for a period of time, then rank them: 1st, 2nd, 3rd choice
  • More sophisticated than nominal data
    • Comparisons of “greater than” or “less than” possible
  • (example on next slide)
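Ranks support "greater than" and "less than" comparisons, so a median rank is a reasonable summary. A minimal sketch, assuming five users each ranked three GPS systems; the system names and ranks are hypothetical.

```python
from statistics import median

# Hypothetical ranks (1 = first choice) from five users for three GPS systems.
ranks = {
    "GPS A": [1, 2, 1, 1, 3],
    "GPS B": [2, 1, 3, 2, 1],
    "GPS C": [3, 3, 2, 3, 2],
}

def median_ranks(ranks_by_system):
    """Summarize ordinal ranks with the median rather than the mean."""
    return {name: median(values) for name, values in ranks_by_system.items()}

for name, mid_rank in median_ranks(ranks).items():
    print(name, mid_rank)
```

The median depends only on the order of the ranks, not on the distance between 1st and 2nd choice, which ordinal data do not define.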

45

46 of 75

Ordinal Data – HCI Example

46

47 of 75

Interval Data

  • Equal distances between adjacent values
  • But, no absolute zero
  • Classic example: temperature (°F, °C)
  • Statistical mean possible
    • E.g., the mean midday temperature during July
  • Ratios not possible
    • Cannot say 10 °C is twice 5 °C
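A short worked example of why ratios fail on an interval scale. Converting to kelvin, which has an absolute zero, shows that 10 °C is nowhere near "twice" 5 °C; the helper function name is an assumption for illustration.

```python
def celsius_to_kelvin(celsius):
    """Convert to kelvin, a ratio scale whose zero is absolute zero."""
    return celsius + 273.15

warm_c, cool_c = 10.0, 5.0

# Differences are meaningful on an interval scale: the gap is 5 degrees either way.
difference_c = warm_c - cool_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are not: the apparent "2x" depends on where 0 °C happens to sit.
naive_ratio = warm_c / cool_c
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(difference_c, round(difference_k, 2))
print(naive_ratio, round(kelvin_ratio, 3))
```

The difference survives the change of scale; the ratio drops from 2 to roughly 1.018.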

47

48 of 75

Interval Data – HCI Example

  • Questionnaires often solicit a level of agreement to a statement
  • Responses on a Likert scale
  • Likert scale characteristics:
    1. Statement soliciting level of agreement
    2. Responses are symmetric about a neutral middle value
    3. Gradations between responses are equal (more-or-less)
  • Assuming “equal gradations”, the statistical mean is valid (and related statistical tests are possible)
  • Likert scale example 🡪 (next slide)
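A minimal sketch of summarizing Likert responses, assuming a 5-point scale coded 1 (strongly disagree) to 5 (strongly agree). The responses are hypothetical, and the mean is only meaningful under the "equal gradations" assumption stated above.

```python
from statistics import mean, median

# Hypothetical responses to one statement: 1 = strongly disagree, 5 = strongly agree.
responses = [4, 5, 3, 4, 2, 5, 4]

likert_mean = mean(responses)      # valid only if gradations are treated as equal
likert_median = median(responses)  # valid even if the data are treated as ordinal

print(round(likert_mean, 2), likert_median)
```

Reporting the median alongside the mean is a cautious choice when the equal-gradations assumption is in doubt.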

48

49 of 75

Interval Data – HCI Example (2)

49

50 of 75

Ratio Data

  • Most sophisticated of the four scales of measurement
  • Preferred scale of measurement
  • Absolute zero, therefore many calculations possible
  • Summaries and comparisons are strengthened
  • A “count” is a ratio-scale measurement
    • E.g., “time” (the number of seconds to complete a task)
  • Enhance counts by adding further ratios where possible
    • Facilitates comparisons
    • Example – a 10-word phrase was entered in 30 seconds
      • Bad: t = 30 seconds
      • Good: Entry rate = 10 / 0.5 = 20 wpm
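The entry-rate calculation above, written as a small function. The function name is an assumption; the numbers are the ones from the example (10 words in 30 seconds).

```python
def entry_rate_wpm(words, seconds):
    """Convert a word count and a task time in seconds to words per minute."""
    if seconds <= 0:
        raise ValueError("task time must be positive")
    minutes = seconds / 60
    return words / minutes

# A 10-word phrase entered in 30 seconds (0.5 minutes).
print(entry_rate_wpm(10, 30))  # 20.0
```

Expressing the result as a rate lets phrases of different lengths be compared directly, which the raw time does not.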

50

51 of 75

Ratio Data – HCI Example¹

51

[Figure annotations: -19%, +25%]

1 MacKenzie, I. S., & Isokoski, P. (2008). Fitts' throughput and the speed-accuracy tradeoff. Proc CHI 2008, 1633-1636, New York: ACM.

52 of 75

General Rules

Thanks to GraphPad

OK to compute…                                         Nominal  Ordinal  Interval  Ratio
frequency distribution                                 Yes      Yes      Yes       Yes
median and percentiles                                 No       Yes      Yes       Yes
add or subtract                                        No       No       Yes       Yes
mean, standard deviation, standard error of the mean   No       No       Yes       Yes
ratio, or coefficient of variation                     No       No       No        Yes
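The rules in this table follow one pattern: each statistic becomes permissible at some scale and stays permissible for every more sophisticated scale. A minimal sketch of that lookup follows; the names `SCALES`, `MIN_SCALE`, and `ok_to_compute` are assumptions for illustration.

```python
# Scales of measurement, ordered from least to most sophisticated.
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# Least sophisticated scale at which each statistic is OK to compute.
MIN_SCALE = {
    "frequency distribution": "nominal",
    "median and percentiles": "ordinal",
    "add or subtract": "interval",
    "mean, standard deviation, standard error of the mean": "interval",
    "ratio, or coefficient of variation": "ratio",
}

def ok_to_compute(statistic, scale):
    """Return True if the statistic is permissible for data on the given scale."""
    return SCALES.index(scale) >= SCALES.index(MIN_SCALE[statistic])

print(ok_to_compute("median and percentiles", "nominal"))   # False
print(ok_to_compute("median and percentiles", "interval"))  # True
```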

58 of 75

Research Questions

  • We conduct empirical research to answer (and raise!) questions about UI designs or interaction techniques
  • Consider the following questions:
    • Is it viable?
    • Is it better than current practice?
    • Which design alternative is best?
    • What are the performance limits?
    • What are the weaknesses?
    • Does it work well for novices?
    • How much practice is required?

58

59 of 75

Testable Research Questions

  • Preceding questions, while unquestionably relevant, are not testable
  • Try to re-cast as testable questions (even though the new question may appear less important)
  • Scenario…
    • Group 2 has invented a new touchless text entry technique, and they think it’s pretty good. In fact, they think it is better than the Qwerty keyboard (QSK). They decide to undertake a program of empirical enquiry to evaluate their invention. What is a good research question?

59

60 of 75

Research Questions (2)

  • Very weak: Is the new technique any good?
  • Weak: Is the new technique better than QSK?
  • Better: Is the new technique faster than QSK?
  • Better still: Is the measured entry speed (in words per minute) higher for the new technique than for QSK after one hour of use?
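The "better still" question is testable because it names a measurement that can be collected and compared. A minimal sketch of the first step of that comparison, using hypothetical entry speeds; a difference in means alone does not settle the question, since a statistical test is still needed.

```python
from statistics import mean

# Hypothetical entry speeds (wpm) after one hour of use, one value per participant.
speeds = {
    "new": [18.2, 21.5, 19.8, 22.1],
    "QSK": [20.4, 23.0, 21.1, 24.3],
}

def mean_difference(first, second):
    """Difference between the mean entry speeds of two techniques."""
    return mean(first) - mean(second)

difference = mean_difference(speeds["new"], speeds["QSK"])
print(f"new minus QSK: {difference:.2f} wpm")
```

The vaguer questions ("any good?", "better?") give no such number to compute, which is why they are not testable as stated.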

60

61 of 75

A Tradeoff

61

62 of 75

Internal Validity

  • Definition:
    • The extent to which the effects observed are due to the test conditions (e.g., multitap vs. new)
  • Statistically, this means…
    • Differences (in the means) are due to inherent properties of the test conditions
    • Variances are due to participant differences (“pre-dispositions”)
    • Other potential sources of variance are controlled or exist equally or randomly across the test conditions

62

63 of 75

External Validity

  • Definition:
    • The extent to which results are generalizable to other people and other situations
  • People
    • The participants are representative of the broader intended population of users
  • Situations
    • The test environment and experimental procedures are representative of real world situations where the interface or technique will be used

63

64 of 75

Test Environment Example

  • Scenario…
    • You wish to compare two input devices for remote pointing (e.g., at a projection screen)
  • External validity is improved if the test environment mimics expected usage
  • Test environment should probably…
    • Use a large display or projection screen (not a desktop monitor)
    • Position participants at a significant distance from screen (rather than close up)
    • Have participants stand (rather than sit)
    • Include an audience!
  • But… is internal validity compromised?

64

65 of 75

Experimental Procedure Example

  • Scenario…
    • You wish to compare two text entry techniques for mobile devices
  • External validity is improved if the experimental procedure mimics expected usage
  • Test procedure should probably have participants…
    • Enter personalized paragraphs of text (e.g., a paragraph about a favorite movie)
    • Edit and correct mistakes as they normally would
  • But… is internal validity compromised?

65

66 of 75

The Tradeoff

  • There is tension between internal and external validity
  • The more the test environment and experimental procedures are “relaxed” (to mimic real-world situations), the more the experiment is susceptible to uncontrolled sources of variation, such as pondering, distractions, fiddling, or secondary tasks

66

67 of 75

Comparative Evaluations

  • Preferable to do a comparative evaluation rather than a one-off
  • More insightful results obtained
  • Factorial experiments require comparison, because there must be at least one independent variable with at least two levels
  • If one condition is a baseline, comparisons are possible between studies (assuming similar methodology)

67

68 of 75

Research Topics

  • Finding a research topic is a challenge (for students… and for seasoned researchers too!)

  • Four tips:

  1. Think small
  2. Replicate
  3. Know the literature
  4. Think inside the box

68

69 of 75

Tip #1 - Think Small

  • Looking for that big idea?
  • Advice: Forget it (besides, it isn’t necessary)
  • Research questions are small, narrowly focused
  • Pursue several small, related research topics and before you know it, a dissertation topic is formed

69

70 of 75

Tip #2 - Replicate

  • Seems odd: where’s the research in simply replicating what was done before?
  • Of course, there is no research in replication, but the trick is in the path to replicating
  • Replicating prior research is a lot of work
  • Along the way, you will undoubtedly discover small and novel improvements – things to try
  • A little tweak here, a small modification there
  • BTW, you might not find a novel idea until well into the process (perhaps afterward!)

70

71 of 75

Tip #3 – Know The Literature

  • Whatever topic interests you, read the literature
  • E.g., social networking, gaming
  • If too broad, narrow (e.g., privacy settings in social networking, avatars in gaming)
  • Read papers, open a spreadsheet, tabulate variables in the methodology and the findings
  • Chaotic at first, order and shape will emerge (eventually)
  • With some luck (and further study) a research topic will emerge

71

72 of 75

Example Table¹

72

1 MacKenzie, I. S. (2009). The one-key challenge: Searching for an efficient one-key text entry method. Proc ASSETS 2009, 91-98, New York: ACM.

73 of 75

Tip #4 – Think Inside The Box

  • Think outside the box 🡪 dispense with accepted beliefs and assumptions (in the box), and think in a way that assumes nothing and challenges everything
  • No need
  • Think inside the box: just get on with your day; but at every juncture, every interaction, think and question
  • Our everyday foibles are fertile ground for research topics

73

74 of 75

Tip #5 – Avoid Over-stimulus

74

75 of 75

Attendance & Next Time

  • Experiment Design

75