1 of 64

Surveys in the Digital Age

SICSS Day 4

Adapted in part from lectures by Matt Salganik

2 of 64

Announcements

  • Last day to sign up for lightning talks
  • Daily logistics

3 of 64

The Plan for Today

  • A (brief) introduction to digital surveys & tutorial of tools
  • Group work: develop a survey & launch it
  • Research talk & lunch with Dror Walter
  • Beyond traditional surveys
  • Group work: data analysis & visualization creation

4 of 64

Part 1

5 of 64

Are Surveys Really Computational Social Science?

6 of 64

Anything cool = computational social science

Surveys = cool

Surveys = computational social science

8 of 64

Surveys as a (partial) solution for the limitations of big data

  • Limitations of big data
  • Internal states vs. external states
  • Inaccessibility of big data

9 of 64

But we are going to change the way we do surveys

15 of 64

Good!

Bad!

16 of 64

But is that delineation really that stark?

17 of 64

But is that delineation really that stark?

  • HUGE non-response rates (often >90%)
  • Systematic (non-)missingness
  • How feasible are probability samples, actually?

18 of 64

Good!

Bad!

It’s Complicated

19 of 64

Probability sample

  • Every unit in the frame population has a non-zero probability of inclusion & we know what that probability is
  • Not ALWAYS a mini-version of the population (e.g., deliberate oversampling of subgroups)
  • Using various weights, you can recover unbiased estimates of the frame population
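
The weighting idea in the last bullet can be sketched in a few lines. A minimal, hypothetical example of inverse-probability (Horvitz-Thompson style) weighting; all numbers are invented:

```python
# Design weights: each respondent counts as 1/p units, where p is their
# known inclusion probability. Weighted averages then estimate the frame
# population even when some groups were deliberately oversampled.
respondents = [
    {"income": 40_000, "p_inclusion": 0.01},  # sampled at 1-in-100
    {"income": 90_000, "p_inclusion": 0.05},  # oversampled group, 1-in-20
    {"income": 55_000, "p_inclusion": 0.01},
]

weights = [1 / r["p_inclusion"] for r in respondents]
weighted_mean = sum(w * r["income"] for w, r in zip(weights, respondents)) / sum(weights)
print(round(weighted_mean, 2))  # 51363.64, not the raw mean of 61666.67
```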

22 of 64

Not that different from non-probability sampling?

24 of 64

Digital Survey Platforms

  • Digital platforms built specifically for research and other digital tasks, where large numbers of workers sign up to take surveys
    • Fill up fast
    • You CAN pay very little
    • Data quality can be more mixed

25 of 64

MTurk

27 of 64

Prolific

  • https://www.prolific.co/

36 of 64

Download on-board demographics

  • Includes: age, time taken, number of lifetime approvals, number of rejections, their Prolific score, country of birth, current country of residence, employment status, first language, nationality, sex, student status...where available!

38 of 64

Qualtrics

  • https://technology.gsu.edu/technology-services/productivity-collaboration/qualtrics/

44 of 64

Specific Notes on Digital Question Creation

  • Professional survey takers
    • FAST
    • Less likely to find questions invasive – perhaps increasing your ethical obligations (consent)

  • Intentionally preserving data quality
    • Botting, opening the survey and submitting it empty
    • Attention checks
    • Timing

  • Consideration of how digital survey workers differ from the general population
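
The quality checks above (empty submissions, attention checks, timing) are often implemented as a post-hoc screen. A minimal sketch, with field names and thresholds that are assumptions, not platform defaults:

```python
# Hypothetical data-quality screen: drop submissions that fail an attention
# check or that finish implausibly fast.
MIN_SECONDS = 60  # assumed minimum plausible completion time

responses = [
    {"id": "r1", "seconds": 312, "attention_check": "agree"},
    {"id": "r2", "seconds": 18,  "attention_check": "agree"},     # too fast
    {"id": "r3", "seconds": 240, "attention_check": "disagree"},  # failed check
]

def passes_screen(r):
    return r["seconds"] >= MIN_SECONDS and r["attention_check"] == "agree"

kept = [r["id"] for r in responses if passes_screen(r)]
print(kept)  # ['r1']
```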

45 of 64

Part 2

46 of 64

Best Practices in Question Creation & Digital Implementation

47 of 64

General Notes on Question Creation

  • Question wording matters
    • Measure what you think you’re measuring

  • Ex: Measuring fear of crime
    • How safe do you feel….
    • How worried are you about…

48 of 64

General Notes on Question Creation

  • Question wording matters
    • Measure what you think you’re measuring
    • Think about how decisions about aggregation will affect your analysis

  • Providing many response categories can be very inclusive…which is great!
  • But what if you end up dropping everyone who gave that answer from your regressions due to ‘lack of power’?

49 of 64

General Notes on Question Creation

  • Question wording matters
    • Measure what you think you’re measuring
    • Think about how decisions about aggregation will affect your analysis
    • There are some questions people don’t like to answer

  • Income, criminal activity etc.

50 of 64

General Notes on Question Creation

  • Question wording matters
    • Measure what you think you’re measuring
    • Think about how decisions about aggregation will affect your analysis
    • There are some questions people don’t like to answer
    • There are also some questions where people are less likely to answer accurately

  • Social desirability (ex: voting)
  • Recall

51 of 64

General Notes on Question Creation

  • Question wording matters
    • Measure what you think you’re measuring
    • Think about how decisions about aggregation will affect your analysis
    • There are some questions people don’t like to answer
    • There are also some questions where people are less likely to answer honestly

  • Open-ended questions = disaster?
  • Creating a very clean document (using branching logic or skip logic) can be extremely helpful
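
Branching or skip logic is usually configured inside the survey platform, but the underlying routing is simple. A hypothetical sketch (question names invented):

```python
# Skip logic: route respondents past questions that don't apply to them.
def next_question(answers):
    # Only ask the follow-up if the respondent watches horror movies at all.
    if answers.get("watches_horror") == "no":
        return "q3_demographics"       # skip the follow-up entirely
    return "q2_horror_frequency"

print(next_question({"watches_horror": "no"}))   # q3_demographics
print(next_question({"watches_horror": "yes"}))  # q2_horror_frequency
```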

52 of 64

Are Digital Survey Platforms…good?

  • The Good
    • A number of field experiments have been successfully replicated on MTurk
    • Better than college students?

  • The Bad
    • Super-Turkers vs. Spammers
      • Add noise in opposing directions
    • Distracted
    • Differ substantially from gen pop
      • Over-educated, under-employed, more liberal (Berinsky et al. 2012; Paolacci and Chandler 2014; Paolacci et al. 2010)

53 of 64

How Do We Solve This Problem?

  • Front-end
    • Smarter survey design & sampling
  • Back-end
    • Post-estimation weighting

54 of 64

Post-Estimation Weighting

  • Non-response adjustment
  • Post-stratification
  • Calibration

55 of 64

Post-Estimation Weighting

  • Non-response adjustment
    • Achieve alignment between responding sample and original sample
      • Break sample into cells and define a non-response adjustment factor
      • Model response propensity
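
The cell-based adjustment factor can be computed directly (counts below are hypothetical):

```python
# Non-response adjustment: within each cell, respondents are weighted up by
# (invited units) / (responding units), so they stand in for the
# non-respondents in their own cell.
invited   = {"18-34": 200, "35-64": 300, "65+": 100}
responded = {"18-34": 40,  "35-64": 150, "65+": 80}

adjustment = {cell: invited[cell] / responded[cell] for cell in invited}
print(adjustment)  # {'18-34': 5.0, '35-64': 2.0, '65+': 1.25}
```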

56 of 64

Post-Estimation Weighting

  • Non-response adjustment
  • Post-stratification
    • Weights adjusted so the weighted totals within mutually exclusive cells equal known population totals
    • Need to know the characteristics of every unit in the sampling frame before sampling
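
A minimal post-stratification sketch, with invented counts:

```python
# Post-stratification: scale weights so weighted totals in mutually
# exclusive cells match known population totals.
population = {"men": 490, "women": 510}  # known population totals
sample     = {"men": 70,  "women": 30}   # respondents per cell

ps_weight = {cell: population[cell] / sample[cell] for cell in population}
print(ps_weight)  # {'men': 7.0, 'women': 17.0}
# Weighted totals now reproduce the population: 70 * 7 = 490, 30 * 17 = 510.
```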

57 of 64

Post-Estimation Weighting

  • Non-response adjustment
  • Post-stratification
  • Calibration
    • Weights made to agree with known totals for each margin
      • Ex: the weighted totals in the groups defined by race equal the known population totals; the weighted totals in the groups defined by gender equal the known population totals; etc.
      • Not guaranteed for interactions (ex: race by age), but these are usually close

59 of 64

Types of Digital-Era Surveys

  • Questions about individual behaviors, opinions, and experiences
  • Survey participants as coders
  • Vignette studies
  • Survey experiments

60 of 64

Behaviors, opinions, and experiences

  • Census
  • General Social Survey
  • Kat’s incredible horror movie survey

61 of 64

Survey participants as coders

  • Surveys as creative data collection methodology

  • Title IX policy legibility study (Albrecht and Nielsen)
    • Hypothesis: Title IX Policies are not comprehensible to the people they are meant to protect
    • Methodology: Survey project asking participants to code a policy and answer questions on their experience doing so
    • Analysis: Construct inter-rater reliability scores across all coders to see if they could agree on the contents of Title IX policies
    • Spoiler: they don’t
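
One simple version of an inter-rater agreement score (not necessarily the metric the study used) is pairwise percent agreement among all coders of the same item. The codes below are invented:

```python
from itertools import combinations

# Five survey respondents each coded the same yes/no question about a policy.
codes = ["yes", "no", "no", "yes", "no"]

pairs = list(combinations(codes, 2))  # all 10 coder pairs
agreement = sum(a == b for a, b in pairs) / len(pairs)
print(agreement)  # 0.4: low agreement suggests the policy is hard to read
```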

63 of 64

Vignette Studies

  • Showing participants vignettes (often as text or video) and asking them to make decisions based on that information

  • Morality and Vehicular Manslaughter
    • 3 vignettes to represent news stories about a hypothetical crime
    • Participants make judgements about deserved levels of punishment based on these carefully constructed vignettes

64 of 64

Survey Experiments

  • Literally an experiment embedded within a survey

  • Estimating the effect of fear rhetoric on approval of a new law
    • Matrix of conditions
    • Introduction of an experimental stimulus
    • Random assignment
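
The pieces above can be sketched together. This is a minimal illustration with an invented 2x2 condition matrix (fear vs. neutral rhetoric crossed with two framings), not the actual study design:

```python
import random

# Condition matrix: every combination of rhetoric and framing is one cell.
conditions = [(rhetoric, framing)
              for rhetoric in ("fear rhetoric", "neutral rhetoric")
              for framing in ("new law", "status quo")]

random.seed(42)  # fixed seed so the assignment is reproducible

participants = [f"p{i}" for i in range(8)]
assignment = {p: random.choice(conditions) for p in participants}
for p, cond in assignment.items():
    print(p, cond)
```

In practice you would often block-randomize instead (shuffle a balanced list of conditions) so every cell of the matrix receives the same number of participants.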