1 of 38

Can Voters Detect Malicious Manipulation of Ballot Marking Devices?

Matthew Bernhard, Allison McDonald, Henry Meng, Jensen Hwa,

Nakul Bajaj, Kevin Chang, and J. Alex Halderman

2 of 38

What’s a ballot marking device (BMD)?


3 of 38

Where are BMDs used?

[Map: States using BMDs in 2020. Source: Verified Voting]

4 of 38

BMD Security Assumptions

Select → Print → Review → Scan → Audit

5 of 38

BMD Security Assumptions

Attacking the Scanner


6 of 38

BMD Security Assumptions

Attacking the Scanner


Post-election audit will catch malicious scanner!

7 of 38

BMD Security Assumptions

Attacking the BMD


8 of 38

BMD Security Assumptions

Attacking the BMD


Post-election audit will confirm wrong result!

9 of 38

BMD Security Assumptions

Attacking the BMD


Voter inspection will catch malicious BMD/printer!

10 of 38

BMD Security Assumptions

Attacking the BMD


Voter inspection will catch malicious BMD/printer?

11 of 38

BMDs in Context

Stark speculated that voters are unlikely to check their ballots on their own [1], and prior work on other types of voting equipment supports this [2,3].

Prior work on users' responses to warnings in other settings, such as phishing [4] and certificate warnings [5], suggests that well-designed warnings may help voters who would not otherwise check their BMD ballots.


[1] Stark, Philip B. "There is no reliable way to detect hacked ballot-marking devices." arXiv preprint arXiv:1908.08144 (2019).

[2] Acemyan, Claudia Ziegler, Philip Kortum, and David Payne. "Do voters really fail to detect changes to their ballots? An investigation of ballot type on voter error detection." In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol. 57, no. 1, pp. 1405-1409. SAGE Publications, 2013.

[3] Selker, Ted, Elizabeth Rosenzweig, and Anna Pandolfo. "A methodology for testing voting systems." Journal of Usability Studies 2, no. 1 (2006): 7-21.

[4] Egelman, Serge, Lorrie Faith Cranor, and Jason Hong. "You've been warned: an empirical study of the effectiveness of web browser phishing warnings." In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 1065-1074. 2008.

[5] Akhawe, Devdatta, and Adrienne Porter Felt. "Alice in warningland: A large-scale field study of browser security warning effectiveness." In 22nd USENIX Security Symposium (USENIX Security 13), pp. 257-272. 2013.

12 of 38

Research Questions

Q1: Can voters detect errors introduced to the paper ballot?

Q2: Are there interventions that improve the rate at which voters detect them?

13 of 38

Other research questions

  • Does ballot style matter?
  • Does candidate position on the ballot matter?
  • Does type of manipulation matter?
  • Are voters sensitive to the content of the warnings?


14 of 38

Study Design

Worked with AADL (the Ann Arbor District Library) to set up a mock polling place environment


15 of 38

Study Design

Built a custom BMD out of older voting machines


16 of 38

Study Design

Used a truncated version of the Ann Arbor 2018 midterm ballot


17 of 38

Study Design

On every ballot, one of the participants’ choices was printed incorrectly in a randomly selected race


E.g., a participant's vote for Candidate Alice is printed as a vote for Candidate Bob
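This manipulation can be sketched as follows; the function name, data shapes, and example candidates are hypothetical illustrations, not the study's actual BMD software:

```python
import random

def manipulate_ballot(selections, candidates_by_race, seed=None):
    """Print one of the voter's choices incorrectly, as in the study design.

    selections: dict mapping race -> the candidate the voter chose.
    candidates_by_race: dict mapping race -> all candidates in that race
    (each race is assumed to have at least two candidates).
    A single race is picked at random and the voter's choice there is
    replaced with a different candidate; all other races print correctly.
    """
    rng = random.Random(seed)
    printed = dict(selections)
    race = rng.choice(list(selections))
    alternatives = [c for c in candidates_by_race[race] if c != selections[race]]
    printed[race] = rng.choice(alternatives)
    return printed

# Illustrative ballot: a vote for Alice may print as a vote for Bob.
votes = {"Governor": "Alice", "Senator": "Carol"}
slate = {"Governor": ["Alice", "Bob"], "Senator": ["Carol", "Dave"]}
flipped = manipulate_ballot(votes, slate, seed=1)
```

Because exactly one race is altered per ballot, detection rates can be attributed to a single, randomly placed discrepancy.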

18 of 38

Study Design

Ran a battery of 9 experiments to evaluate what impacts detection:


19 of 38

Study Design

Ran a battery of 9 experiments to evaluate what impacts detection:

  • Two variations on ballot style


20 of 38

Study Design

Ran a battery of 9 experiments to evaluate what impacts detection:

  • Two variations on ballot style
  • Deselecting a candidate rather than choosing a different candidate


21 of 38

Study Design

Ran a battery of 9 experiments to evaluate what impacts detection:

  • Two variations on ballot style
  • Deselecting a candidate rather than choosing a different candidate
  • Three sets of poll worker instructions


22 of 38

Study Design

Ran a battery of 9 experiments to evaluate what impacts detection:

  • Two variations on ballot style
  • Deselecting a candidate rather than choosing a different candidate
  • Three sets of poll worker instructions
  • Two instruction scripts, with participants asked to vote a provided, randomized slate of candidates


23 of 38

Study Design

Ran a battery of 9 experiments to evaluate what impacts detection:

  • Two variations on ballot style
  • Deselecting a candidate rather than choosing a different candidate
  • Three sets of poll worker instructions
  • Two instruction scripts, with participants asked to vote a provided, randomized slate of candidates
  • Posting a sign


24 of 38

Participant Demographics

Recruited 241 participants from in and around Ann Arbor


25 of 38

Study Design

Observed whether participants reviewed their ballots and reported discrepancies


26 of 38

Quantitative Results: Non-intervention experiments

Intervention    | Participants | Observed Reviewing | Reported Problem
Regular Ballots | 31           | 42%                | 7%
Summary Ballots | 31           | 32%                | 7%
Deselection     | 29           | 45%                | 7%
Subtotal/Mean   | 91           | 40%                | 7%

27 of 38

Quantitative Results: Ineffective intervention experiments

Intervention | Participants | Observed Reviewing | Reported Problem
Signage | 30 | 13% | 7%
Poll-worker at check-in: "Please remember to check your ballot carefully..." | 30 | 47% | 7%
Baseline (no intervention) | 91 | 40% | 7%

28 of 38

Quantitative Results: Effective script experiments

Intervention | Participants | Observed Reviewing | Reported Problem
Poll-worker at scanner: "Please keep in mind that the paper ballot is the official record of your vote." | 25 | 92% | 16%
Poll-worker at scanner: "Have you carefully reviewed each selection on your printed ballot?" | 31 | 39% | 13%
Baseline (no intervention) | 91 | 40% | 7%

29 of 38

Quantitative Results: Effective slate experiments

Intervention | Participants | Observed Reviewing | Reported Problem
Slate + poll-worker at scanner: "...the paper ballot is the official record..." | 13 | 100% | 39%
Slate + poll-worker at scanner: "Have you carefully reviewed...?" | 21 | 95% | 86%
Baseline (no intervention) | 91 | 40% | 7%

30 of 38

Other Findings

  • Voters will find and report errors if they review their ballots
    • p < 0.01 in a two-sample permutation test
    • Echoed in recent work by Kortum et al. [1]
  • Whether participants found errors correlated with the position of the error
    • i.e., the higher on the ballot the altered race was, the more likely a participant was to find it
    • Pearson's r of -0.64
  • Voters may be sensitive to the particular language of warnings
    • Informing voters of the correct behavior appeared to be important, echoing work in other settings like [2]
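The two-sample permutation test behind the p < 0.01 figure can be sketched as below. The function is a generic textbook implementation, and the 0/1 outcome vectors are illustrative stand-ins, not the study's raw data:

```python
import random

def permutation_test(group_a, group_b, n_iter=10_000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Repeatedly shuffles the pooled 0/1 outcomes, re-splits them into
    groups of the original sizes, and returns the fraction of shuffles
    whose absolute mean difference is at least the observed one.
    """
    rng = random.Random(seed)

    def mean(xs):
        return sum(xs) / len(xs)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        if abs(mean(pooled[:n_a]) - mean(pooled[n_a:])) >= observed:
            hits += 1
    return hits / n_iter

# Illustrative outcomes (1 = reported the injected error):
reviewers = [1] * 10 + [0] * 30      # participants observed reviewing
non_reviewers = [1] * 1 + [0] * 50   # participants who did not review
p = permutation_test(reviewers, non_reviewers)
```

With a gap this large, shuffled label assignments almost never reproduce the observed difference, so p comes out small; identical groups would give p = 1.0.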


[1] Kortum, Philip, Michael D. Byrne, and Julie Whitmore. "Voter Verification of BMD Ballots Is a Two-Part Question: Can They? Mostly, They Can. Do They? Mostly, They Don't." arXiv preprint arXiv:2003.04997 (2020).

[2] Akhawe, Devdatta, and Adrienne Porter Felt. "Alice in warningland: A large-scale field study of browser security warning effectiveness." In 22nd USENIX Security Symposium (USENIX Security 13), pp. 257-272. 2013.

31 of 38

Research Questions

  • Can voters detect errors introduced to the paper ballot? Yes
  • Are there interventions that can improve the rate at which people detect? Yes
  • Does ballot style matter? No
  • Does candidate position matter? Yes
  • Does type of manipulation matter? No
  • Are voters sensitive to the content of the warnings? Probably


32 of 38

Limitations

  • It is hard to simulate a realistic election environment
    • Participants still took it seriously, with some complaining about our randomized slate


33 of 38

Limitations

  • It is hard to simulate a realistic election environment
    • Participants still took it seriously, with some complaining about our randomized slate

“I noticed that there was a [Republican] selected and I'd almost never vote a republican”


34 of 38

Limitations

  • It is hard to simulate a realistic election environment
    • Participants still took it seriously, with some complaining about our randomized slate

  • Our sample is not generalizable
    • However, forthcoming work with a more diverse sample corroborates our findings [1]
  • Findings need further replication


[1] Kortum, Philip, Michael D. Byrne, and Julie Whitmore. "Voter Verification of BMD Ballots Is a Two-Part Question: Can They? Mostly, They Can. Do They? Mostly, They Don't." arXiv preprint arXiv:2003.04997 (2020).

35 of 38

Takeaways

  • Voters can review their printed BMD ballots, but mostly don't.
  • Voters' decision to review can be influenced by in-precinct interventions.
  • The jury is still out on BMD security, but issues with voter verification likely apply to other election domains too.


36 of 38

Can Voters Detect Malicious Manipulation of Ballot Marking Devices?

Matt Bernhard

matber@umich.edu

@umbernhard

37 of 38

BMDs in Context

Voter-verified paper is a human-in-the-loop setting, where users need to understand and act on risk: here, the risk that the BMD may print the wrong selections.

Cranor lays out five means of communicating risk in a system [1]; however, only warnings apply in an election context.

Because voters are novices, warnings need to be contextual and provide an obvious recourse [2].


[1] Cranor, Lorrie F. "A framework for reasoning about the human in the loop." (2008).

[2] Bravo-Lillo, Cristian, Lorrie Faith Cranor, Julie Downs, and Saranga Komanduri. "Bridging the gap in computer security warnings: A mental model approach." IEEE Security & Privacy 9, no. 2 (2010): 18-26.

38 of 38

Experiment Summaries

Intervention | Specific Conditions | Participants | Observed Reviewing | Reported Problem
None | Ballot styles, deselection | 91 | 40% | 7%
Ineffective interventions | Signage; "paper ballot is official record…" before voting | 60 | 30% | 7%
Asked to review after voting | "Paper ballot is official record," "did you carefully review…" | 56 | 63% | 14%
Asked to review after voting, with slate | Two script variants above with random slate of candidates | 34 | 97% | 62%