Detecting Election Irregularities:
Using Benford’s Law to Analyze Voting Methods and County Clerk Partisanship
Victoria Comesañas, Nick Fleder, Patrick Granahan, Jacob Jaffe, and Melanie Zook
Election Systems
December 10, 2014
INTRODUCTION AND RESEARCH STATEMENT
Our research seeks to analyze two relationships: between voting irregularities and voting method, and between voting irregularities and the partisanship of county clerks compared to the majority party of the county. We first look for voting irregularities by applying Benford’s Law to county-level election data in gubernatorial and presidential elections in Texas from 2004-2010. We then examine the frequency of irregularities across time by voting method and county clerk partisanship. We conclude by analyzing how certain voting methods are related to voting irregularities, and by examining whether the partisan affiliation of county clerks has any bearing on the occurrence or frequency of irregularities.
We attempt to explore the possibility of a relationship between county clerk partisanship and prevalence of voting irregularities, as well as between voting methods and prevalence of voting irregularities. County clerks play a key role in local election administration, and there are accusations that partisan clerks, particularly in counties where the majority belongs to a different party, may encourage or suppress turnout based on what might benefit their party (Kimball and Kropf 2006). Additionally, voting methods change from decade to decade and across counties, as election administrators continue discussing the benefits and drawbacks of paper, mechanical, and electronic voting methods. As the ultimate goal of an election is to accurately determine the will of the people, it is important to know how different voting methods affect the occurrence of voting irregularities. By examining how often election irregularities occur using different voting mechanisms, we create a benchmark against which new and current voting systems can be compared.
We begin by introducing a variety of literature which discusses election fraud, including its definition, impact, and potential detection methods. The two main methods discussed in the literature are Benford’s Law--a first-step detection method--and the more refined Klimek model. We opt for the simpler model, partly because the Klimek model focuses on detection of high-level, officially mandated fraud in less democratic countries than the United States, usually in the form of ballot-box stuffing. Next, we build on the theory in the literature to provide the foundation for our own research questions. We then refine our definitions of voting irregularities and voting methods, followed by a discussion of their relationships and our potential hypotheses. Next is a discussion of our research design and methodology. We then present the findings from our analysis, and conclude with their potential causal mechanisms and future research opportunities.
LITERATURE REVIEW
While limited work has been done to evaluate the relationship between election fraud, voting methods, and clerk partisanship, there is certainly a body of existing literature about methods for detecting election irregularities more broadly. These voting irregularities are frequently discussed in the world of elections, as the media and policymakers believe that public perception of voter fraud suppresses turnout (Ansolabehere and Persily 2008) and decreases public trust in government (Schedler 1999). Although this belief has been largely disproven in the U.S. (Ansolabehere and Persily 2008), its pervasiveness still affects policymakers’ investment in understanding how election fraud works and how it may be eliminated.
The efficacy of methods for evaluating election fraud is unsurprisingly debatable, particularly as elections are, by design, difficult processes to study, purposefully shrouded in some mystery to avoid further fraud. This lack of process transparency is what Alvarez et al. (2008) call the “black box” problem, which is further confused by the lack of clarity regarding how fraud is even defined (Hood and Gillespie 2012). Researchers face difficulties such as the black box problem in developed, relatively fraud-free countries; analysis of election fraud becomes even more difficult in places with limited election infrastructure and security (Beber et al. 2012).
One of the most cited methods of detecting election fraud is Benford’s Law. Frank Benford, a physicist from General Electric, found that naturally occurring numbers from a variety of sources more often “begin with the digit 1 than with the digit 9.” He also found that there is a logarithmic distribution of first digits when the numbers are composed of four or more digits (Benford 1938: 551). When applied to election results, the theory goes that election results that do not fit this expected logarithmic distribution indicate a possibility of election fraud. In order for this mechanism to work, though, there has to be an established expectation of what the digit distribution in elections without fraud should look like (Klimek et al. 2012). In essence, our research will establish a baseline with which to compare future Texas elections, using county-level data for a range of voting technologies.
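The logarithmic distribution referred to above is commonly written as P(d) = log10(1 + 1/d) for a first digit d from 1 to 9. The short Python sketch below tabulates those expected shares; the function name `benford_expected` is our own illustrative choice and does not come from Benford (1938) or from the analysis in this paper.

```python
import math


def benford_expected(digit: int) -> float:
    """Expected share of values whose first digit is `digit` (1-9),
    using the standard first-digit formula log10(1 + 1/d)."""
    if not 1 <= digit <= 9:
        raise ValueError("digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


# Tabulate the expected share for each possible first digit.
for d in range(1, 10):
    print(f"{d}: {benford_expected(d):.3f}")
```

Under this formula roughly 30.1 percent of values are expected to begin with 1 and roughly 4.6 percent with 9, and the nine shares sum to one.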
There is some debate about whether Benford’s law can be used as a legitimate first-step detector of election fraud, to which even Mebane, who heavily endorses its potential as a tool, admits “one should perhaps not expect too much from a test that has only the vote counts themselves to work with” (Mebane 2011: 1). Deckert, Myagkov, and Ordeshook (2011) critique Benford’s law as an attempt at a one-size-fits-all solution that dismisses the complexity of elections. The authors suggest that Benford’s law ignores a number of relevant parameters, including geographic factors and their relation to ideology. We recognize these potential shortcomings and understand the limitations of this method. That being said, there are limitations to all methods of detecting election irregularities, and we continue to believe this remains the strongest method--though we use it as an indication of election irregularities, and not necessarily fraud.
Ansolabehere and Stewart (2005) in their study of residual votes and technology also note that there are a number of county-level variables and attributes that explain voting results. Our paper, like the Ansolabehere and Stewart study, studies the data across time and space, allowing us to explore the question of whether technological changes in voting administration cause irregular or conspicuous election returns.
A key question in our paper is whether new technologies have unintended effects on elections. In addition to Ansolabehere and Stewart’s work, other researchers have analyzed the role that voting methods have in election outcomes. Card and Moretti (2007) address the widespread belief that electronic voting is more vulnerable than other voting methods and subsequently leads to more instances of election fraud. Their analysis of county-level data in the 2000 and 2004 presidential elections reveals no significant correlation between the use of touch-screen voting machines and detected patterns of illegal vote manipulation. Gaps in this study that our analysis will address are the lack of differentiation between non-touch-screen voting methods, the potential exogenous factors caused by comparing elections in a time-series as opposed to cross-sectional study, and the limitations of county-level instead of precinct-level data.
Instead of focusing only on a “top-of-the-ballot” presidential race, Kimball and Kropf (2008) analyze the relationship between residual votes and election technology on “down-ballot” races. The researchers find that the type of voting machinery has a different effect on residual votes in top-ballot races versus down-ballot races, which they attribute to a usability issue instead of a fraud issue. Their results indicate the importance of taking into account what races are analyzed when studying voting methods.
A more thorough model is the one presented by the Klimek et al. (2012) study of election irregularities. Researchers used a parametric test to detect statistical patterns that indicate fraud. Using this type of parametric model means the results of the statistical analysis are not skewed by intervening variables such as varying levels of election turnout and size. However, this model presents a few problems for our research: most importantly, its first-step detection of election fingerprints rests on an assumption of ballot-box stuffing in several countries. Their analysis identified election fraud in Uganda and Russia; we assume that fraud in the United States, if it exists at all, takes a much different form than it does in those countries. Future research could take the model into account if we detect irregularities in certain counties in Texas to determine whether such unnatural digit distributions were the result of fraud or other voting error.
Our other independent variable, county clerk partisanship, has not been studied as extensively as voting technology. What research has been done, however, shows that it plays a critical role in debates about effective election administration. Recent controversies and accusations of county clerk fraud (Kimball and Kropf 2006) have led to a push for nonpartisan election administration that increases the fairness and neutrality of elections, leading to the public having higher faith in election accuracy (Hasen 2005). While most suspicions of fraud remain unsubstantiated, accusations remain. As Kimball and Kropf (2006) discuss, the common wisdom is that Democratic county clerks have the incentive to implement practices that increase turnout (which tends to lead to higher Democratic turnout) while Republican county clerks implement practices that suppress turnout. It follows that non-majority county clerks would have more of an incentive to do this in support of their partisan preferences. Our analysis of election irregularities in clerk partisanship will be a strong first step in testing this theory.
THEORETICAL EXPLANATIONS
With the understanding that Benford’s law can act as a “one size fits all” solution to identifying election fraud, we intend on using it as a “first-step detector” in identifying irregularities. Benford’s Law would predict that the county vote totals in Texas in a particular election cycle follow a logarithmic first-digit distribution, with more counties having 1 as the first digit of their vote totals than any other digit (roughly 30 percent of counties, rather than an outright majority). For example, if Harris County has 1.2 million votes in the 2006 gubernatorial election, the first digit would be 1. If 1 were also the most common first digit among the other counties in Texas, with higher digits progressively less common, the resulting distribution of election data for the 2006 gubernatorial election would fall in line with Benford’s Law. With the understanding that our data may not fit Benford’s Law perfectly, our research seeks to establish a baseline distribution with which to compare future Texas elections with Benford’s Law, facilitating further research on irregularities in Texas elections.
We also seek to expand on Ansolabehere and Stewart’s findings of how technological advances and changes affect counties over time. Given their conclusion that local election administration accounts for a great deal of the variation in residual vote patterns, we sought to find data on the lowest level that we could: the county level. While we do not expect to see any irregularities compared to the Texas baseline, studies such as Kimball and Kropf (2006) and Hasen (2005) suggest that local election administration is still worth examining. Further research can take into account an even smaller unit of analysis, the precinct, but any analysis of precinct-level data would only be fruitful if researchers know where to look. Hence, this paper seeks to establish if there are any red-flag counties that require further investigation.
With the importance of local election administration in mind, we use data gathered by Ansolabehere’s students on the partisanship of the county clerk in each particular county as compared to the partisanship of the county as a whole. As noted by Kimball and Kropf (2006), it is commonly believed that Democratic and Republican county clerks have the incentive to enact practices that increase and suppress turnout, respectively. As mentioned above, our understanding is that county clerks of a different political party than the county they serve may have an added incentive to act in accordance with their partisan preferences. Consequently, any election irregularities which correlate with differences between the partisanship of the county clerk and the partisanship of the county may be an indication of a particular clerk changing election administration in support of their own preferences.
We intend to expand on Ansolabehere and Stewart’s study by utilizing our own election irregularities data instead of their residual vote data. Use of election irregularities data will provide a better idea of the different voting methods’ potential connection to intentional election fraud, as opposed to residual vote data. Election irregularities might include high levels of residual votes among many other indications (such as a strong difference between the average turnout and the mean fraction of votes for the winning party), but residual votes themselves more likely result from accidental misuse of voting equipment or misunderstanding of election procedure than intentional fraud.
KEY CONCEPTS, RELATIONSHIPS, AND TESTABLE HYPOTHESES
Our research seeks to explore any tie between voting irregularities and voting method in Texas in the time frame of 2004-2010. For the purpose of our research, “voting or election irregularities” refers to elections in which the distribution of votes is excessively unlikely, according to Benford’s Law. “Voting method” refers to the mechanism by which a person submitted their ballot, be it paper ballots, punch cards, lever voting, or a direct recording electronic voting machine (DRE).
It is important to note that identifying a relationship between election irregularities and voting methods is not enough to prove there is actual election fraud. The irregularities themselves do not explicitly connect to fraud, but if correlated with county clerk party match or voting method, may be a useful “red flag” to indicate a need to examine county results more closely. What it can reveal is a way of measuring voting technology reliability. Those technologies that correspond to the lowest rates of detected election irregularities are likely more reliable forms of voting methods for election administrators to use. Further research can build on our methodology for detecting discrepancies to further prove which voting methods are more reliable than others.
There are two layers to this analysis. Our analysis identifies election irregularities, but the focus of the research is the relationship between our two independent variables--voting method or the partisan affiliation of the county clerks in Texas--and the dependent variable, whether or not we have identified voting irregularity in a particular election. We can then determine if certain voting methods are more prone to voting irregularity and consider why that may be.
We suspect that few irregularities will be found, and that they will not be of a fraudulent nature. Thus, we hypothesize that changes in voting method will have no bearing on the frequency of irregularities, which we assume will be related to voting error. Likewise, we assume that counties with county clerks of a different party than the majority will not have higher rates of irregularity for the same reasons.
METHODOLOGY
In this analysis, we use Benford’s law (as discussed above) to evaluate election irregularities by voting technology and county clerk partisanship in Texas counties in the 2004, 2006, 2008, and 2010 elections. For the voting technology piece of this study, the Election Administration and Voting Survey is the strongest available dataset. Every two years, the Election Assistance Commission administers the EAVS, a survey of election administrators, to collect state and county-level data on how federal elections are conducted (U.S. Election Assistance Commission 2014). As the most comprehensive dataset on election administration in the country, the EAVS is the best source of data to analyze the correlation between identifiable election irregularities and other variables.
To analyze this voting technology, we looked at the results of the EAVS, which are publicly available on the EAC website. Section F7 on the EAVS refers to the number and type of voting equipment used in each particular election, as well as how the machines were used in the voting process and where the ballots for the machine type were tallied. The survey lists these types of equipment:
The second variable we examined was the partisanship of county clerks compared to the majority party in the county. To evaluate this, we used a dataset compiled by Ansolabehere’s students that lists clerks’ partisanship and majority party by county. This allowed us to easily use the same Benford’s analysis used on voting methods and apply it to the county clerk variable.
FINDINGS
We first ran a frequency distribution of highest-order digits, following Benford’s Law, on the summed votes for Republican and Democratic candidates by county; the results for the 2004 presidential election are displayed in figure 1 below. This gives us an accurate picture of overall voting by county as well as an overall picture of the votes most likely to have irregularities. We focused only on Republicans and Democrats because if someone is maliciously or accidentally causing election irregularities, it stands to reason that the irregularities would affect one of the two parties that could actually win. Our graphs show the prevalence of each digit as the leading digit along with a 25% plus/minus box around the expected Benford’s value. These boxes represent what we might reasonably expect to see as predicted by Benford’s Law.
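A minimal sketch of this kind of leading-digit tabulation is given below, under two loudly stated assumptions: the county vote totals are hypothetical numbers invented for illustration (they are not the Texas returns), and the 25% box is interpreted as a relative band of plus or minus 25 percent of each expected Benford’s value. The helper names are our own and do not describe the code actually used to produce the figures.

```python
import math
from collections import Counter


def leading_digit(n: int) -> int:
    """Return the highest-order (first) digit of a positive integer."""
    if n <= 0:
        raise ValueError("vote totals must be positive")
    return int(str(n)[0])


def digit_shares(totals):
    """Share of totals beginning with each digit 1-9."""
    counts = Counter(leading_digit(t) for t in totals)
    return {d: counts.get(d, 0) / len(totals) for d in range(1, 10)}


def within_band(observed, tolerance=0.25):
    """For each digit, report whether the observed share lies within
    +/- tolerance (relative) of the expected Benford share.
    ASSUMPTION: the band is relative to the expected value."""
    result = {}
    for d, share in observed.items():
        expected = math.log10(1 + 1 / d)
        result[d] = abs(share - expected) <= tolerance * expected
    return result


# HYPOTHETICAL two-party vote totals for ten counties (illustration only).
hypothetical_totals = [1204, 18350, 2977, 10598, 1432,
                       3310, 960, 12045, 2210, 4875]

shares = digit_shares(hypothetical_totals)
flags = within_band(shares)
for d in range(1, 10):
    print(d, f"{shares[d]:.2f}", "inside box" if flags[d] else "outside box")
```

With only ten invented counties the shares are far too coarse to say anything about irregularity; the sketch is meant only to make the construction of the digit shares and the comparison box concrete.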
The digit distribution that we see in figure 1 is markedly similar to, although not the same as, Benford’s. Four of the nine digits fall within the 25% box and the others are relatively close. The general trend of the graph also matches the general shape of Benford’s expected distribution, with the highest percentage at the digit one, the second highest at the digit two, and then a general decrease across the remaining digits. There is some difference from the expected, but this measure simply establishes a baseline for Texas as a whole, and is not such an extreme difference as to be evidence of voting irregularity. As Benford’s Law attempts to provide a universal baseline, having a Texas-specific baseline may be useful and more relevant when comparing subdivisions within Texas.
Figure 1 - Benford’s Law Analysis on 2004 Presidential Election Votes in Texas
Additionally, when analyzing this graph it is important to remember that Benford’s Law is a “one size fits all” model. The purpose of Benford’s Law is that it is universal, but that is also its weakness. Since Benford’s does not take into account the specifics of a dataset it must be put into the context of the data being studied.
Figure 1 above follows the most general form of an expected Benford’s distribution. It is difficult to identify from this any sort of voting irregularity, at least in part because of the ambiguity within Benford’s Law. That does not mean that there are no voting irregularities to be identified within Texas counties. Instead, we should look at situations in which voting irregularities might be more likely to occur. Using Benford’s Law we can then examine these situations, and attempt to detect irregularities. The following figures focus on voting methods and county clerk partisanship in the 2004-2010 gubernatorial and presidential elections.
The first situation with a potentially heightened probability of voting irregularity is in counties with different methods of voting. Previous research has shown that the number of residual votes varies by voting method (Ansolabehere and Stewart 2005), so if there is a greater chance for residual votes then there may also be a greater chance of other election irregularities. A voting system that causes residual votes may also be either inherently unreliable or more prone to irregularities.
Figure 2 - Texas Counties Colored by Voting Equipment Type
As seen in the above figure, the majority of Texas counties utilize either paper or optically scanned ballots. In fact, they are the only types of voting in Texas employed by at least 10 counties. The other five methods suffer from a small sample size problem and may not be subject to legitimate analysis at all.
Figure 4 - Benford’s Law Analysis on Texas 2004 Presidential Election Votes in counties using paper ballots
The distributions in figures 3 and 4 both follow the same general form as the Benford’s distribution: the digit one is most frequent, followed by two, and then a quick decrease. The particulars are notable here, however. The optically scanned counties have a higher share at the digit one, while the paper ballot counties show a much slower rate of decrease. In counties with paper ballots that may be counted by hand, there might be a greater possibility for irregularities. We believe, however, that a far more likely and less complex reason has to do with the average size of the counties that utilize paper ballots and optically scanned ballots.
Figure 5 (from EAVS survey)
Figure 6 (from EAVS survey)
Figures 5 and 6 show the number of counties using paper and optically scanned ballots and the frequencies of each of their sizes. The paper ballot counties tend to be smaller, most frequently with total votes in the 1,000s and somewhat less often in the 2,000s. The counties that use optically scanned ballots tend to be larger, most frequently in the 10,000s with far fewer in the 20,000s. The size difference between a county in the 1,000s and a county in the 2,000s is not particularly large, whereas the size difference between a county in the 10,000s and a county in the 20,000s is considerably larger. The smaller paper ballot counties tend to be in the sparsely populated northwest of the state, and optically scanned ballots are common nearly everywhere else, including the east towards Louisiana and the southern valley of Texas, near the border.
There are counties in Texas in which the party of the majority and the party of the county clerk, who is directly responsible for voting administration, do not match. If the county clerk in one of these counties were so motivated, it might be possible for the clerk to use their position to influence the election in favor of their party. In some extreme cases this may take the form of voting irregularities of the type noticeable under a Benford’s Law analysis.
As seen in figure 7, in many counties in Texas there is a difference between the party of the county clerk and the party of the majority. This should be more than enough to provide a sample for both the counties with the same party for the majority and the county clerk and the counties that have different parties.
There is not a noticeable difference between a Benford’s Law analysis for the counties in which the party of the majority and the party of the county clerk are different and the counties in which the party is the same. There is also not the same issue of tiny sample size that made it difficult to analyze some of the voting methods; there were 65 counties with matching political parties and 153 in which they did not match. The only difference between these graphs is a larger two’s digit for counties with the same political party and then lower subsequent digits. This difference appears to be negligible, especially after examining the distributions of the size of counties with party match and party difference. Counties with party match appear to be more evenly distributed in the size range between 2,000 and 9,000, while counties without the match have two dramatic peaks at 1,000 and 10,000.
Figure 10 (from EAVS survey)
Figure 11 (from EAVS survey)
Another case that we chose to look at for election irregularities is gubernatorial elections. These elections tended to be less regular in relation to each other. Additionally, the 2010 gubernatorial election distribution closely resembles the 2004 presidential distribution, while the 2006 gubernatorial distribution looks much less regular compared to the presidential baseline.
Figure 12 - Benford’s Law analysis from 2006 Texas Gubernatorial Election
Figure 13 - Benford’s Law analysis on 2010 Texas Gubernatorial Election
DISCUSSION AND CONCLUSION
In analyzing these findings, there are a number of possible causal mechanisms that could explain why we see different digit distributions than expected. Benford’s itself offers no proposed explanations, so our theories build off other existing literature. These proposed causal mechanisms involve variables such as ethnicity and turnout as potential explanations for digit distribution variations.
In attempting to explain why our digit distributions differ between gubernatorial and presidential elections, we look first at turnout. Turnout is smaller in midterm elections than in presidential elections, with Texas lagging behind the national average in presidential turnout by a fairly substantial amount, an average of 10.35% in the 2004 and 2008 national elections. The average difference between Texas turnout in a presidential election and in the subsequent gubernatorial election (2004 and 2006; 2008 and 2010) was 19.75% (The Texas Politics Project 2014), which accounts for enough people in some counties to shift the first digit of the total votes. For example, Galveston County, which had 105,981 voters in the 2004 presidential election, had a different leading digit in the 2006 gubernatorial election, where 75,891 voters went to the polls (Texas Secretary of State 2014). We suspect that counties like this are fairly common, and that they skewed the results in such a manner that there were two distinct distributions based on this turnout effect.
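The Galveston County example can be checked with a few lines of Python. The two voter counts are the ones quoted above; the helper function and variable names are our own, and the relative drop computed here is a change in the number of voters, which is not the same quantity as the percentage-point turnout differences cited from the Texas Politics Project.

```python
def leading_digit(n: int) -> int:
    """Return the highest-order (first) digit of a positive integer."""
    if n <= 0:
        raise ValueError("vote totals must be positive")
    return int(str(n)[0])


# Voter counts for Galveston County as quoted in the text.
galveston = {
    "2004 presidential": 105_981,
    "2006 gubernatorial": 75_891,
}

for election, voters in galveston.items():
    print(f"{election}: {voters:,} voters, leading digit {leading_digit(voters)}")

# Relative drop in the number of voters between the two elections.
relative_drop = 1 - galveston["2006 gubernatorial"] / galveston["2004 presidential"]
print(f"relative drop in voters: {relative_drop:.1%}")
```

The leading digit moves from 1 to 7 even though the county itself is unchanged, which is the turnout effect described above: a county just over a power of ten in a presidential year can fall into the high digits of the next lower order of magnitude in a midterm year.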
In attempting to explain the reason for differences in digit distribution by clerk partisanship, it is important to take into account the relationship between turnout and ethnic makeup of a county. Texas has the second-highest concentration of Hispanic voters (Motel and Patten 2012), and national data shows that Hispanics tend to vote Democratic (Thorburn 2014). Their turnout is low, though; research has found that Hispanics have about the same turnout in presidential elections as other groups with similar socioeconomic and political contexts, but they turn out at a rate of about 10 percentage points less than Anglos in midterm elections (Cassel 2002). Because of this interaction, it is possible that while a county may be majority Democratic, those Democrats are primarily Hispanic and thus turn out at lower rates for non-presidential elections. This increases both the likelihood of a non-Democratic county clerk and the likelihood that any effects of non-majority-party clerks are mitigated by Hispanic turnout.
We do understand that our research is not comprehensive, and possibly creates more questions than it answers. Further research can combine the Benford methodology with a host of other factors, including how recently new machines or voting methods have been instituted, the demographics of the county, how weather affects turnout, and so on. Importantly, we can study states with mixed modes of voting, to explore whether multiple methods of voting discourage or encourage fraud.
We conclude that using Benford’s Law in Texas elections 2004-2010, there are no significant election irregularities by voting method or county clerk partisanship. This supports our hypotheses, as we did not expect to find irregularities correlated with voting method or the partisanship of the county clerk. We have established a baseline for further research using Benford’s Law for Texas elections to identify election irregularities. Future election irregularities correlated with other variables, including those which we have used in our study, may be an indication for the need to investigate a particular election cycle or county further for potential fraud. However, it is more likely that future election irregularities can be explained by the causal mechanisms discussed in our study, and are not a result of any intent to undermine the democratic process.
REFERENCES
Alvarez, R. Michael, Thad E. Hall, and Susan D. Hyde. 2008. “Studying Election Fraud.” Pp. 1–17 in R. Michael Alvarez, Thad E. Hall, and Susan D. Hyde, eds., Election Fraud. Washington, DC: Brookings Institution Press.
Ansolabehere, Stephen and Nathaniel Persily. 2008. “Vote Fraud in the Eye of the Beholder: The Role of Public Opinion in the Challenge to Voter.” Harvard Law Review, 121(7):1737-1774.
Ansolabehere, Stephen and Charles Stewart III. 2005. “Residual Votes Attributable to Technology.” The Journal of Politics, 67(2):365-389.
Beber, Bernd, Alexandra Scacco, and R. Michael Alvarez. 2012. “What the Numbers Say: A Digit-Based Test for Election Fraud.” Political Analysis, 20(2):211-234.
Benford, Frank. 1938. “The Law of Anomalous Numbers.” Proceedings of the American Philosophical Society, 78(4):551-572.
Card, David and Enrico Moretti. 2007. “Does Voting Technology Affect Election Outcomes? Touchscreen Voting and the 2004 Presidential Election.” The Review of Economics and Statistics, 89(4):660-673.
Cassel, Carol A. 2002. “Hispanic Turnout: Estimates from Validated Voting Data.” Political Research Quarterly, 55(2):391-408.
Deckert, Joseph, Mikhail Myagkov, and Peter C. Ordeshook. 2011. “Benford’s Law and the Detection of Election Fraud.” Political Analysis, 19(3):245-268.
Hasen, Richard L. 2005. “Beyond the Margin of Litigation: Reforming U.S. Election Administration to Avoid Electoral Meltdown.” Washington & Lee Law Review, 62: 937.
"Historical Election Results." 1992 - Current ELECTION HISTORY. Texas Office of the Secretary of State, 2014. Web. <http://elections.sos.state.tx.us/elchist.exe>.
Hood, M.V. and William Gillespie. 2012. “They Just Do Not Vote Like They Used To: A Methodology to Empirically Assess Election Fraud.” Social Science Quarterly, 93(1):76-94.
Klimek, Peter, Yuri Yegorov, Rudolf Hanel, Stefan Thurner. 2012. “Statistical detection of systematic election irregularities.” Proceedings of the National Academy of Sciences USA 109: 16469-16473. (http://www.pnas.org/content/109/41/16469.short).
Kimball, David C. and Martha Kropf. 2006. “The Street-Level Bureaucrats of Elections: Selection Methods for Local Election Officials.” Review of Policy Research, 23(6): 1257-1268.
Kimball, David C. and Martha Kropf. 2008. “Voting Technology, Ballot Measures, and Residual Votes.” American Politics Research, 36(4):479-509.
Motel, Seth and Eileen Patten. 2012. “Latinos in the 2012 Election: Texas.” Pew Research Hispanic Trends Project, October 1. Retrieved December 10, 2014 (http://www.pewhispanic.org/2012/10/01/latinos-in-the-2012-election-texas/).
Schedler, Andreas. 1999. “Civil Society and Political Elections: A Culture of Distrust?” Annals of the American Academy of Political and Social Science, 565:126-141.
"The Texas Politics Project." Texas Politics. University of Texas, 2014. Web. <http://texaspolitics.utexas.edu/archive/html/vce/features/0302_01/turnout.html>.
Thorburn, Wayne. 2014. “Hispanics Won’t Turn Texas Blue.” Politico, October 5. Retrieved December 10, 2014 (http://www.politico.com/magazine/story/2014/10/why-texas-democrats-shouldnt-count-on-the-hispanic-vote-111508.html#.VIiwlTHF9Hw).
U.S. Election Assistance Commission. 2014. “Election Administration and Voting Survey.” Retrieved November 4, 2014. http://www.eac.gov/research/election_administration_and_voting_survey.aspx.