Statistics and the Iranian election, ctd.

More statisticians are looking at the Iranian voting data for signs of fraud. Walter Mebane (University of Michigan) looks at district-level vote counts to check for violations of Benford's Law. Benford's Law is a characteristic of real-life numbers: the first digit is a 1 almost one-third of the time (about 30%), with higher digits appearing increasingly infrequently. (It's another example of a power-law distribution, such as we looked at with regard to city populations.) Elections that have been manipulated by hand are sometimes revealed by disaggregated poll counts violating Benford's Law.
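To make the idea concrete, here is a minimal Python sketch of a first-digit Benford check. It is not Mebane's code (his analysis was done in R, and his report should be consulted for the exact test he applies); the function names and the sample data are hypothetical and chosen only for illustration. Under Benford's Law the probability of leading digit d is log10(1 + 1/d), and a Pearson chi-square statistic measures how far the observed digit counts stray from that.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])


def benford_chi_square(counts):
    """Pearson chi-square statistic of observed first digits vs. Benford."""
    digits = Counter(first_digit(c) for c in counts if c > 0)
    total = sum(digits.values())
    return sum(
        (digits.get(d, 0) - total * p) ** 2 / (total * p)
        for d, p in benford_expected().items()
    )


# Hypothetical stand-in for vote counts: powers of 2 are a classic
# sequence whose leading digits follow Benford's Law closely.
sample = [2 ** k for k in range(1, 301)]
stat = benford_chi_square(sample)

# Nine digit categories give 8 degrees of freedom; the 5% critical
# value of the chi-square distribution is then about 15.51.
print("P(first digit = 1):", round(benford_expected()[1], 3))
print("consistent with Benford at the 5% level:", stat < 15.51)
```

With real data, `sample` would be replaced by the list of vote counts, ideally at the polling-station level rather than aggregated by district.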

Unfortunately, with only district-level data, a deviation from Benford's Law is unlikely to show up in counts this aggregated even if there were manipulation at the polling-station level, and indeed Mebane finds no such evidence in the Iranian election data.

On the other hand, fitting an overdispersed Binomial model to the data reveals nine outlier districts where Ahmadinejad received an unusually high proportion of the vote (compared to Mousavi) — whether these are reasonable depends on knowledge of the political geography of Iran.
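As a rough illustration of what an overdispersed binomial outlier check involves, the Python sketch below fits a single pooled vote share, estimates a dispersion factor from the Pearson residuals, and flags districts whose rescaled residual is unusually high. This is a deliberately simplified stand-in, not the model in Mebane's report: the function name, the median-based dispersion estimate, the threshold of 3, and the district figures are all assumptions made for the example.

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549


def overdispersed_outliers(successes, totals, threshold=3.0):
    """Flag districts with an unusually high share for one candidate.

    Uses a pooled binomial proportion and inflates the binomial variance
    by a dispersion factor phi estimated robustly from the median squared
    Pearson residual, so a few extreme districts do not mask themselves.
    """
    p = sum(successes) / sum(totals)
    pearson = [
        (y - n * p) / math.sqrt(n * p * (1 - p))
        for y, n in zip(successes, totals)
    ]
    squared = [r * r for r in pearson]
    phi = max(1.0, median(squared) / CHI2_1DF_MEDIAN)
    scaled = [r / math.sqrt(phi) for r in pearson]
    flagged = [i for i, z in enumerate(scaled) if z > threshold]
    return flagged, scaled


# Hypothetical districts: votes for one candidate out of 10,000 cast.
votes = [5000, 5200, 4800, 5100, 4900, 5300, 4700, 8000]
cast = [10000] * len(votes)

flagged, scaled = overdispersed_outliers(votes, cast)
print("flagged district indices:", flagged)
```

A plain binomial model would flag nearly every district here, because real vote shares vary far more between districts than coin-flip sampling alone would produce. Allowing for overdispersion is what lets only the last district stand out.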

Conditioning the 2009 results on the 2005 results (when a boycott led many liberal voters, presumed Mousavi supporters, to not vote) produces the kind of fit one would expect "if the political processes like those that normally prevail in election in other places were also at work in the Iranian election of 2009". Yet many of the same districts flagged in the previous analysis still appear as outliers where Ahmadinejad received significantly more support than the model predicts. Mebane concludes:

In general, combining the 2005 and 2009 data conveys the impression that a substantial core of the 2009 results reflected natural political processes. In 2009 Ahmadinejad tended to do best in towns where his support in 2005 was highest, and he tended to do worst in towns where turnout surged the most. These natural aspects of the election results stand in contrast to the unusual pattern in which all of the notable discrepancies between the support Ahmadinejad actually received and the support the model predicts are always negative. This pattern needs to be explained before one can have confidence that natural election processes were not supplemented with artificial manipulations.

All of the analysis was done in R: the code and data and the PDF report are all available for download.

Stochastic Democracy: Iran Elections – Final Update for now (via Sullivan)
