On the Iranian election returns, in Slate

In today’s Slate I write about the claim that the official Iranian election returns are too linear to be true.

The graph (via Tehran Bureau) looks pretty amazing, but in fact, as I explain, it’s pretty much what you’d expect real election data to look like.

One point there wasn’t room for in the piece: if you look carefully at the chart above, you’ll see that the folks at Tehran Bureau got the election returns to fit the line y = 0.5238x – 742642 very well.  But in some sense that’s irrelevant, unless there’s some reasonable expectation that the clerical powers-that-be would want faked election numbers to follow a funny line with a negative y-intercept.  When R.A. Fisher went after Gregor Mendel, it wasn’t just because Mendel’s results looked suspiciously regular; it was because they looked suspiciously close to Mendel’s theoretical predictions.  If Mendel shaded the data, consciously or not, that’s the direction it would go.

I mean, I can fit a really nice quadratic in x to the Iranian election data — or, for that matter, U.S. election data — but absent any reason to posit a vast parabola-loving conspiracy, it’s just not that suspicious.
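For anyone who wants to see why a tight fit is unremarkable, here is a minimal sketch in Python. It uses made-up batches, not the actual Iranian returns: the batch sizes, the 60% overall share, the five-point wobble, and the function names `simulate_cumulative_returns` and `r_squared_of_line` are all assumptions chosen for illustration. It adds up the batches as they "arrive" and measures how well a straight line fits the running totals.

```python
import random


def simulate_cumulative_returns(n_batches=6, share=0.6, seed=0):
    """Return cumulative (total_votes, candidate_votes) pairs for made-up batches.

    All numbers here are invented for illustration; none come from real returns.
    """
    rng = random.Random(seed)
    total = 0
    candidate = 0
    points = []
    for _ in range(n_batches):
        # Hypothetical batch size: between 3 and 8 million ballots.
        batch_size = rng.randint(3_000_000, 8_000_000)
        # Each batch's share wobbles by up to 5 points around the overall share.
        batch_share = share + rng.uniform(-0.05, 0.05)
        total += batch_size
        candidate += round(batch_size * batch_share)
        points.append((total, candidate))
    return points


def r_squared_of_line(points):
    """R^2 of the least-squares line through (x, y) points.

    For a simple linear fit this equals the squared correlation of x and y.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy * sxy / (sxx * syy)


if __name__ == "__main__":
    points = simulate_cumulative_returns()
    print(f"R^2 of the linear fit: {r_squared_of_line(points):.5f}")
```

Even though the individual batches disagree with each other by several points, the running totals hug a line, because each new point is mostly made of the previous one. A high R² on cumulative data says very little about whether the underlying batches are suspiciously uniform.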

Update: (June 18)  Lots more material around the web about Iranian election stats.  A preprint on the arXiv claims the official numbers violate Benford’s law, but Andrew Gelman says no. On the other hand, via Mark Blumenthal at Pollster, I learn that Walter Mebane at Michigan finds some suspicious-looking irregularities in the town-level data.

5 thoughts on “On the Iranian election returns, in Slate”

  1. dbk says:

    All well and good, but I still haven’t seen any explanation for the fact that, in elections, votes _don’t_ generally arrive in homogeneous bunches. To use the example of US elections, first we get the blue states in the Eastern time zone, then swing states like Ohio and Pennsylvania, then red states like Texas — what would happen if you tried to plot the vote totals of US presidential candidates as they came in? Presumably in Iran, too, some districts report their totals before others.

    Totally willing to believe I’m off base here, but this is the question that lingers in my mind when this issue gets debated.

  2. JSE says:

    what would happen if you tried to plot the vote totals of US presidential candidates as they came in?

    Great question. Somebody did this and, if I remember correctly, got results essentially the same as Nate’s. But looking around now I’m unable to find this.

    It does seem perfectly reasonable to assume some kind of time bias in the votes; but in order to use this to claim that the reported results are “too consistent” you’d have to have a principled estimate for the size of that bias, which I don’t see. (And note that the votes as reported by the government actually DO start out more favorable to Ahmadinejad, and steadily even out as the election goes on.)

  3. Mark says:

    I’m afraid I had a number of problems with your piece in Slate, starting with the moment you confused average absolute deviation with standard deviation. I listed the other major problems at:


    I was pretty harsh but I did run it by another statistician before I posted it and he had the same complaints. If you think I was unfair, I’d be glad to post your rebuttal.


  4. JSE says:

    It’s great to get such a careful reading! The points you make are good ones and underline the difficulty of balancing accessibility to a mass audience with mathematical precision. I promise, I do know the actual definition of standard deviation! But editors don’t want to print the definition of standard deviation in their magazines — and I agree with them. In my opinion, for discussion as coarse as this, there is simply no point in distinguishing average absolute deviation from standard deviation. It sounds like you disagree, though.

    There’s also the problem of technical vs. lay terminology — i.e. you’re certainly right that the word “skew” has a technical meaning, which is not what I was going for with my use of the word in this article. But I think the segment of Slate’s readership who knows the technical meaning of the word “skew” is tiny, and consists only of people who, like you, know by context that the word isn’t used there in its technical sense! Similarly, I think you might be putting too much weight on the word “expect” — I’m not sure why you think I mention average absolute deviation unless you take my use of “expect” to mean that I’m literally talking about expected value, which I’m not.

    The question of whether earlier-arriving ballots are biased for or against Ahmadinejad is a real one, which came up in the comment prior to yours. Perhaps it would have been better to say that the official election returns are consistent with the hypothesis that the numbers were real, and time of report was roughly independent of Ahmadinejad’s support. Do you disagree?

