So asks a charming preprint by Achter, Erman, Kedlaya, Wood, and Zureick-Brown. (2/5 Wisconsin, 1/5 ex-Wisconsin!) The paper, I’m happy to say, is a result of discussions at an AIM workshop on arithmetic statistics I organized with Alina Bucur and Chantal David earlier this year.

Here’s how they think of this. By a random curve we might mean a curve drawn uniformly from M_g(F_q). Let X be the number of points on a random curve. Then the average number of points on a random curve also has a geometric interpretation: it is

|M_{g,1}(F_q)| / |M_g(F_q)|.

What about

|M_{g,2}(F_q)| / |M_g(F_q)|?

That’s just the average number of *ordered pairs* of distinct points on a random curve: the expected value of X(X-1).

If we can compute all these expected values, we have all the moments of X, which should give us a good idea as to its distribution. Now if life were as easy as possible, the moduli spaces of curves would have no cohomology past degree 0, and by Grothendieck-Lefschetz, the number of points on M_{g,n} would be q^{3g-3+n}. In that case, we’d have that the expected value of X(X-1)…(X-n+1) was q^n. Hey, I know what distribution that is! It’s Poisson with mean q.
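To see why those moments point at a Poisson law, here is a minimal numerical sketch (mine, not from the preprint) checking that the n-th factorial moment E[X(X-1)…(X-n+1)] of a Poisson variable with mean λ comes out to λ^n. The function names, the truncation point `terms`, and the sample value q = 5 are all assumptions made for illustration.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed in log space to avoid overflow."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def falling_factorial(x, n):
    """x (x-1) ... (x-n+1): the number of ordered n-tuples of distinct items out of x."""
    result = 1
    for i in range(n):
        result *= x - i
    return result

def factorial_moment(lam, n, terms=400):
    """Approximate E[X(X-1)...(X-n+1)] by truncating the series at `terms`."""
    return sum(falling_factorial(k, n) * poisson_pmf(k, lam) for k in range(terms))

if __name__ == "__main__":
    q = 5  # hypothetical field size, chosen only for illustration
    for n in range(1, 5):
        # The two printed numbers should agree up to floating-point error.
        print(n, factorial_moment(q, n), q ** n)
```

The falling factorial is the right thing to average here because it counts ordered n-tuples of distinct points, which is exactly what a point of M_{g,n} records.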

Now M_g *does* have cohomology past degree 0. The good news is, thanks to the Madsen-Weiss theorem (*née* the Mumford conjecture) we know what that cohomology is, at least stably. Yes, there are a lot of unstable classes, too, but the authors propose that heuristically these shouldn’t contribute anything. (The point is that the contribution from the unstable range should look like traces of gigantic random unitary matrices, which, I learn from this paper, are bounded with probability 1; I didn’t know this, actually!) And you can even make this heuristic into a fact, if you want, by letting q grow pretty quickly relative to g.

So something quite nice happens: if you apply Grothendieck-Lefschetz (actually, you’d better throw in Kai Behrend’s name, too, because M_g is a Deligne-Mumford stack, not an honest scheme) you find that the moments of X *still* agree with those of a Poisson distribution! But the contribution of the tautological cohomology shifts the mean from q to q+1+1/(q-1).
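A quick sanity check on that shifted mean, as a sketch with helper names of my own: q+1+1/(q-1) simplifies to q^2/(q-1), which is also the sum of the geometric series q + 1 + 1/q + 1/q^2 + …. Exact rational arithmetic makes the identity easy to confirm for small q.

```python
from fractions import Fraction

def shifted_mean(q):
    """The Poisson mean quoted above, q + 1 + 1/(q-1), as an exact rational."""
    return Fraction(q) + 1 + Fraction(1, q - 1)

def geometric_partial_sum(q, terms):
    """q + 1 + 1/q + ... truncated after `terms` summands (exponents 1, 0, -1, ...)."""
    return sum(Fraction(q) ** (1 - i) for i in range(terms))

if __name__ == "__main__":
    for q in (2, 3, 4, 5, 7, 8, 9):  # small prime powers, for illustration
        mean = shifted_mean(q)
        assert mean == Fraction(q * q, q - 1)  # closed form q^2 / (q - 1)
        print(q, mean, float(geometric_partial_sum(q, 40)))
```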

This is cool in many directions!

- It satisfies one’s feeling that a “random set,” if it carries no extra structure, should have cardinality obeying a Poisson distribution — the “uniform distribution” on the groupoid of sets. (Though actually that uniform distribution is Poisson(1); I wonder what tweak is necessary to be able to tune the mean?)
- I once blogged about an interesting result of Bucur and Kedlaya which showed that a random smooth complete intersection curve in P^3 of fixed degree had slightly fewer than q+1 points; in fact, about q+1 – 1/q + o(q^2). Here the deviation is negative, rather than positive, as the new paper suggests is the case for general curves; what’s going on?
- I have blogged about the question of average number of points on a random curve before. I’d be very interested to know whether the new heuristic agrees with the answer to the question proposed at the end of that post; if g is a large random matrix in GSp(Z_ell) with algebraic eigenvalues, and which multiplies the symplectic form by q, and you condition on Tr(g^k) > (-q^k-1) so that the “curve” has nonnegatively many points over each extension of F_q, does this give something like the distribution the five authors predict for Tr(g)? (Note: I don’t think this question is exactly well-formed as stated.)
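
Going back to the first bullet: the parenthetical about Poisson(1) can be checked directly. In the groupoid of finite sets, an n-element set has n! automorphisms, so its isomorphism class gets weight 1/n!; normalizing by the total weight e gives exactly the Poisson(1) probabilities. A small sketch (function names and the `cutoff` are mine):

```python
import math

def groupoid_weight(n):
    """Weight of the isomorphism class of an n-element set: 1 / |Aut| = 1 / n!."""
    return 1.0 / math.factorial(n)

def uniform_on_sets(n, cutoff=60):
    """Probability of size n under the weights 1/n!, normalized over sizes < cutoff."""
    total = sum(groupoid_weight(k) for k in range(cutoff))  # converges to e
    return groupoid_weight(n) / total

def poisson_pmf(n, lam):
    """P(X = n) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** n / math.factorial(n)

if __name__ == "__main__":
    for n in range(6):
        # The two columns should match up to floating-point error.
        print(n, uniform_on_sets(n), poisson_pmf(n, 1.0))
```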