So asks a charming preprint by Achter, Erman, Kedlaya, Wood, and Zureick-Brown. (2/5 Wisconsin, 1/5 ex-Wisconsin!) The paper, I’m happy to say, is a result of discussions at an AIM workshop on arithmetic statistics I organized with Alina Bucur and Chantal David earlier this year.

Here’s how they think of this. By a random curve we might mean a curve drawn uniformly from M_g(F_q). Let X be the number of points on a random curve. Then the average number of points on a random curve also has a geometric interpretation: it is

|M_{g,1}(F_q)| / |M_g(F_q)|.

What about

|M_{g,2}(F_q)| / |M_g(F_q)|?

That’s just the average number of *ordered pairs* of distinct points on a random curve; that is, the expected value of X(X-1).

If we can compute all these expected values, we have all the (factorial) moments of X, which should give us a good idea as to its distribution. Now if life were as easy as possible, the moduli spaces of curves would have no cohomology past degree 0, and by Grothendieck-Lefschetz, the number of points on M_{g,n} would be q^{3g-3+n}. In that case, we’d have that the expected value of X(X-1)…(X-n+1) was q^n. Hey, I know what distribution that is! It’s Poisson with mean q.
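If you want to convince yourself of that last identification, here is a minimal numerical sketch (mine, not from the preprint) checking that the n-th factorial moment E[X(X-1)…(X-n+1)] of a Poisson variable with mean q is q^n. The function names and the truncation point `cutoff` are assumptions made for illustration.

```python
from math import exp, lgamma, log


def poisson_pmf(k: int, mean: float) -> float:
    """P(X = k) for X ~ Poisson(mean), computed in log space for stability."""
    return exp(k * log(mean) - mean - lgamma(k + 1))


def falling_factorial(x: int, n: int) -> int:
    """x(x-1)...(x-n+1): the number of ordered n-tuples of distinct items among x."""
    result = 1
    for i in range(n):
        result *= x - i
    return result


def factorial_moment(mean: float, n: int, cutoff: int = 400) -> float:
    """E[X(X-1)...(X-n+1)] for X ~ Poisson(mean), truncating the sum at `cutoff`.

    The cutoff is an assumption: it is far into the tail for the small
    means used below, so the neglected terms are negligible.
    """
    return sum(falling_factorial(k, n) * poisson_pmf(k, mean) for k in range(cutoff))


if __name__ == "__main__":
    q = 5
    for n in range(1, 5):
        # The two printed numbers should agree up to floating-point error.
        print(n, factorial_moment(q, n), q ** n)
```

The sum is truncated rather than exact, so the agreement is only up to floating-point error, but that is enough to see the pattern.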

Now M_g *does* have cohomology past degree 0. The good news is, thanks to the Madsen-Weiss theorem (*née* the Mumford conjecture) we know what that cohomology is, at least stably. Yes, there are a lot of unstable classes, too, but the authors propose that heuristically these shouldn’t contribute anything. (The point is that the contribution from the unstable range should look like traces of gigantic random unitary matrices, which, I learn from this paper, are bounded with probability 1; I didn’t know this, actually!) And you can even make this heuristic into a fact, if you want, by letting q grow pretty quickly relative to g.

So something quite nice happens: if you apply Grothendieck-Lefschetz (actually, you’d better throw in Kai Behrend’s name, too, because M_g is a Deligne-Mumford stack, not an honest scheme) you find that the moments of X *still* agree with those of a Poisson distribution! But the contribution of the tautological cohomology shifts the mean from q to q+1+1/(q-1).
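To get a feel for the size of the shift, here is a small sketch that tabulates the predicted mean q+1+1/(q-1) in exact arithmetic and compares it with q+1, the number of points on P^1. The function names are mine, chosen for illustration; the only input from the paper is the formula for the mean quoted above.

```python
from fractions import Fraction


def predicted_mean(q: int) -> Fraction:
    """Heuristic mean number of F_q-points quoted in the post: q + 1 + 1/(q - 1)."""
    return q + 1 + Fraction(1, q - 1)


def excess_over_projective_line(q: int) -> Fraction:
    """How far the predicted mean sits above q + 1, the point count of P^1."""
    return predicted_mean(q) - (q + 1)


for q in (2, 3, 4, 5, 7, 8, 9):
    # Columns: q, exact predicted mean, excess over q + 1 as a float.
    print(q, predicted_mean(q), float(excess_over_projective_line(q)))
```

Note that 1 + 1/(q-1) = q/(q-1), so the same mean can be written as q + q/(q-1); the excess over q+1 is always positive and shrinks like 1/q.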

This is cool in many directions!

- It satisfies one’s feeling that a “random set,” if it carries no extra structure, should have cardinality obeying a Poisson distribution — the “uniform distribution” on the groupoid of sets. (Though actually that uniform distribution is Poisson(1); I wonder what tweak is necessary to be able to tune the mean?)
- I once blogged about an interesting result of Bucur and Kedlaya which showed that a random smooth complete intersection curve in P^3 of fixed degree had slightly fewer than q+1 points; in fact, about q+1 – 1/q + o(q^2). Here the deviation is negative, rather than positive, as the new paper suggests is the case for general curves; what’s going on?
- I have blogged about the question of average number of points on a random curve before. I’d be very interested to know whether the new heuristic agrees with the answer to the question proposed at the end of that post; if g is a large random matrix in GSp(Z_ell) with algebraic eigenvalues, and which multiplies the symplectic form by q, and you condition on Tr(g^k) > (-q^k-1) so that the “curve” has nonnegatively many points over each extension of F_q, does this give something like the distribution the five authors predict for Tr(g)? (Note: I don’t think this question is exactly well-formed as stated.)

F_{q=2}? ;)

Arrgh … ;)

Not sure I get your point?

You can start tuning the mean by taking a product of n copies of the groupoid of finite sets, or equivalently looking at the groupoid of n-colored finite sets. The total size of such a finite set produces a Poisson distribution with parameter n.
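Here is a minimal sketch of that computation, writing n for the number of colors (a name assumed here for illustration, as are the function names). An isomorphism class of colored finite sets is a tuple of color counts (k_1, …, k_n), and it gets weight 1/(k_1! ⋯ k_n!), the reciprocal of the order of its automorphism group. Summing the weights over all classes of total size m and dividing by the total mass e^n should give the Poisson probability of m with parameter n.

```python
from itertools import product
from math import exp, factorial, prod


def colored_mass(total: int, colors: int) -> float:
    """Sum of 1/|Aut| over isomorphism classes of colored sets of a given size.

    A class is a tuple (k_1, ..., k_colors) of color counts summing to
    `total`; its automorphism group has order k_1! * ... * k_colors!.
    """
    mass = 0.0
    for counts in product(range(total + 1), repeat=colors):
        if sum(counts) == total:
            mass += 1 / prod(factorial(k) for k in counts)
    return mass


def poisson_pmf(k: int, mean: float) -> float:
    """P(X = k) for X ~ Poisson(mean)."""
    return exp(-mean) * mean ** k / factorial(k)


if __name__ == "__main__":
    n = 3  # number of colors (hypothetical choice)
    for m in range(6):
        # Normalize by e^n, the total mass of the groupoid of n-colored finite sets.
        print(m, colored_mass(m, n) * exp(-n), poisson_pmf(m, n))
```

With one color this recovers the Poisson(1) distribution on the groupoid of finite sets mentioned in the post. The brute-force enumeration is only meant for small m and n.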

Two quick thoughts:

Firstly, applying the unitary matrix heuristic to deduce tiny trace would reasonably require that the Galois action on unstable cohomology is close to irreducible. Is that plausible? There may be natural “Hecke-type” correspondences between different M_gs, like “curve X is a covering of curve Y”, but I don’t know how far they obstruct that.

Secondly, concerning testability, can one sample a random curve from this measure, for moderately large g (or even not so moderately large)? One thought would be to look at high degree branched covers of P^1 and pick the ones defined over F_q, but this sounds uncomputable.

The irreducibility is a good point! I’m sure you’re right that correspondences like those you mention could calve off some small pieces, but I think you would be OK (for large but fixed q) so long as the number of constituents was expected to be at worst exponential…? Re testability: that’s a great question, actually. Certainly e.g. for genus 3 you can sample a plane quartic and a hyperelliptic at random in the appropriate proportions but that’s probably much MORE moderate than you mean. Almost by definition it’s hard to imagine how you would do this once M_g is not unirational, right?