## How many points does a random curve over F_q have?

So asks a charming preprint by Achter, Erman, Kedlaya, Wood, and Zureick-Brown.  (2/5 Wisconsin, 1/5 ex-Wisconsin!)  The paper, I’m happy to say, is a result of discussions at an AIM workshop on arithmetic statistics I organized with Alina Bucur and Chantal David earlier this year.

Here’s how they think of this.  By a random curve we might mean a curve drawn uniformly from M_g(F_q).  Let X be the number of points on a random curve.  Then the average number of points on a random curve also has a geometric interpretation: it is

$|M_{g,1}(\mathbf{F}_q)|/|M_{g}(\mathbf{F}_q)|$

And what about $|M_{g,2}(\mathbf{F}_q)|/|M_{g}(\mathbf{F}_q)|$?

That’s just the average number of ordered pairs of distinct points on a random curve; the expected value of X(X-1).

If we can compute all these expected values, we have all the moments of X, which should give us a good idea as to its distribution.  Now if life were as easy as possible, the moduli spaces of curves would have no cohomology past degree 0, and by Grothendieck-Lefschetz, the number of points on M_{g,n} would be q^{3g-3+n}.  In that case, we’d have that the expected value of X(X-1)…(X-n+1) was q^n.  Hey, I know what distribution that is!  It’s Poisson with mean q.
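For reference, here is the standard computation behind that identification, for a Poisson variable X with mean q.  Nothing in it is specific to the paper; it is just the usual factorial-moment calculation:

$E[X(X-1)\cdots(X-n+1)] = \sum_{k \geq n} \frac{k!}{(k-n)!} \cdot e^{-q}\frac{q^k}{k!} = q^n e^{-q} \sum_{j \geq 0} \frac{q^j}{j!} = q^n$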

Now M_g does have cohomology past degree 0.  The good news is, thanks to the Madsen-Weiss theorem (née the Mumford conjecture) we know what that cohomology is, at least stably.  Yes, there are a lot of unstable classes, too, but the authors propose that heuristically these shouldn’t contribute anything.  (The point is that the contribution from the unstable range should look like traces of gigantic random unitary matrices, which, I learn from this paper, are bounded with probability 1 — I didn’t know this, actually!)  And you can even make this heuristic into a fact, if you want, by letting q grow pretty quickly relative to g.

So something quite nice happens:  if you apply Grothendieck-Lefschetz (actually, you’d better throw in Kai Behrend’s name, too, because M_g is a Deligne-Mumford stack, not an honest scheme) you find that the moments of X still agree with those of a Poisson distribution!  But the contribution of the tautological cohomology shifts the mean from q to q+1+1/(q-1).

This is cool in many directions!

• It satisfies one’s feeling that a “random set,” if it carries no extra structure, should have cardinality obeying a Poisson distribution — the “uniform distribution” on the groupoid of sets.  (Though actually that uniform distribution is Poisson(1); I wonder what tweak is necessary to be able to tune the mean?)
• I once blogged about an interesting result of Bucur and Kedlaya which showed that a random smooth complete intersection curve in P^3 of fixed degree had slightly fewer than q+1 points; in fact, about q+1 – 1/q + o(q^2).  Here the deviation is negative, rather than positive, as the new paper suggests is the case for general curves; what’s going on?
• I have blogged about the question of average number of points on a random curve before.  I’d be very interested to know whether the new heuristic agrees with the answer to the question proposed at the end of that post; if g is a large random matrix in GSp(Z_ell) with algebraic eigenvalues, and which multiplies the symplectic form by q, and you condition on Tr(g^k) > (-q^k-1) so that the “curve” has nonnegatively many points over each extension of F_q, does this give something like the distribution the five authors predict for Tr(g)?  (Note:  I don’t think this question is exactly well-formed as stated.)
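A short worked version of the parenthetical in the first bullet, as I understand the usual convention (each isomorphism class weighted by the reciprocal of the size of its automorphism group):  the groupoid of finite sets has one class for each n, with automorphism group S_n, so the total mass and the normalized weights are

$\sum_{n \geq 0} \frac{1}{n!} = e, \qquad \mathrm{Prob}(|S| = n) = \frac{e^{-1}}{n!}$

which is exactly the Poisson distribution with mean 1.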

## Silas Johnson on weighted discriminants with mass formulas

My Ph.D. student Silas Johnson just posted his thesis paper to the arXiv, and it’s cool, and I’m going to blog about it!

How should you count number fields?  The most natural way is by discriminant; you count all degree-n number fields K with a given Galois group G in S_n and discriminant bounded in absolute value by B.  This gives you a value N_G(B) whose asymptotic behavior in B you might want to study.  The classical results and exciting new ones you’ve heard about (Davenport-Heilbronn, Bhargava, and all that) generally concern counts of this kind.

But there are reasons to consider other kinds of counts.  I once had a bunch of undergrads investigate the behavior of N_3(X,Y), the number of cubic fields whose discriminant had squarefree part at most X and maximal square divisor at most Y.  This gives a more refined picture of the ramification behavior of the fields.  Asymptotics for this are still unknown!  (I would expect the main term to be on order $X Y^{1/2}$, but I don’t know what the secondary terms should be.)
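To make the two-parameter count concrete, here is a minimal Python sketch of the bookkeeping involved.  It assumes that "maximal square divisor" means the largest perfect square m^2 dividing |d| (so that |d| = s·m^2 with s squarefree), which is the reading consistent with a main term of order $X Y^{1/2}$.  The names `split_discriminant` and `in_box` are made up for illustration.

```python
def split_discriminant(d):
    """Write |d| = s * m**2 with s squarefree; return (s, m**2)."""
    n = abs(d)
    if n == 0:
        raise ValueError("discriminant must be nonzero")
    s, m, p = n, 1, 2
    while p * p <= s:
        # strip every factor p^2 from s and move p into m
        while s % (p * p) == 0:
            s //= p * p
            m *= p
        p += 1
    return s, m * m


def in_box(d, X, Y):
    """Would a field of discriminant d be counted by N_3(X, Y)? (assumed reading)"""
    s, square = split_discriminant(d)
    return s <= X and square <= Y
```

For instance, -23, 81 and 148 all occur as discriminants of cubic fields; the first is squarefree, the second is a perfect square, and the third splits as 37 times 4.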

One nice thing about the discriminant, though, is that it has a mass formula.  In brief:  a map f from Gal(Q_p) to S_n corresponds to a degree-n extension of Q_p, which has a discriminant (a power of p) which we call Disc(f).  Averaging Disc(f)^{-1} over all homomorphisms f gives you a polynomial in p^{-1}, which we call the local mass.  Now here’s the remarkable fact (shown by Bhargava, deriving from a formula of Serre):  there is a universal polynomial P(x) such that the local mass at p is equal to P(p^{-1}) for every prime p.  This is not hard to show for the tame primes p (you can see this point discussed in Silas’s paper or in the paper by Kedlaya I linked above) but that it holds for the wild primes is rather mysterious and strange.  And it certainly seems to ratify the idea that there’s something especially nice about the discriminant.  What’s more, this polynomial P is not just some random thing; it’s the product over p of P(p^{-1}) that gives the constant in Bhargava’s conjectural asymptotic for the number of number fields of degree n.
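The smallest case, n = 2, can be checked by hand, and the sketch below does so in Python.  Two things here are my assumptions rather than anything taken from the papers:  "averaging" is normalized as the sum over homomorphisms divided by |S_2| = 2, and the list of discriminant exponents is the standard table of quadratic extensions of Q_p, hard-coded rather than computed.  Under those assumptions the universal polynomial is P(x) = 1 + x, and the wild prime 2 falls in line with the tame ones.

```python
from fractions import Fraction


def quadratic_disc_exponents(p):
    """Exponents v_p(Disc(f)), one per homomorphism Gal(Q_p) -> S_2."""
    if p == 2:
        # trivial map; unramified Q_2(sqrt 5); Q_2(sqrt -1) and Q_2(sqrt -5)
        # with discriminant 2^2; Q_2(sqrt ±2) and Q_2(sqrt ±10) with 2^3
        return [0, 0, 2, 2, 3, 3, 3, 3]
    # odd p: trivial map, the unramified extension, two tamely ramified ones
    return [0, 0, 1, 1]


def local_mass(p):
    """Sum of Disc(f)^(-1) over all f, divided by |S_2| (assumed normalization)."""
    total = sum(Fraction(1, p ** e) for e in quadratic_disc_exponents(p))
    return total / 2


for p in (2, 3, 5, 7):
    assert local_mass(p) == 1 + Fraction(1, p)
```

At p = 2 there are eight homomorphisms instead of four, but the extra ramification makes the larger sum come out to the same 1 + 1/p.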

But here’s the thing.  If we replace S_n by a proper subgroup G, there need not be a universal mass formula anymore.  Kedlaya (and Daniel Gulotta, in the appendix) show lots of examples.  The simplest example is the dihedral group of order 8.

All is not lost, though!  Wood showed in 2008 that you could fix this problem by replacing the discriminant of a D_4-extension with a different invariant.  Namely:  a D_4 quartic field M has a quadratic subextension L.  If you replace Disc(M/Q) with Disc(L/Q) times the norm to Q of Disc(M/L), you get a different invariant of M (an example of what Silas calls a “weighted discriminant”) and when you compute the local mass according to *this* invariant, you get a polynomial in p^{-1} again!
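To see how the new invariant relates to the old one, it may help to write them side by side.  With M the quartic field and L its quadratic subfield, the usual tower formula for discriminants reads

$\mathrm{Disc}(M/\mathbf{Q}) = \mathrm{Disc}(L/\mathbf{Q})^2 \cdot N_{L/\mathbf{Q}}(\mathrm{Disc}(M/L))$

while the modified invariant described here is

$\mathrm{Disc}(L/\mathbf{Q}) \cdot N_{L/\mathbf{Q}}(\mathrm{Disc}(M/L))$

So, on this reading, the only change is that the exponent on Disc(L/Q) drops from 2 to 1; the two factors are being weighted differently, which is presumably where the name comes from.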

So maybe Wood’s modified discriminant, not the usual discriminant, is the “right” way to count dihedral quartics?  Does the product of her local masses give the right asymptotic for the number of D_4 extensions with Woodscriminant at most B?

It’s not at all clear to me how, if at all, you can cook up a modified discriminant for an arbitrary group G that has a universal mass formula.  What Silas shows is that having a mass formula is indeed special; when G is a p-group, there are only finitely many weighted discriminants that have one.  Silas thinks, and so do I, that this is actually true for every finite group G, and that some version of his approach will eventually prove this.  And he classifies all such weighted discriminants for groups of size up to 12; for any individual G, it’s a computation which can be made nicely algorithmic.  Very cool!

## This Week’s Finds In Number Theory

Twenty years ago yesterday, John Baez posted the first installment of This Week’s Finds in Mathematical Physics.  In so doing, he invented the math blog, and, quite possibly, the blog itself.  A lot of mathematicians of my generation found in John’s blog an accessible, informal, but never dumbed-down window beyond what we were learning in classes, into the messy and contentious ground of current research.  And everybody who blogs now owes him a gigantic debt.

In his honor I thought it would be a good idea to post a “This Week’s Finds” style post of my own, with capsule summaries of a few papers I’ve recently noted with pleasure and interest.  I won’t be able to weave these into a story the way John often did, though!  Nor will there be awesome ASCII graphics.  Nor will any of the papers actually be from this week, because I’m a little behind on my math.NT abstract scanning.

If you run a math blog, please consider doing the same in your own field!  I’ll link to it.

Update:  It begins!  Valeria de Paiva offers This Week’s Finds In Categorical Logic.  And Matt Ward, a grad student at UW-Seattle, has This Week’s Finds in Arithmetic Geometry.

1)  “On sets defining few ordinary lines,” by Ben Green and Terry Tao.

The idea that has launched a thousand papers in additive combinatorics:  if you are a set approximately closed under some kind of relation, then you are approximately a set which is actually closed under that kind of relation.  Subset of a group mostly closed under multiplication?  You must be close to an honest subgroup.  Subset of Z with too many pair-sums agreeing?  You have an unusually large intersection with an authentic arithmetic progression.  And so on.

This new paper considers the case of sets in R^2 with few ordinary lines; that is, sets S such that most lines meeting S in at least two points meet it in three or more.  How can you cook up a set of points with this property?  There are various boring ways, like making all the points collinear.  But there’s only one interesting way I can think of:  have the points form an “arithmetic progression” …, -3P, -2P, -P, P, 2P, 3P, … on an elliptic curve!  (A finite subgroup also works.)  Then the usual description of the group law on the curve tells us that the line joining two points of S quite often passes through a third.  Green and Tao prove a remarkable quasi-converse to this fact:  if a set has few ordinary lines, it must be concentrated on a cubic algebraic curve!  This might be my favorite “approximately structured implies approximates a structure” theorem yet.
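Here is a small Python sketch of the elliptic-curve construction, using exact rational arithmetic.  The curve y^2 = x^3 + 17 and the starting point P = (-2, 3) are arbitrary choices of mine for illustration, not anything from the Green-Tao paper; the check is simply that iP, jP, kP are collinear whenever i + j + k = 0.

```python
from fractions import Fraction
from itertools import combinations

A, B = 0, 17  # the curve y^2 = x^3 + A*x + B (illustrative choice)


def add(P, Q):
    """Chord-and-tangent addition; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + A) / (2 * y1)  # tangent slope
    else:
        lam = (y2 - y1) / (x2 - x1)  # chord slope
    x3 = lam * lam - x1 - x2
    return (x3, lam * (x1 - x3) - y1)


def progression(P, n):
    """Return {k: kP} for 1 <= |k| <= n (assumes P has order larger than n)."""
    pts, Q = {}, None
    for k in range(1, n + 1):
        Q = add(Q, P)
        pts[k] = Q
        pts[-k] = (Q[0], -Q[1])
    return pts


def collinear(P, Q, R):
    return (Q[0] - P[0]) * (R[1] - P[1]) == (R[0] - P[0]) * (Q[1] - P[1])


P = (Fraction(-2), Fraction(3))
S = progression(P, 6)
triples = [t for t in combinations(sorted(S), 3) if sum(t) == 0]
assert all(collinear(S[i], S[j], S[k]) for i, j, k in triples)
```

Every index triple summing to zero gives a line through three points of S, which is the sense in which the line joining two points of S "quite often" hits a third.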

2) “Asymptotic behavior of rational curves,” by David Bourqui.  Oh, I was about to start writing this but when I searched I realized I already blogged about this paper when it came out!  I leave this here because the paper is just as interesting now as it was then…

3) “The fluctuations in the number of points of smooth plane curves over finite fields,” by Alina Bucur, Chantal David, Brooke Feigon, and Matilde Lalin;

“The probability that a complete intersection is smooth,” by Alina Bucur and Kiran Kedlaya;

“The distribution of the number of points on trigonal curves over F_q,” by Melanie Matchett Wood;

“Semiample Bertini theorems over finite fields,” by Daniel Erman and Melanie Matchett Wood.

How many rational points does a curve over F_q have?  We discussed this question here a few years ago, coming to no clear conclusion.  I still maintain that if the curve is understood to vary over M_g(F_q), with q fixed and g growing, the problem is ridiculously hard.

But in more manageable families of curves, we now know a lot more than we did in 2008.

You might guess, of course, that the average number of points should be q+1; if you have no reason to think of Frobenius as biased towards having positive or negative trace, why not guess that the trace, on average, is 0?  Bucur-David-Feigon-Lalin prove that this is exactly the case for a random smooth plane curve.  It’s not hard to check that this holds for a random hyperelliptic curve as well.  But for a random trigonal curve, Wood proves that the answer is different:  the average is slightly less than q+2!

Where did the extra point come from?

Here’s one way I like to think of it.  This is very vague, and proves nothing, of course.  The trigonal curve X has a degree-3 map to P^1, which is ramified over some divisor D in P^1.  If D is a random divisor, it has one F_q-point on average.  How many F_q-points on X lie over each rational point P of D?  Well, generically, the ramification is going to be simple, and this means that there are two rational points over P:  the ramified point, and the unique unramified point.  Over every other F_q-point of P^1, the Frobenius action on the preimage in X should be a random element of S_3, with an average of one fixed point.  To sum up, in expectation we should see q rational points of X over q non-branch rational points of P^1, and 2 rational points of X over a single rational branch point in P^1, for a total of q+2.
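The arithmetic in this heuristic is easy to mechanize.  The sketch below just counts fixed points of permutations and adds up the two contributions; the function names are mine, and like the heuristic itself it proves nothing.

```python
from fractions import Fraction
from itertools import permutations


def mean_fixed_points(n):
    """Average number of fixed points of a uniformly random element of S_n."""
    perms = list(permutations(range(n)))
    fixed = sum(sum(1 for i, j in enumerate(s) if i == j) for s in perms)
    return Fraction(fixed, len(perms))


def heuristic_point_count(q, rational_branch_points=1):
    """Expected |X(F_q)| under the vague heuristic in the text."""
    unbranched = (q + 1) - rational_branch_points  # P^1 has q+1 rational points
    # one point on average over each unbranched point, two over each branch point
    return unbranched * mean_fixed_points(3) + 2 * rational_branch_points
```

In S_3 the identity fixes three points, each of the three transpositions fixes one, and the two 3-cycles fix none, so the mean is 6/6 = 1 and the total comes out to q + 2.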

(Erman and Wood, in a paper released just a few months ago, prove much more general results of a similar flavor about smooth members of linear systems on P^1 x P^1 (or other Hirzebruch surfaces, or other varieties entirely) which are semiample; for instance, they may have a map to P^1 which stays constant in degree, while their intersection with another divisor gets larger and larger.)

Most mysterious of all is the theorem of Bucur and Kedlaya, which shows (among other things) that if X is a random smooth intersection of two hypersurfaces of large degree in P^3, then |X(F_q)| is slightly less than q+1 on average.  For this phenomenon I have no heuristic explanation at all.  What’s keeping the points away?

## Some visitors, and countable unions

A busy few days: we had a run of interesting visitors this week in Madison, including Thomas Lam, who gave a beautiful talk about total positivity (a subfield of algebraic combinatorics, not a self-help philosophy); Melanie Matchett Wood, who explained how to parametrize binary forms of degree n in the Bhargava style, not only over Z but over an arbitrary base scheme (which is to say, not really in the Bhargava style!); and Davesh Maulik, who showed us how one can rather miraculously count rational curves on a single K3 by counting rational curves on a suitably chosen one-parameter family of K3s, and then “dividing” by the Noether-Lefschetz theory attached to the family. Very agreeably for number theorists, a key point is the product formulae of Borcherds, which provide modular forms on the moduli space of K3s whose zeroes and poles are supported on the countable union of subvarieties where the Picard number jumps upwards from its generic value.

This led to an amusing conversation at lunch about countable unions of subvarieties. Here’s a remark: if A is an abelian variety over the complex numbers, it’s completely obvious that A(C) contains some non-torsion points; the torsion locus is a countable union of varieties of strictly lower dimension (in this case 0) and thus can’t cover A(C). On the other hand, if A is over Fpbar, every point of A(Fpbar) is defined over some finite field, and thus all these points are torsion. The case of Qbar is intermediate in difficulty; indeed, there are nontorsion points on every abelian variety over Qbar, but this is not, in some sense, by “pure thought”; one might, for instance, use the argument that torsion points have height 0 but that there are plainly points of arbitrarily large height on A(Qbar). This uses some actual theorems, not just a comparison of cardinalities. Similarly, one can ask: are there elliptic curves over an algebraically closed field k with End(E/k) = Z? When k = C, the answer is obviously yes. When k = Fpbar, the answer is no, thanks to Frobenius. And when k is Qbar, the answer is again yes, but maybe one has to use a bit more; for instance, that a CM elliptic curve over a number field has potentially good reduction everywhere.

In general, it’s pretty hard to see whether a countable union of subvarieties of X/Qbar covers all the Qbar-points! Here are two well-known open questions in this vein.