## Pila on a “modular Fermat equation”

I like this paper by Pila that just went up on the arXiv, which shows the way that you can get Diophantine consequences from the rapid progress being made in theorems of Andre-Oort type.  (I also want to blog about Tsimerman + Zhang + Yuan on “average Colmez” and Andre-Oort, maybe later!)

Pila shows that if N and M are sufficiently large primes, you can’t have elliptic curves E_1/Q and E_2/Q such that E_1 has an N-isogenous curve E_1 -> E’_1, E_2 has an M-isogenous curve E_2 -> E’_2, and j(E’_1) + j(E’_2) = 1.  (It seems to me the proof uses little about this particular algebraic relation and would work just as well for any f(j(E’_1),j(E’_2)) whose vanishing didn’t cut out a modular curve in X(1) x X(1).)  (This is “Fermat-like” in that it asserts finiteness of rational points on a natural countable family of high-genus curves; a more precise analogy is explained in the paper.)

How this works, loosely:  suppose you have such an (E_1, E_2).  A theorem of Kühne guarantees that E_1 and E_2 are not both CM.  (I didn’t know this!)  It follows (WLOG assume N > M) that the N-isogenies of E_1 are defined over a field of degree at least N^a for some small a.  (Pila uses more precise bounds coming from a recent paper of Najman.)  So the Galois conjugates of (E’_1, E’_2) give you a whole bunch of algebraic points (E”_1, E”_2) with j(E”_1) + j(E”_2) = 1.

So what?  Rational curves have lots of low-height algebraic points.  But here’s the thing.  These isogenous choices of (E’_1, E’_2) aren’t just any algebraic points on X(1) x X(1); they represent pairs of elliptic curves drawn from a *fixed pair of isogeny classes*.  Let H be the hyperbolic plane as usual, and write (z,w) for a point on H x H corresponding to (E’_1, E’_2).  Then the other choices (E”_1, E”_2) correspond to points (gz,hw) with g,h in GL_2(Q).  GL_2(Q), not GL_2(R)!  That’s what we get from working in a fixed isogeny class.  And these points satisfy

j(gz) + j(hw) = 1.

To sum up:  you have a whole bunch of rational points (g,h) on GL_2 x GL_2.  These points are pretty low height (for this Pila gestures at a paper of his with Habegger.)  And they lie on the surface j(gz) + j(hw) = 1.  But this surface is a totally non-algebraic thing, because remember, j is a transcendental function on H!  So (Pila’s version of) the Ax-Lindemann theorem (correction from comments:  the Pila-Wilkie theorem) generates a contradiction; a transcendental curve can’t have too many low-height rational points.

## Shende and Tsimerman on equidistribution in Bun_2(P^1)

Very nice paper just posted by Vivek Shende and Jacob Tsimerman.  Take a sequence {C_i} of hyperelliptic curves of larger and larger genus.  Then for each i, you can look at the pushforward of a random line bundle drawn uniformly from Pic(C_i) / [pullbacks from P^1] to P^1, which is a rank-2 vector bundle.  This gives you a measure $\mu_i$ on Bun_2(P^1), the space of rank-2 vector bundles, and Shende and Tsimerman prove, just as you might hope, that this sequence of measures converges to the natural measure.

I think (but I didn’t think this through carefully) that this corresponds to saying that if you look at a sequence of quadratic imaginary fields with increasing discriminant, and for each field you write down all the ideal classes, thought of as unimodular lattices in R^2 up to homothety, then the corresponding sequence of (finitely supported) measures on the space of lattices converges to the natural one.

Equidistribution comes down to counting, and the method here is to express the relevant counting problem as a problem of counting points on a variety (in this case a Brill-Noether locus inside Pic(C_i)), which by Grothendieck-Lefschetz you can do if you can control the cohomology (with its Frobenius action.)  The high-degree part of the cohomology they can describe explicitly, and fortunately they are able to exert enough control over the low-degree Betti numbers to show that the contribution of this stuff is negligible.

In my experience, showing that the contribution of the low-degree stuff is negligible (it “should be small,” but you don’t actually have a handle on it) is often the bottleneck!  And indeed, for the second problem they discuss (where you have a sequence of hyperelliptic curves and a single line bundle on each one) it is exactly this point that stops them, for the moment, from having the theorem they want.

Error terms are annoying.  (At least when you can’t prove they’re smaller than the main term.)

## Y. Zhao and the Roberts conjecture over function fields

Before the developments of the last few years, the only thing that was known about the Cohen-Lenstra conjecture was what had already been known before the Cohen-Lenstra conjecture:  namely, the theorem of Davenport and Heilbronn that the number of cubic fields of discriminant between -X and X can be expressed as

$\frac{1}{3\zeta(3)} X + o(X)$.

It isn’t hard to go back and forth between the count of cubic fields and the average size of the 3-torsion part of the class group of quadratic fields, which gives the connection with Cohen-Lenstra in its usual form.

Anyway, Datskovsky and Wright showed that the asymptotic above holds (for suitable values of 3) over any global field of characteristic at least 5.  That is:  for such a field K, you let N_K(X) be the number of cubic extensions of K whose discriminant has norm at most X; then

$N_K(X) = c_K \zeta_K(3)^{-1} X + o(X)$

for some explicit rational constant $c_K$.

One interesting feature of this theorem is that, if it weren’t a theorem, you might doubt it was true!  Because the agreement with data is pretty poor.  That’s because the convergence to the Davenport-Heilbronn limit is extremely slow; even if you let your discriminant range up to ten million or so, you still see substantially fewer cubic fields than you’re supposed to.

In 2000, David Roberts massively clarified the situation, formulating a conjectural refinement of the Davenport-Heilbronn theorem motivated by the Shintani zeta functions:

$N_{\mathbf{Q}}(X) = (1/3)\zeta(3)^{-1} X + c X^{5/6} + o(X^{5/6})$

with c an explicit (negative) constant.  The secondary term with an exponent very close to 1 explains the slow convergence to the Davenport-Heilbronn estimate.
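To get a feel for how slowly a term of size X^{5/6} fades against a term of size X, here is a small Python sketch.  It deliberately drops the constants in front of both terms (so it says nothing about the actual value of c) and only prints the ratio X^{5/6}/X = X^{-1/6}; the function name is my own.

```python
# Relative size of a secondary term of order X^(5/6) against a main term of
# order X. The constants in front of both terms are omitted on purpose, so
# this only shows the rate of decay, not the true size of the discrepancy.

def relative_secondary_size(x: float) -> float:
    """Return X^(5/6) / X, which equals X^(-1/6)."""
    return x ** (-1.0 / 6.0)


for exponent in (3, 6, 7, 9, 12):
    x = 10.0 ** exponent
    print(f"X = 10^{exponent:<2d}  X^(5/6)/X = {relative_secondary_size(x):.4f}")
```

The ratio only drops by a factor of 10 each time X grows by a factor of a million, which is the sense in which an exponent of 5/6 is “very close to 1.”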

The Datskovsky-Wright argument works over an arbitrary global field but, like most arguments that work over both number fields and function fields, it is not very geometric.  I asked my Ph.D. student Yongqiang Zhao, who’s finishing this year, to revisit the question of counting cubic extensions of a function field F_q(t) from a more geometric point of view to see if he could get results towards the Roberts conjecture.  And he did!  Which is what I want to tell you about.

But while Zhao was writing his thesis, there was a big development — the Roberts conjecture was proved.  Not only that — it was proved twice!  Once by Bhargava, Shankar, and Tsimerman, and once by Thorne and Taniguchi, independently, simultaneously, and using very different methods.  It is certainly plausible that these methods can give the Roberts conjecture over function fields, but at the moment, they don’t.

Neither does Zhao, yet — but he’s almost there, getting

$N_K(X) = \zeta_K(3)^{-1} X + O(X^{5/6 + \epsilon})$

for all rational function fields K = F_q(t) of characteristic at least 5.  And his approach illuminates the geometry of the situation in a very beautiful way, which I think sheds light on how things work in the number field case.

Geometrically speaking, to count cubic extensions of F_q(t) is to count trigonal curves over F_q.  And the moduli space of trigonal curves has a classical unirational parametrization, which I learned from Mike Roth many years ago:  given a trigonal curve Y, you push forward the structure sheaf along the degree-3 map to P^1, yielding a rank-3 vector bundle on P^1; you mod out by the natural copy of the structure sheaf; and you end up with a rank-2 vector bundle W on P^1, whose projectivization is a rational surface in which Y embeds.  This rational surface is a Hirzebruch surface F_k, where k is an integer determined by the isomorphism class of the vector bundle W.  (This story is the geometric version of the Delone-Faddeev parametrization of cubic rings by binary cubic forms.)

This point of view replaces a problem of counting isomorphism classes of curves (hard!) with a problem of counting divisors in surfaces (not easy, but easier.)  It’s not hard to figure out what linear system on F_k contains Y.  Counting divisors in a linear system is nothing but a dimension count, but you have to be careful — in this problem, you only want to count smooth members.  That’s a substantially more delicate problem.  Counting all the divisors is more or less the problem of counting all cubic rings; that problem, as the number theorists have long known, is much easier than the problem of counting just the maximal orders in cubic fields.

Already, the geometric meaning of the negative secondary term becomes quite clear; it turns out that when k is big enough (i.e. if the Hirzebruch surface is twisty enough) then the corresponding linear system has no smooth, or even irreducible, members!  So what “ought” to be a sum over all k is rudely truncated; and it turns out that the sum over larger k that “should have been there” is on order X^{5/6}.

So how do you count the smooth members of a linear system?  When the linear system is highly ample, this is precisely the subject of Poonen’s well-known “Bertini theorem over finite fields.”  But the trigonal linear systems aren’t like that; they’re only “semi-ample,” because their intersection with the fiber of projection F_k -> P^1 is fixed at 3.  Zhao shows that, just as in Poonen’s case, the probability that a member of such a system is smooth converges to a limit as the linear system gets more complicated; only this limit is computed, not as a product over points P of the probability D is smooth at P, but rather a product over fibers F of the probability that D is smooth along F.  (This same insight, arrived at independently, is central to the paper of Erman and Wood I mentioned last week.)

This alone is enough for Zhao to get a version of Davenport-Heilbronn over F_q(t) with error term O(X^{7/8}), better than anything that was known for number fields prior to last year.  How he gets even closer to Roberts is too involved to go into on the blog, but it’s the best part, and it’s where the algebraic geometry really starts; the main idea is a very careful analysis of what happens when you take a singular curve on a Hirzebruch surface and start carrying out elementary transforms at the singular points, making your curve more smooth but also changing which Hirzebruch surface it’s on!

To what extent is Zhao’s method analogous to the existing proofs of the Roberts conjecture over Q?  I’m not sure; though Zhao, together with the five authors of the two papers I mentioned, spent a week huddling at AIM thinking about this, and they can comment if they want.

I’ll just keep saying what I always say:  if a problem in arithmetic statistics over Q is interesting, there is almost certainly interesting algebraic geometry in the analogous problem over F_q(t), and the algebraic geometry is liable in turn to offer some insights into the original question.

## Hwang and To on injectivity radius and gonality, and “Typical curves are not typical.”

Interesting new paper in the American Journal of Mathematics, not on arXiv unfortunately.  An old theorem of Li and Yau shows how to lower-bound the gonality of a Riemann surface in terms of the spectral gap on its Laplacian; this (together with new theorems by many people on superstrong approximation for thin groups) is what Chris Hall, Emmanuel Kowalski, and I used to give lower bounds on gonalities in various families of covers of a fixed base.

The new paper gives a lower bound for the gonality of a compact Riemann surface in terms of the injectivity radius, which is half the length of the shortest closed geodesic loop.  You could think of it like this — they show that the low-gonality loci in M_g stay very close to the boundary.

“The middle” of M_g is a mysterious place.  A “typical” curve of genus g has a big spectral gap, gonality on order g/2, a big injectivity radius…  but most curves you can write down are just the opposite.

Typical curves are not typical.

When g is large, M_g is general type, and so the generic curve doesn’t move in a rational family.  Are all the rational families near the boundary?  Gaby Farkas explained to me on Math Overflow how to construct a rationally parametrized family of genus-g curves whose gonality is generic, as a pencil of curves on a K3 surface.  I wonder how “typical” these curves are?  Do some have large injectivity radius?  Or a large spectral gap?

## Random Dieudonne modules, random p-divisible groups, and random curves over finite fields

Bryden Cais, David Zureick-Brown and I have just posted a new paper,  “Random Dieudonne modules, random p-divisible groups, and random curves over finite fields.”

What’s the main idea?  It actually arose from a question David asked during Derek Garton’s speciality exam.  We know by now that there is some insight to be gained into p-parts of class groups of number fields (the Cohen-Lenstra problem) by thinking about the analogous problem of studying class groups of function fields over F_l, where F_l has characteristic prime to p.

The question David asked was:  well, what about the p-part of the class group of a function field whose characteristic is equal to p?

That’s a different matter altogether.  The p-divisible group attached to the Jacobian of a curve C in characteristic l doesn’t contain very much information;  more or less it’s just a generalized symplectic matrix of rank 2g(C), defined up to conjugacy, and the Cohen-Lenstra heuristics ask this matrix to behave like a random matrix with respect to various natural statistics.

But p-divisible groups in characteristic p are where the fun is!  For instance, you can ask:

What is the probability that a random curve (resp. random hyperelliptic curve, resp. random plane curve, resp. random abelian variety) over F_q is ordinary?

In my view it’s sort of weird that nobody has asked this before!  But as far as I’ve been able to tell, this is the first time the question has been considered.

We generate lots of data, some of which is very illustrative and some of which is (to us) mysterious.  But data alone is not that useful; much better to have a heuristic model with which we can compare the data.  Setting up such a model is the main task of the paper.  Just as a p-divisible group in characteristic l is described by a matrix, a p-divisible group in characteristic p is described by its Dieudonné module;  this is just another linear-algebraic gadget, albeit a little more complicated than a matrix.  But it turns out there is a natural “uniform distribution” on isomorphism classes of Dieudonné modules; we define this, work out its properties, and see what it would say about curves if indeed their Dieudonné modules were “random” in the sense of being drawn from this distribution.

To some extent, the resulting heuristics agree with data.  But in other cases, they don’t.  For instance:  the probability that a hyperelliptic curve of large genus over F_3 is ordinary appears in practice to be very close to 2/3.  But the probability that a smooth plane curve of large genus over F_3 is ordinary seems to be converging to the probability that a random Dieudonné module over F_3 is ordinary, which is

(1-1/3)(1-1/3^3)(1-1/3^5)….. = 0.639….
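If you want to check that number yourself, here is a minimal Python sketch that multiplies out the first few factors of the product above in exact rational arithmetic.  The function name and the choice of 30 factors are mine; the factors approach 1 so quickly that the truncation is invisible at this precision.

```python
from fractions import Fraction


def ordinary_probability(q: int, terms: int = 30) -> Fraction:
    """Partial product of (1 - q^-(2i-1)) for i = 1, ..., terms."""
    result = Fraction(1)
    for i in range(1, terms + 1):
        result *= 1 - Fraction(1, q ** (2 * i - 1))
    return result


heuristic = ordinary_probability(3)
print("random Dieudonne module heuristic over F_3:", float(heuristic))
print("gap to the observed hyperelliptic value 2/3:", float(Fraction(2, 3) - heuristic))
```

The gap between 2/3 and the heuristic value is small but clearly not zero, which is why the hyperelliptic data counts as a real disagreement.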

Why?  What makes hyperelliptic curves over F_3 more often ordinary than their plane curve counterparts?

(Note that the probability of ordinarity, which makes good sense for those who already know Dieudonné modules well, is just the probability that two random maximal isotropic subspaces of a symplectic space over F_q are disjoint.  So some of the computations here are in some sense the “symplectic case” of what Poonen and Rains computed in the orthogonal case.)
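Here is a hedged sketch of that isotropic-subspace computation.  It relies on two standard counts that I am supplying rather than taking from the paper: a 2g-dimensional symplectic space over F_q has (q+1)(q^2+1)...(q^g+1) maximal isotropic subspaces, and exactly q^{g(g+1)/2} of them meet a fixed one only in 0.  “Disjoint” is read as “intersecting only in 0,” and the function names are mine.

```python
from fractions import Fraction


def transverse_probability(q: int, g: int) -> Fraction:
    """Chance that a random maximal isotropic subspace of a 2g-dimensional
    symplectic space over F_q meets a fixed one only in 0.

    Assumes the standard counts: prod_{i=1..g} (q^i + 1) maximal isotropic
    subspaces in total, of which q^(g(g+1)/2) are transverse to a given one.
    """
    total = 1
    for i in range(1, g + 1):
        total *= q ** i + 1
    return Fraction(q ** (g * (g + 1) // 2), total)


def odd_exponent_product(q: int, terms: int) -> Fraction:
    """Partial product of (1 - q^-(2i-1)) for i = 1, ..., terms."""
    result = Fraction(1)
    for i in range(1, terms + 1):
        result *= 1 - Fraction(1, q ** (2 * i - 1))
    return result


# As g grows, the finite-g probability should approach the infinite product.
for g in (1, 2, 5, 20):
    print(g, float(transverse_probability(3, g)))
print("product:", float(odd_exponent_product(3, 40)))
```

The agreement in the limit is Euler's identity between the product of (1 + x^i) and the product of 1/(1 - x^{2i-1}), applied with x = 1/q.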

We compute lots more stuff (distribution of a-numbers, distribution of p-coranks, etc.) and decline to compute a lot more (distribution of Newton polygon, final type…)  Many interesting questions remain!

## Gonality, the Bogomolov property, and Habegger’s theorem on Q(E^tors)

I promised to say a little more about why I think the result of Habegger’s recent paper, “Small Height and Infinite Non-Abelian Extensions,” is so cool.

First of all:  we say an algebraic extension K of Q has the Bogomolov property if there is no infinite sequence of non-torsion elements x in K^* whose absolute logarithmic height tends to 0.  Equivalently, 0 is isolated in the set of absolute heights in K^*.  Finite extensions of Q evidently have the Bogomolov property (henceforth:  (B)) but for infinite extensions the question is much subtler.  Certainly $\bar{\mathbf{Q}}$ itself doesn’t have (B):  consider the sequence $2^{1/2}, 2^{1/3}, 2^{1/4}, \ldots$.  On the other hand, the maximal abelian extension of Q is known to have (B) (Amoroso-Dvornicich), as is any extension which is totally split at some fixed place p (Schinzel for the real prime, Bombieri-Zannier for the other primes.)
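To see the heights of $2^{1/n}$ shrinking, here is a small Python sketch.  It uses the standard fact (assumed here, not proved) that the absolute logarithmic height of an algebraic integer is the log of the Mahler measure of its minimal polynomial divided by the degree; for $2^{1/n}$ the minimal polynomial is x^n - 2, whose roots are written down explicitly.  The function name is mine.

```python
import cmath
import math


def height_of_nth_root_of_two(n: int) -> float:
    """Absolute logarithmic height of 2^(1/n), computed from the roots of
    x^n - 2 (monic and irreducible, so it is the minimal polynomial)."""
    radius = 2.0 ** (1.0 / n)
    roots = [radius * cmath.exp(2j * math.pi * k / n) for k in range(n)]
    # log Mahler measure: sum of log|root| over the roots outside the unit circle
    log_mahler = sum(math.log(max(1.0, abs(root))) for root in roots)
    return log_mahler / n


for n in (2, 3, 4, 10, 100):
    print(n, height_of_nth_root_of_two(n), math.log(2) / n)
```

Both columns agree: the height is log(2)/n, which is positive for every n but tends to 0.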

Habegger has proved that, when E is an elliptic curve over Q, the field Q(E^tors) obtained by adjoining all torsion points of E has the Bogomolov property.

What does this have to do with gonality, and with my paper with Chris Hall and Emmanuel Kowalski from last year?

Suppose we ask about the Bogomolov property for extensions of a more general field F?  Well, F had better admit a notion of absolute Weil height.  This is certainly OK when F is a global field, like the function field of a curve over a finite field k; but in fact it’s fine for the function field of a complex curve as well.  So let’s take that view; in fact, for simplicity, let’s take F to be C(t).

What does it mean for an algebraic extension F’ of F to have the Bogomolov property?  It means that there is a constant c such that, for every finite subextension L of F’/F and every non-constant function x in L^*, the absolute logarithmic height of x is at least c.

Now L is the function field of some complex algebraic curve C, a finite cover of P^1.  And a non-constant function x in L^* can be thought of as a nonzero principal divisor.  The logarithmic height, in this context, is just the number of zeroes of x — or, if you like, the number of poles of x — or, if you like, the degree of x, thought of as a morphism from C to the projective line.  (Not necessarily the projective line of which C is a cover — a new projective line!)  In the number field context, it was pretty easy to see that the log height of non-torsion elements of L^* was bounded away from 0.  That’s true here, too, even more easily — a non-constant map from C to P^1 has degree at least 1!

There’s one convenient difference between the geometric case and the number field case.  The lowest log height of a non-torsion element of L^* — that is, the least degree of a non-constant map from C to P^1 — already has a name.  It’s called the gonality of C.  For the Bogomolov property, the relevant number isn’t the log height, but the absolute log height, which is to say the gonality divided by [L:F].

So the Bogomolov property for F’ — what we might call the geometric Bogomolov property — says the following.  We think of F’ as a family of finite covers C / P^1.  Then

(GB)  There is a constant c such that the gonality of C is at least c deg(C/P^1), for every cover C in the family.

What kinds of families of covers are geometrically Bogomolov?  As in the number field case, you can certainly find some families that fail the test — for instance, gonality is bounded above in terms of genus, so any family of curves C with growing degree over P^1 but bounded genus will do the trick.
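One concrete family of that kind, chosen by me as the direct analogue of the sequence $2^{1/2}, 2^{1/3}, \ldots$ (it is not taken from any of the papers discussed here):  take F = C(t) and let C_n be the projective line with coordinate u, mapping to P^1 by

$u \mapsto t = u^n$.

Then C_n has genus 0 and degree n over P^1, and the function u is itself a degree-1 map from C_n to a new projective line, so

$\mathrm{gon}(C_n) / \deg(C_n/\mathbf{P}^1) = 1/n \rightarrow 0$

and no constant c can satisfy (GB).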

On the other hand, the family of modular curves over X(1) is geometrically Bogomolov; this was proved (independently) by Abramovich and Zograf.  This is a gigantic and elegant generalization of Ogg’s old theorem that only finitely many modular curves are hyperelliptic (i.e. only finitely many have gonality 2.)

At this point we have actually more or less proved the geometric version of Habegger’s theorem!  Here’s the idea.  Take F = C(t) and let E/F be an elliptic curve; then to prove that F(E(torsion)) has (GB), we need to give a lower bound for the gonality of the curve C_N obtained by adjoining an N-torsion point to F.  (I am slightly punting on the issue of being careful about other fields contained in F(E(torsion)), but I don’t think this matters.)  But C_N admits a dominant map to X_1(N); gonality goes down in dominant maps, so the Abramovich-Zograf bound on the gonality of X_1(N) provides a lower bound for the gonality of C_N, and it turns out that this gives exactly the bound required.

What Chris, Emmanuel and I proved is that (GB) is true in much greater generality — in fact (using recent results of Golsefidy and Varju that slightly postdate our paper) it holds for any extension of C(t) whose Galois group is a perfect Lie group with Z_p or Zhat coefficients and which is ramified at finitely many places; not just the extension obtained by adjoining torsion of an elliptic curve, for instance, but the one you get from the torsion of an abelian variety of arbitrary dimension, or for that matter any other motive with sufficiently interesting Mumford-Tate group.

Question:   Is Habegger’s theorem true in this generality?  For instance, if A/Q is an abelian variety, does Q(A(tors)) have the Bogomolov property?

Question:  Is there any invariant of a number field which plays the role in the arithmetic setting that “spectral gap of the Laplacian” plays for a complex algebraic curve?

A word about Habegger’s proof.  We know that number fields are a lot more like F_q(t) than they are like C(t).  And the analogue of the Abramovich-Zograf bound for modular curves over F_q is known as well, by a theorem of Poonen.  The argument is not at all like that of Abramovich and Zograf, which rests on analysis in the end.  Rather, Poonen observes that modular curves in characteristic p have lots of supersingular points, because the square of Frobenius acts as a scalar on the l-torsion in the supersingular case.  But having a lot of points gives you a lower bound on gonality!  A curve with a degree d map to P^1 has at most d(q+1) points, just because the preimage of each of the q+1 points of P^1(q) has size at most d.  (You just never get too old or too sophisticated to whip out the Pigeonhole Principle at an opportune moment….)
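The pigeonhole bound is easy to play with.  Below is a rough Python sketch, not anything from Poonen's paper:  it brute-force counts the F_p-points of a plane curve and converts the count N into the bound d >= N/(p+1) on the degree of any map to P^1 defined over F_p.  The function names and the choice of the Fermat quartic as a test curve are mine, the code assumes p is prime, and the bound is only meaningful for a smooth curve.

```python
from itertools import product


def projective_points(f, p):
    """Points [x:y:z] of P^2(F_p) where the homogeneous polynomial f vanishes.

    Each point is listed once, scaled so its first nonzero coordinate is 1
    (this normalisation is valid because p is assumed prime).
    """
    points = []
    for x, y, z in product(range(p), repeat=3):
        nonzero = [c for c in (x, y, z) if c != 0]
        if not nonzero or nonzero[0] != 1:
            continue  # skip (0,0,0) and non-normalised representatives
        if f(x, y, z) % p == 0:
            points.append((x, y, z))
    return points


def gonality_lower_bound(num_points, q):
    """Smallest integer d with d * (q + 1) >= num_points."""
    return -(-num_points // (q + 1))


def fermat_quartic(x, y, z):
    return x ** 4 + y ** 4 + z ** 4


p = 3
count = len(projective_points(fermat_quartic, p))
print(count, gonality_lower_bound(count, p))
```

Over a small prime field the bound is feeble, since a curve simply doesn't have many points there; the force of Poonen's argument comes from the supply of supersingular points, which makes the count large.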

Now I haven’t studied Habegger’s argument in detail yet, but look what you find right in the introduction:

The non-Archimedean estimate is done at places above an auxiliary prime number p where E has good supersingular reduction and where some other technical conditions are met…. In this case we will obtain an explicit height lower bound swiftly using the product formula, cf. Lemma 5.1. The crucial point is that supersingularity forces the square of the Frobenius to act as a scalar on the reduction of E modulo p.

Yup!  There’s no mention of Poonen in the paper, so I think Habegger came to this idea independently.  Very satisfying!  The hard case — for Habegger as for Poonen — has to do with the fields obtained by adjoining p-torsion, where p is the characteristic of the supersingular elliptic curve driving the argument.  It would be very interesting to hear from Poonen and/or Habegger whether the arguments are similar in that case too!

## There’s no 4-branched Belyi’s theorem — right?

Much discussion on Math Overflow has not resolved the following should-be-easy question:

Give an example of a curve in ${\mathcal{M}}_g$ defined over $\bar{Q}$ which is not a family of 4-branched covers of P^1.

Surely there is one!  But then again, you’d probably say “surely there’s a curve over $\bar{Q}$ which isn’t a 3-branched cover of P^1.”  But there isn’t — that’s Belyi’s theorem.

## Hain-Matsumoto, “Galois actions on fundamental groups of curves…”

I recently had occasion to spend some time with Richard Hain and Makoto Matsumoto’s 2005 paper “Galois actions on fundamental groups of curves and the cycle C – C^-,” which I’d always meant to delve into.  It’s really beautiful!  I cannot say I’ve really delved (maybe something more like scratched) but I wanted to share some very interesting things I learned.

Serre proved long ago that the image of the l-adic Galois representation on an elliptic curve E/Q is open in GL_2(Z_l), so long as E doesn’t have CM.  This is a geometric condition on E, which is to say it only depends on the basechange of E to an algebraic closure of Q, or even to C.

What’s the analogue for higher genus curves X?  You might start by asking about the image of the Galois representation rho: G_Q -> GL_{2g}(Z_l) attached to the Tate module of the Jacobian of X.  This image lands in GSp_{2g}(Z_l).  Just as with elliptic curves, any extra endomorphisms of Jac(X) may force the image to be much smaller than GSp_{2g}(Z_l).  But the question of whether the image of rho must be open in GSp_{2g}(Z_l) whenever no “obvious” geometric obstruction forbids it is difficult, and still not completely understood.  (I believe it’s still unknown when g is a multiple of 4…?)  One thing we do know in general, though, is that when X is the generic curve of genus g (that is, the universal curve over the function field Q(M_g) of M_g) the resulting representation

$\rho^{univ}: G_{Q(M_g)} \rightarrow GSp_{2g}(\mathbf{Z}_\ell)$

is surjective.

Hain and Matsumoto generalize in a different direction.  When X is a curve of genus greater than 1 over a field K, the Galois group of K acts on more than just the Tate modules (or l-adic H_1) of X; it acts on the whole pro-l geometric fundamental group of X, which we denote pi.  So we get a morphism

$\rho_{X/K}: G_K \rightarrow Aut(\pi)$

What does it mean to ask this representation to have “big image”?

## Do all curves over finite fields have covers with a sqrt(q) eigenvalue?

On my recent visit to Illinois, my colleague Nathan Dunfield (now blogging!) explained to me the following interesting open question, whose answer is supposed to be “yes”:

Q1: Let f be a pseudo-Anosov mapping class on a Riemann surface Sigma of genus at least 2, and M_f the mapping torus obtained by gluing the two ends of Sigma x interval together by means of f.  Then M_f is a hyperbolic 3-manifold with first Betti number 1.  Is there a finite cover M of M_f with b_1(M) > 1?

You might think of this as (a special case of) a sort of “relative virtual positive Betti number conjecture.”  The usual vpBnC says that a 3-manifold has a finite cover with positive Betti number; this says that when your manifold starts life with Betti number 1, you can get “extra” first homology by passing to a cover.

Of course, when I see “3-manifold fibered over the circle” I whip out a time-worn analogy and think “algebraic curve over a finite field.”  So here’s the number theorist’s version of the above question:

Q2: Let X/F_q be an algebraic curve of genus at least 2 over a finite field.  Does X have a finite etale cover Y/F_{q^d} such that the action of Frobenius on H^1(Y,Z_ell) has an eigenvalue equal to q^{d/2}?