## Shende and Tsimerman on equidistribution in Bun_2(P^1)

Very nice paper just posted by Vivek Shende and Jacob Tsimerman.  Take a sequence {C_i} of hyperelliptic curves of larger and larger genus.  Then for each i, you can look at the pushforward of a random line bundle drawn uniformly from Pic(C_i) / [pullbacks from P^1] to P^1, which is a rank-2 vector bundle.  This gives you a measure $\mu_i$ on Bun_2(P^1), the space of rank-2 vector bundles, and Shende and Tsimerman prove, just as you might hope, that this sequence of measures converges to the natural measure.

I think (but I didn’t think this through carefully) that this corresponds to saying that if you look at a sequence of quadratic imaginary fields with increasing discriminant, and for each field you write down all the ideal classes, thought of as unimodular lattices in R^2 up to homothety, then the corresponding sequence of (finitely supported) measures on the space of lattices converges to the natural one.

Equidistribution comes down to counting, and the method here is to express the relevant counting problem as a problem of counting points on a variety (in this case a Brill-Noether locus inside Pic(C_i)), which by Grothendieck-Lefschetz you can do if you can control the cohomology (with its Frobenius action.)  The high-degree part of the cohomology they can describe explicitly, and fortunately they are able to exert enough control over the low-degree Betti numbers to show that the contribution of this stuff is negligible.

In my experience, showing that the contribution of the low-degree stuff is small is often the bottleneck:  it “should be small,” but you don’t actually have a handle on it!  And indeed, for the second problem they discuss (where you have a sequence of hyperelliptic curves and a single line bundle on each one) it is exactly this point that stops them, for the moment, from having the theorem they want.

Error terms are annoying.  (At least when you can’t prove they’re smaller than the main term.)

## Elliptic curves with isomorphic cyclic 13-subgroups?

I liked this MathOverflow question, which asks:  are there two non-isogenous elliptic curves over Q, each one of which has a rational cyclic 13-isogeny, and such that the kernels of the two isogenies are isomorphic as Galois modules?

This is precisely to look for rational points on the modular surface S parametrizing tuples (E,E’,C,C’,φ), where E and E’ are elliptic curves, C and C’ are cyclic 13-subgroups of E and E’ respectively, and φ is an isomorphism between C and C’.

S is a quotient of X_1(13) x X_1(13) by the diagonal in the natural (Z/13Z)^* x (Z/13Z)^* action.

Is S general type, rational, what?

## Y. Zhao and the Roberts conjecture over function fields

Before the developments of the last few years the only thing that was known about the Cohen-Lenstra conjecture was what had already been known before the Cohen-Lenstra conjecture; namely, the Davenport-Heilbronn theorem, which says that the number of cubic fields of discriminant between -X and X can be expressed as

$\frac{1}{3\zeta(3)} X + o(X)$.

It isn’t hard to go back and forth between the count of cubic fields and the average size of the 3-torsion part of the class group of quadratic fields, which gives the connection with Cohen-Lenstra in its usual form.

Anyway, Datskovsky and Wright showed that the asymptotic above holds (with a suitable constant in place of 1/3) over any global field of characteristic at least 5.  That is:  for such a field K, you let N_K(X) be the number of cubic extensions of K whose discriminant has norm at most X; then

$N_K(X) = c_K \zeta_K(3)^{-1} X + o(X)$

for some explicit rational constant $c_K$.

One interesting feature of this theorem is that, if it weren’t a theorem, you might doubt it was true!  Because the agreement with data is pretty poor.  That’s because the convergence to the Davenport-Heilbronn limit is extremely slow; even if you let your discriminant range up to ten million or so, you still see substantially fewer cubic fields than you’re supposed to.

In 2000, David Roberts massively clarified the situation, formulating a conjectural refinement of the Davenport-Heilbronn theorem motivated by the Shintani zeta functions:

$N_{\mathbf{Q}}(X) = (1/3)\zeta(3)^{-1} X + c X^{5/6} + o(X^{5/6})$

with c an explicit (negative) constant.  The secondary term with an exponent very close to 1 explains the slow convergence to the Davenport-Heilbronn estimate.
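To see why an exponent of 5/6 hurts so much, it helps to look at how slowly X^{5/6} / X = X^{-1/6} decays.  The sketch below only tabulates that ratio; it deliberately leaves out the constants (1/3)ζ(3)^{-1} and c, so it shows the scale of the correction and not its actual size.  The function name is made up for the illustration.

```python
def relative_secondary_size(X):
    """Return X^(5/6) / X = X^(-1/6), the size of the secondary term
    relative to the main term, ignoring both constants."""
    return X ** (-1.0 / 6.0)


# The ratio shrinks by only a factor of 10 each time X grows by 10^6.
for exponent in (3, 5, 7, 9, 12):
    X = 10 ** exponent
    print(f"X = 10^{exponent}: X^(5/6) / X = {relative_secondary_size(X):.4f}")
```

So at discriminants around ten million the secondary term is still several percent of the main term before the constants are taken into account.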

The Datskovsky-Wright argument works over an arbitrary global field but, like most arguments that work over both number fields and function fields, it is not very geometric.  I asked my Ph.D. student Yongqiang Zhao, who’s finishing this year, to revisit the question of counting cubic extensions of a function field F_q(t) from a more geometric point of view to see if he could get results towards the Roberts conjecture.  And he did!  Which is what I want to tell you about.

But while Zhao was writing his thesis, there was a big development — the Roberts conjecture was proved.  Not only that — it was proved twice!  Once by Bhargava, Shankar, and Tsimerman, and once by Thorne and Taniguchi, independently, simultaneously, and using very different methods.  It is certainly plausible that these methods can give the Roberts conjecture over function fields, but at the moment, they don’t.

Neither does Zhao, yet — but he’s almost there, getting

$N_K(X) = \zeta_K(3)^{-1} X + O(X^{5/6 + \epsilon})$

for all rational function fields K = F_q(t) of characteristic at least 5.  And his approach illuminates the geometry of the situation in a very beautiful way, which I think sheds light on how things work in the number field case.

Geometrically speaking, to count cubic extensions of F_q(t) is to count trigonal curves over F_q.  And the moduli space of trigonal curves has a classical unirational parametrization, which I learned from Mike Roth many years ago:  given a trigonal curve Y, you push forward the structure sheaf along the degree-3 map to P^1, yielding a rank-3 vector bundle on P^1; you mod out by the natural copy of the structure sheaf; and you end up with a rank-2 vector bundle W on P^1, whose projectivization is a rational surface in which Y embeds.  This rational surface is a Hirzebruch surface F_k, where k is an integer determined by the isomorphism class of the vector bundle W.  (This story is the geometric version of the Delone-Faddeev parametrization of cubic rings by binary cubic forms.)

This point of view replaces a problem of counting isomorphism classes of curves (hard!) with a problem of counting divisors in surfaces (not easy, but easier.)  It’s not hard to figure out what linear system on F_k contains Y.  Counting divisors in a linear system is nothing but a dimension count, but you have to be careful — in this problem, you only want to count smooth members.  That’s a substantially more delicate problem.  Counting all the divisors is more or less the problem of counting all cubic rings; that problem, as the number theorists have long known, is much easier than the problem of counting just the maximal orders in cubic fields.

Already, the geometric meaning of the negative secondary term becomes quite clear; it turns out that when k is big enough (i.e. if the Hirzebruch surface is twisty enough) then the corresponding linear system has no smooth, or even irreducible, members!  So what “ought” to be a sum over all k is rudely truncated; and it turns out that the sum over larger k that “should have been there” is of order X^{5/6}.

So how do you count the smooth members of a linear system?  When the linear system is highly ample, this is precisely the subject of Poonen’s well-known “Bertini theorem over finite fields.”  But the trigonal linear systems aren’t like that; they’re only “semi-ample,” because their intersection with the fiber of projection F_k -> P^1 is fixed at 3.  Zhao shows that, just as in Poonen’s case, the probability that a member of such a system is smooth converges to a limit as the linear system gets more complicated; only this limit is computed, not as a product over points P of the probability D is smooth at P, but rather a product over fibers F of the probability that D is smooth along F.  (This same insight, arrived at independently, is central to the paper of Erman and Wood I mentioned last week.)
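To make the “product over points” concrete, here is a small sketch in the simplest highly ample setting, plane curves over F_q.  That setting is my choice of example; it is not the Hirzebruch surface case Zhao treats.  Assuming the local probability of smoothness at a closed point of degree e is 1 - q^{-3e} (three independent linear conditions over a residue field of size q^e), the product over all closed points of P^2 should equal 1/ζ_{P^2}(3) = (1 - q^{-1})(1 - q^{-2})(1 - q^{-3}).  The code only checks numerically that the truncated product approaches that closed form; all function names are hypothetical.

```python
def mobius(n):
    """Moebius function, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def closed_points_p2(q, e):
    """Number of closed points of degree e on P^2 over F_q, obtained by
    Moebius inversion from #P^2(F_{q^d}) = q^(2d) + q^d + 1."""
    total = 0
    for d in range(1, e + 1):
        if e % d == 0:
            total += mobius(e // d) * (q ** (2 * d) + q ** d + 1)
    return total // e


def truncated_smooth_probability(q, max_degree):
    """Product of the assumed local factors (1 - q^(-3e)) over all
    closed points of degree at most max_degree."""
    prob = 1.0
    for e in range(1, max_degree + 1):
        prob *= (1.0 - q ** (-3 * e)) ** closed_points_p2(q, e)
    return prob


def closed_form(q):
    """1 / zeta_{P^2}(3) written out explicitly."""
    return (1 - q ** -1) * (1 - q ** -2) * (1 - q ** -3)


for max_degree in (1, 2, 4, 8, 12):
    print(max_degree, truncated_smooth_probability(2, max_degree))
print("closed form:", closed_form(2))
```

In the semi-ample case the analogous product would run over fibers rather than points, with local factors that this sketch makes no attempt to compute.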

This alone is enough for Zhao to get a version of Davenport-Heilbronn over F_q(t) with error term O(X^{7/8}), better than anything that was known for number fields prior to last year.  How he gets even closer to Roberts is too involved to go into on the blog, but it’s the best part, and it’s where the algebraic geometry really starts; the main idea is a very careful analysis of what happens when you take a singular curve on a Hirzebruch surface and start carrying out elementary transforms at the singular points, making your curve more smooth but also changing which Hirzebruch surface it’s on!

To what extent is Zhao’s method analogous to the existing proofs of the Roberts conjecture over Q?  I’m not sure; though Zhao, together with the five authors of the two papers I mentioned, spent a week huddling at AIM thinking about this, and they can comment if they want.

I’ll just keep saying what I always say:  if a problem in arithmetic statistics over Q is interesting, there is almost certainly interesting algebraic geometry in the analogous problem over F_q(t), and the algebraic geometry is liable in turn to offer some insights into the original question.

## This Week’s Finds In Number Theory

Twenty years ago yesterday, John Baez posted the first installment of This Week’s Finds in Mathematical Physics.  In so doing, he invented the math blog, and, quite possibly, the blog itself.  A lot of mathematicians of my generation found in John’s blog an accessible, informal, but never dumbed-down window beyond what we were learning in classes, into the messy and contentious ground of current research.  And everybody who blogs now owes him a gigantic debt.

In his honor I thought it would be a good idea to post a “This Week’s Finds” style post of my own, with capsule summaries of a few papers I’ve recently noted with pleasure and interest.  I won’t be able to weave these into a story the way John often did, though!  Nor will there be awesome ASCII graphics.  Nor will any of the papers actually be from this week, because I’m a little behind on my math.NT abstract scanning.

If you run a math blog, please consider doing the same in your own field!  I’ll link to it.

Update:  It begins!  Valeria de Paiva offers This Week’s Finds In Categorical Logic.  And Matt Ward, a grad student at UW-Seattle, has This Week’s Finds in Arithmetic Geometry.

1)  “On sets defining few ordinary lines,” by Ben Green and Terry Tao.

The idea that has launched a thousand papers in additive combinatorics:  if you are a set approximately closed under some kind of relation, then you are approximately a set which is actually closed under that kind of relation.  Subset of a group mostly closed under multiplication?  You must be close to an honest subgroup.  Subset of Z with too many pair-sums agreeing?  You have an unusually large intersection with an authentic arithmetic progression.  And so on.

This new paper considers the case of sets in R^2 with few ordinary lines; that is, sets S such that most lines that intersect S at all intersect S in three or more points.  How can you cook up a set of points with this property?  There are various boring ways, like making all the points collinear.  But there’s only one interesting way I can think of:  have the points form an “arithmetic progression” …, -3P, -2P, -P, P, 2P, 3P, … in an elliptic curve!  (A finite subgroup also works.)  Then the usual description of the group law on the curve tells us that the line joining two points of S quite often passes through a third.  Green and Tao prove a remarkable quasi-converse to this fact:  if a set has few ordinary lines, it must be concentrated on a cubic algebraic curve!  This might be my favorite “approximately structured implies approximates a structure” theorem yet.

2) “Asymptotic behavior of rational curves,” by David Bourqui.  Oh, I was about to start writing this but when I searched I realized I already blogged about this paper when it came out!  I leave this here because the paper is just as interesting now as it was then…

3) “The fluctuations in the number of points of smooth plane curves over finite fields,” by Alina Bucur, Chantal David, Brooke Feigon, and Matilde Lalin;

“The probability that a complete intersection is smooth,” by Alina Bucur and Kiran Kedlaya;

“The distribution of the number of points on trigonal curves over F_q,” by Melanie Matchett Wood;

“Semiample Bertini theorems over finite fields,” by Daniel Erman and Melanie Matchett Wood.

How many rational points does a curve over F_q have?  We discussed this question here a few years ago, coming to no clear conclusion.  I still maintain that if the curve is understood to vary over M_g(F_q), with q fixed and g growing, the problem is ridiculously hard.

But in more manageable families of curves, we now know a lot more than we did in 2008.

You might guess, of course, that the average number of points should be q+1; if you have no reason to think of Frobenius as biased towards having positive or negative trace, why not guess that the trace, on average, is 0?  Bucur-David-Feigon-Lalin prove that this is exactly the case for a random smooth plane curve.  It’s not hard to check that this holds for a random hyperelliptic curve as well.  But for a random trigonal curve, Wood proves that the answer is different:  the average is slightly less than q+2!

Where did the extra point come from?

Here’s one way I like to think of it.  This is very vague, and proves nothing, of course.  The trigonal curve X has a degree-3 map to P^1, which is branched over some divisor D in P^1.  If D is a random divisor, it has one F_q-point on average.  How many F_q-points on X lie over each rational point P of D?  Well, generically, the ramification is going to be simple, and this means that there are two rational points over P:  the ramification point, and the unique unramified point.  Over every other F_q-point of P^1, the Frobenius action on the preimage in X should be a random element of S_3, with an average of one fixed point.  To sum up, in expectation we should see q rational points of X over q non-branch rational points of P^1, and 2 rational points of X over a single rational branch point in P^1, for a total of q+2.
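The bookkeeping in this heuristic is easy to check by brute force.  The sketch below computes the average number of fixed points of a uniformly random element of S_3 by enumeration, then adds things up under the heuristic’s own assumptions:  exactly one rational branch point, with two rational points above it, and the other q rational points of P^1 behaving like independent random elements of S_3.  The function names are made up, and of course this proves nothing either.

```python
from fractions import Fraction
from itertools import permutations


def average_fixed_points(n):
    """Average number of fixed points of a uniformly random element of S_n,
    computed by listing all n! permutations."""
    perms = list(permutations(range(n)))
    total = sum(
        sum(1 for i, image in enumerate(perm) if i == image) for perm in perms
    )
    return Fraction(total, len(perms))


def heuristic_point_count(q):
    """Expected number of F_q-points under the heuristic: 2 points over the
    single rational branch point, plus the S_3 average over each of the
    remaining q rational points of P^1."""
    return 2 + q * average_fixed_points(3)


print(average_fixed_points(3))
print(heuristic_point_count(5))
```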

(Erman and Wood, in a paper released just a few months ago, prove much more general results of a similar flavor about smooth members of linear systems on P^1 x P^1 (or other Hirzebruch surfaces, or other varieties entirely) which are semiample; for instance, they may have a map to P^1 which stays constant in degree, while their intersection with another divisor gets larger and larger.)

Most mysterious of all is the theorem of Bucur and Kedlaya, which shows (among other things) that if X is a random smooth intersection of two hypersurfaces of large degree in P^3, then the size of |X(F_q)| is slightly less than q+1 on average.  For this phenomenon I have no heuristic explanation at all.  What’s keeping the points away?

## Idle question: cluster algebras over finite fields and spectral gaps

Yet another great talk at the JMM:  Lauren Williams gave an introduction to cluster algebras in the Current Events section which was perfect for people, like me, who didn’t know the definition.  (The talks by Wei Ho, Sam Payne, and Mladen Bestvina were equally good, but I don’t have any idle questions about them!)

This post will be too long if I try to include the definitions myself, and I wouldn’t do as good a job of exposition as Williams did, so it’s good news that she’s arXived a survey paper which covers roughly the same ground as her talk.

Setup for idle question:  a cluster algebra comes from a process called “seed mutation.”  Given a rational function field K = k(x_1, … x_m), a labelled seed is a pair (Q,f) where Q is a quiver on m vertices and f = (f_1, … f_m) is a labelling of the vertices of Q with rational functions in K.  For each i, there’s a seed mutation mu_i which is an involution on the labelled seeds; see Williams’s paper for the definition.

Now start with a labelled seed (Q,(x_1, … x_m)) and let T be the set of labelled seeds obtainable from the starting seed by repeated application of seed mutations mu_1, … mu_n for some n < m.  (I didn’t think carefully about the meaning of this special subset of n vertices, which are called the mutable vertices.)

It’s called T because it’s a tree, regular of degree n; each vertex is indexed by a word in the n seed mutations with no letter appearing twice in succession.

Anyway, for each vertex of T and each mutable vertex i you have a rational function f_i.  The cluster algebra is the algebra generated by all these rational functions.

The great miracle — rather, one of the great miracles — is that, by a theorem of Fomin and Zelevinsky, the f_i are all Laurent; that is, their denominators are just monomials in the original functions x_i.

We are now ready for the idle question!

Let’s take k to be a finite field F_q, and let U be (F_q^*)^m, the rational points of the m-torus over F_q.  Choose a point u = (u_1, … u_m) in (F_q^*)^m.

Then for any vertex of T, we can (thanks to the Laurent condition!) evaluate the functions (f_1, …. f_m) at u, getting an element of F_q^m.
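For a feel of what evaluation looks like, here is a sketch in the smallest interesting example I know, which is my choice and not part of the setup above:  the recurrence x_{k+1} = (x_k + 1) / x_{k-1}, which produces the cluster variables in type A_2.  It has period 5, and the five functions it produces are Laurent in x_1 and x_2, namely x_1, x_2, (x_2+1)/x_1, (x_1+x_2+1)/(x_1 x_2), (x_1+1)/x_2.  The code checks the periodicity with exact rational arithmetic and then evaluates the same Laurent expressions at a point of (F_q^*)^2.  Function names are hypothetical, and the modular inverse via `pow(u, -1, q)` assumes Python 3.8 or later.

```python
from fractions import Fraction


def a2_sequence(x1, x2, steps):
    """Iterate x_{k+1} = (x_k + 1) / x_{k-1} with exact rationals."""
    seq = [Fraction(x1), Fraction(x2)]
    for _ in range(steps):
        seq.append((seq[-1] + 1) / seq[-2])
    return seq


def a2_cluster_variables_mod_q(u1, u2, q):
    """Evaluate the five Laurent expressions at (u1, u2) in (F_q^*)^2.
    Only u1 and u2 are ever inverted, which is the Laurent condition."""
    inv1 = pow(u1, -1, q)
    inv2 = pow(u2, -1, q)
    return [
        u1 % q,
        u2 % q,
        (u2 + 1) * inv1 % q,
        (u1 + u2 + 1) * inv1 * inv2 % q,
        (u1 + 1) * inv2 % q,
    ]


seq = a2_sequence(3, 5, 5)
print(seq)  # the last two entries repeat the first two
print(a2_cluster_variables_mod_q(3, 5, 7))
```

Note that the evaluated values live in F_q and not necessarily in F_q^*, so you can’t blindly keep iterating the recurrence mod q; you really do need the Laurent expressions.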

So a random walk on the tree T, combined with evaluation at u, gives you a random walk on F_q^m.

Idle question:  Is there a spectral gap for this family of walks, independent of q?

Update:  As David Speyer explains in the comments, this finite random walk is not in general well-defined.  Let me try another phrasing which I think makes sense.

Let t be the endpoint of a length-R random walk on T; then evaluation at (1,1,..1) gives a probability distribution P_{R,N} on (Z/NZ)^m.  Let U_N be the uniform distribution on (Z/NZ)^m.  Now for each N we can ask about the limit

$\Lambda_N = \lim_{R \rightarrow \infty} ||P_{R,N} - U_{N}||^{1/R}$

(I don’t think it matters what norm we use.)

The idea is that the random walk on the tree should be equidistributing mod N, and the speed of convergence is governed by Λ_N.  Then we can ask

Idle question mark 2:  Is Λ_N bounded away from 1 by a constant independent of N?

This is a question in “spectral gap” style, which, if I did it right, doesn’t a priori have to do with a sequence of finite graphs.

Motivation:  this setup reminds me of a very well-known story in arithmetic groups; you have a discrete group Gamma which comes equipped with an action on a set of objects “over Z”; then reducing mod p for all p gives you a family of actions of Gamma on larger and larger finite sets, and a critical organizing question is:  do the corresponding graphs have a spectral gap?

For that matter, what happens if you, say, keep k = C and then evaluate your f_i at (1,1,… 1)?  Looking at a bigger and bigger ball in the tree you get bigger and bigger sets of elements of C^m; what do these look like?  Do they shoot off to infinity, accumulate, equidistribute…..?

## Homological Stability for Hurwitz spaces and the Cohen-Lenstra conjecture over function fields, II

Akshay Venkatesh, Craig Westerland, and I recently posted a new paper, “Homological Stability for Hurwitz spaces and the Cohen-Lenstra conjecture over function fields, II.” The paper is a sequel to our 2009 paper of the same title, except for the “II.”  It’s something we’ve been working on for a long time, and after giving a lot of talks about this material it’s very pleasant to be able to show it to people at last!

The main theorem of the new paper is that a version of the Cohen-Lenstra conjecture over F_q(t) is true.  (See my blog entry about the earlier paper for a short description of Cohen-Lenstra.)

For instance, one can ask: what is the average size of the 5-torsion subgroup of the Jacobian of a hyperelliptic curve over F_q? That is, what is the value of

$\lim_{n \rightarrow \infty} \frac{\sum_C |J(C)[5](\mathbf{F}_q)|}{\sum_C 1}$

where C ranges over hyperelliptic curves of the form y^2 = f(x), f squarefree of degree n?

We show that, for q large enough and not congruent to 1 mod 5, this limit exists and is equal to 2, exactly as Cohen and Lenstra predict. Our previous paper proved that the lim sup and lim inf existed, but didn’t pin down what they were.

In fact, the Cohen-Lenstra conjectures predict more than just the average size of the group $J(C)[5](\mathbf{F}_q)$ as n gets large; they propose that the isomorphism class of the group settles into a limiting distribution, and they say what this distribution is supposed to be! Another way to say this is that the Cohen-Lenstra conjecture predicts that, for each abelian p-group A, the average number of surjections from $J(C)(\mathbf{F}_q)$ to A approaches 1. These are, in a sense, the “moments” of the Cohen-Lenstra distribution on isomorphism classes of finite abelian p-groups.
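The link between the two formulations is a counting identity:  for a finite abelian group G and a prime p, the number of homomorphisms from G to Z/pZ equals |G[p]|, and every nonzero homomorphism to Z/pZ is surjective.  So |G[p]| = 1 + #Sur(G, Z/pZ), and an average of 1 surjection onto Z/5Z is the same thing as an average 5-torsion size of 2.  Here is a brute-force check of that identity on a few small groups; the groups and function names are arbitrary choices for the illustration.

```python
from itertools import product


def count_homs_and_surjections(orders, p):
    """For G = Z/orders[0] x Z/orders[1] x ... and a prime p, count
    homomorphisms and surjections G -> Z/p.  A homomorphism is determined
    by the images y_i of the standard generators, subject to
    orders[i] * y_i = 0 in Z/p."""
    homs = 0
    surjections = 0
    for images in product(range(p), repeat=len(orders)):
        if all((n * y) % p == 0 for n, y in zip(orders, images)):
            homs += 1
            # p is prime, so any nonzero homomorphism hits all of Z/p
            if any(y != 0 for y in images):
                surjections += 1
    return homs, surjections


def p_torsion_size(orders, p):
    """|G[p]|, counted directly by testing every element of G."""
    ranges = [range(n) for n in orders]
    return sum(
        1
        for g in product(*ranges)
        if all((p * x) % n == 0 for x, n in zip(g, orders))
    )


for orders in [(5,), (25,), (5, 25), (5, 25, 3), (4, 9)]:
    homs, surj = count_homs_and_surjections(orders, 5)
    print(orders, homs, surj, p_torsion_size(orders, 5))
```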

We prove that this, too, is the case for sufficiently large q not congruent to 1 mod p — but, it must be conceded, the value of “sufficiently large” depends on A. So there is still no q for which all the moments are known to agree with the Cohen-Lenstra predictions. That’s why I call what we prove a “version” of the Cohen-Lenstra conjectures. If you think of the Cohen-Lenstra conjecture as being about moments, we’re almost there — but if you think of it as being about probability distributions, we haven’t started!

Naturally, we prefer the former point of view.

This paper ended up being a little long, so I think I’ll make several blog posts about what’s in there, maybe not all in a row.

## Mochizuki on ABC

[Update:  Lots of traffic coming in from Hacker News, much of it presumably from outside the usual pro number theory crowd that reads this blog.  If you're not already familiar with the ABC conjecture, I recommend Barry Mazur's beautiful expository paper "Questions about Number."]

[Re-update:  Minhyong Kim's discussion on Math Overflow is the most well-informed public discussion of Mochizuki's strategy.  (Of course, it is still very sketchy indeed, as Minhyong hastens to emphasize.)   Both Kim's writeup and discussions I've had with others suggest that the best place to start may be Mochizuki's 2000 paper "A Survey of the Hodge-Arakelov Theory of Elliptic Curves I."]

Shin Mochizuki has released his long-rumored proof of the ABC conjecture, in a paper called “Inter-universal Teichmuller theory IV:  log-volume computations and set-theoretic foundations.”

I just saw this an hour ago and so I have very little to say, beyond what I wrote on Google+ when rumors of this started circulating earlier this summer:

I hope it’s true:  my sense is that there’s a lot of very beautiful, very hard math going on in Shin’s work which almost no one in the community has really engaged with, and the resolution of a major conjecture would obviously create such engagement very quickly.

Well, now the time has come.  I have not even begun to understand Shin’s approach to the conjecture.  But it’s clear that it involves ideas which are completely outside the mainstream of the subject.  Looking at it, you feel a bit like you might be reading a paper from the future, or from outer space.

Let me highlight one point which is clearly important, which I draw from pp.3–6 of the linked paper.

WARNING LABEL:  Of course my attempt to paraphrase is based on the barest of acquaintance with a very small section of the work and is placed here just to get people to look at Mochizuki’s paper — I may have it all wrong!

Mochizuki argues that it is too limiting to think about “the category of schemes over Spec Z,” as we are accustomed to do.  He makes the inarguable point that when X is a kind of thing, it can happen that the category of Xes, qua category, may not tell us very much about what Xes are like — for instance, if there is only one X and it has only one automorphism. Mochizuki argues that the category of schemes over a base is — if not quite this uninformative — insufficiently rich to handle certain problems in Diophantine geometry.  He wants us instead to think about what he calls the “species” of schemes over Spec Z, where a scheme in this sense is not an abstract object in a category, but something cut out by a formula.  In some sense this view is more classical than the conventional one, in which we tend to feel good about ourselves if we can “remove coordinates” and think about objects and arrows without implicitly applying a forgetful functor and referring to the object as a space with a Zariski topology or — ptui! – a set of points.

But Mochizuki’s point of view is not actually classical at all, because the point he wants to make is that formulas can be interpreted in any model of set theory, and each interpretation gives you a different category.  What is “inter-universal” about inter-universal Teichmuller theory is that it is important to keep track of all these categories, or at least many different ones.  What he is doing, he says, is simply outside the theory of schemes over Spec Z, even though it has consequences within that theory, just as (this part is my gloss) the theory of schemes itself is outside the classical theory of varieties, but provides us information about varieties that the classical theory could not have generated internally.

It’s tremendously exciting.  I very much look forward to commentary from people with a deeper knowledge than mine of Mochizuki’s past and present work.

• Algebraists eat corn row by row, analysts eat corn circle by circle.  Yep, I eat down the rows like a typewriter.  Why?  Because it is the right way.
• This short paper by Johan de Jong and Wei Ho addresses an interesting question I’d never thought about; does a Brauer-Severi variety over a field K contain a genus-1 curve defined over K?  They show the answer is yes in dimensions up to 4 (3 and 4 being the new cases.)  In dimension 1, this just asks about covers of Brauer-Severi curves by genus 1 curves; I remember this kind of situation coming up in Ekin Ozman’s thesis, where certain twists of modular curves end up being covers of Brauer-Severi curves.  Which Brauer-Severi varieties are split by twisted modular curves?
• Always nice to see a coherent description of the p-adic numbers in the popular press; and George Musser delivers the goods in Scientific American, in the context of recent work in cosmology by Harlow, Shenker, Stanford, and Susskind.  Two quibbles:  first, if I understood Susskind’s talk on this stuff correctly, the point is to model things by an infinite regular tree.  The fact that when the out-degree is a prime power this happens to look like the Bruhat-Tits tree is in some sense tangential, though very useful for getting an intuitive picture of what’s going on — as long as your intuition is already p-adic, of course!  Second quibble is that Musser seems to suggest at the end that p-adic distances can’t get arbitrarily small:

On top of that, distance is always finite. There are no p-adic infinitesimals, or infinitely small distances, such as the dx and dy you see in high-school calculus. In the argot, p-adics are “non-Archimedean.” Mathematicians had to cook up a whole new type of calculus for them.

Prior to the multiverse study, non-Archimedeanness was the main reason physicists had taken the trouble to decipher those mathematics textbooks. Theorists think that the natural world, too, has no infinitely small distances; there is some minimal possible distance, the Planck scale, below which gravity is so intense that it renders the entire notion of space meaningless. Grappling with this granularity has always vexed theorists. Real numbers can be subdivided all the way down to geometric points of zero size, so they are ill-suited to describing a granular space; attempting to use them for this purpose tends to spoil the symmetries on which modern physics is based.

## Gonality, the Bogomolov property, and Habegger’s theorem on Q(E^tors)

I promised to say a little more about why I think the result of Habegger’s recent paper, “Small Height and Infinite Non-Abelian Extensions,” is so cool.

First of all:  we say an algebraic extension K of Q has the Bogomolov property if there is no infinite sequence of non-torsion elements x in K^* whose absolute logarithmic height tends to 0.  Equivalently, 0 is isolated in the set of absolute heights in K^*.  Finite extensions of Q evidently have the Bogomolov property (henceforth:  (B)) but for infinite extensions the question is much subtler.  Certainly $\bar{\mathbf{Q}}$ itself doesn’t have (B):  consider the sequence $2^{1/2}, 2^{1/3}, 2^{1/4}, \ldots$.  On the other hand, the maximal abelian extension of Q is known to have (B) (Amoroso-Dvornicich), as is any extension which is totally split at some fixed place p (Schinzel for the real prime, Bombieri-Zannier for the other primes.)
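If you want to see the heights of 2^{1/n} actually go to 0, here is a quick numerical sketch.  It uses the standard description of the absolute logarithmic height of an algebraic integer as log(Mahler measure of the minimal polynomial) divided by the degree, and the fact that x^n - 2 is irreducible by Eisenstein; the function name is mine.  The answer should come out to (log 2)/n.

```python
import cmath
import math


def height_of_nth_root_of_two(n):
    """Absolute logarithmic height of 2^(1/n), computed from the roots of
    its minimal polynomial x^n - 2 (monic, so no leading-coefficient term):
    (1/n) * sum over roots z of log max(1, |z|)."""
    radius = 2 ** (1.0 / n)
    roots = [radius * cmath.exp(2j * math.pi * k / n) for k in range(n)]
    log_mahler_measure = sum(math.log(max(1.0, abs(z))) for z in roots)
    return log_mahler_measure / n


for n in (2, 3, 4, 10, 100):
    print(n, height_of_nth_root_of_two(n), math.log(2) / n)
```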

Habegger has proved that, when E is an elliptic curve over Q, the field Q(E^tors) obtained by adjoining all torsion points of E has the Bogomolov property.

What does this have to do with gonality, and with my paper with Chris Hall and Emmanuel Kowalski from last year?

Suppose we ask about the Bogomolov property for extensions of a more general field F?  Well, F had better admit a notion of absolute Weil height.  This is certainly OK when F is a global field, like the function field of a curve over a finite field k; but in fact it’s fine for the function field of a complex curve as well.  So let’s take that view; in fact, for simplicity, let’s take F to be C(t).

What does it mean for an algebraic extension F’ of F to have the Bogomolov property?  It means that there is a constant c such that, for every finite subextension L of F’/F and every non-constant function x in L^*, the absolute logarithmic height of x is at least c.

Now L is the function field of some complex algebraic curve C, a finite cover of P^1.  And a non-constant function x in L^* can be thought of as a nonzero principal divisor.  The logarithmic height, in this context, is just the number of zeroes of x — or, if you like, the number of poles of x — or, if you like, the degree of x, thought of as a morphism from C to the projective line.  (Not necessarily the projective line of which C is a cover — a new projective line!)  In the number field context, it was pretty easy to see that the log height of non-torsion elements of L^* was bounded away from 0.  That’s true here, too, even more easily — a non-constant map from C to P^1 has degree at least 1!

There’s one convenient difference between the geometric case and the number field case.  The lowest log height of a non-torsion element of L^* — that is, the least degree of a non-constant map from C to P^1 — already has a name.  It’s called the gonality of C.  For the Bogomolov property, the relevant number isn’t the log height, but the absolute log height, which is to say the gonality divided by [L:F].

So the Bogomolov property for F’ — what we might call the geometric Bogomolov property — says the following.  We think of F’ as a family of finite covers C / P^1.  Then

(GB)  There is a constant c such that the gonality of C is at least c deg(C/P^1), for every cover C in the family.

What kinds of families of covers are geometrically Bogomolov?  As in the number field case, you can certainly find some families that fail the test — for instance, gonality is bounded above in terms of genus, so any family of curves C with growing degree over P^1 but bounded genus will do the trick.

On the other hand, the family of modular curves over X(1) is geometrically Bogomolov; this was proved (independently) by Abramovich and Zograf.  This is a gigantic and elegant generalization of Ogg’s old theorem that only finitely many modular curves are hyperelliptic (i.e. only finitely many have gonality 2.)

At this point we have actually more or less proved the geometric version of Habegger’s theorem!  Here’s the idea.  Take F = C(t) and let E/F be an elliptic curve; then to prove that F(E(torsion)) has (GB), we need to give a lower bound for the gonality of the curve C_N obtained by adjoining an N-torsion point to F.  (I am slightly punting on the issue of being careful about other fields contained in F(E(torsion)), but I don’t think this matters.)  But C_N admits a dominant map to X_1(N); gonality can only go down under dominant maps, so the Abramovich-Zograf bound on the gonality of X_1(N) provides a lower bound for the gonality of C_N, and it turns out that this gives exactly the bound required.

What Chris, Emmanuel and I proved is that (GB) is true in much greater generality — in fact (using recent results of Golsefidy and Varju that slightly postdate our paper) it holds for any extension of C(t) whose Galois group is a perfect Lie group with Z_p or Zhat coefficients and which is ramified at finitely many places; not just the extension obtained by adjoining torsion of an elliptic curve, for instance, but the one you get from the torsion of an abelian variety of arbitrary dimension, or for that matter any other motive with sufficiently interesting Mumford-Tate group.

Question:   Is Habegger’s theorem true in this generality?  For instance, if A/Q is an abelian variety, does Q(A(tors)) have the Bogomolov property?

Question:  Is there any invariant of a number field which plays the role in the arithmetic setting that “spectral gap of the Laplacian” plays for a complex algebraic curve?

A word about Habegger’s proof.  We know that number fields are a lot more like F_q(t) than they are like C(t).  And the analogue of the Abramovich-Zograf bound for modular curves over F_q is known as well, by a theorem of Poonen.  The argument is not at all like that of Abramovich and Zograf, which rests on analysis in the end.  Rather, Poonen observes that modular curves in characteristic p have lots of supersingular points, because the square of Frobenius acts as a scalar on the l-torsion in the supersingular case.  But having a lot of points gives you a lower bound on gonality!  A curve over F_q with a degree d map to P^1 has at most d(q+1) points over F_q, just because the preimage of each of the q+1 points of P^1(F_q) has size at most d.  (You just never get too old or too sophisticated to whip out the Pigeonhole Principle at an opportune moment….)
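Spelled out, the counting argument is a single inequality.  This is only a sketch of the pigeonhole step: the symbol \gamma(C) for the gonality of C over F_q is my own notation, and d is the degree of any map C -> P^1 defined over F_q.

```latex
\#C(\mathbf{F}_q) \;\le\; d \cdot \#\mathbf{P}^1(\mathbf{F}_q) \;=\; d\,(q+1)
\quad\Longrightarrow\quad
\gamma(C) \;\ge\; \frac{\#C(\mathbf{F}_q)}{q+1}.
```

So a family of curves whose point counts over a fixed F_q grow linearly with the degree over P^1 automatically has gonality growing linearly as well.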

Now I haven’t studied Habegger’s argument in detail yet, but look what you find right in the introduction:

The non-Archimedean estimate is done at places above an auxiliary prime number p where E has good supersingular reduction and where some other technical conditions are met…. In this case we will obtain an explicit height lower bound swiftly using the product formula, cf. Lemma 5.1. The crucial point is that supersingularity forces the square of the Frobenius to act as a scalar on the reduction of E modulo p.

Yup!  There’s no mention of Poonen in the paper, so I think Habegger came to this idea independently.  Very satisfying!  The hard case — for Habegger as for Poonen — has to do with the fields obtained by adjoining p-torsion, where p is the characteristic of the supersingular elliptic curve driving the argument.  It would be very interesting to hear from Poonen and/or Habegger whether the arguments are similar in that case too!

## Poonen-Rains, Selmer groups, random maximal isotropics, random orthogonal matrices

At the AIM workshop on Cohen-Lenstra heuristics last week I got to hear Bjorn Poonen give a terrific talk about his recent work with Eric Rains about the distribution of mod p Selmer groups in a quadratic twist family of elliptic curves.

Executive summary:  if E is an elliptic curve, say in Weierstrass form y^2 = f(x), and d is a squarefree integer, then we can study the mod p Selmer group Sel_d(E) of the quadratic twist dy^2 = f(x), which sits inside the Galois cohomology H^1(G_Q, E_d[p]).  This is a finite-dimensional vector space over F_p.  And by analogy with the Cohen-Lenstra heuristics for class groups, we can ask whether these groups obey a probability distribution as d varies — that is, does

Pr(dim Sel_d(E) = r | d in [-B, ... B])

approach a limit P_r as B goes to infinity, and if so, what is it?

The Poonen-Rains heuristic is based on the following charming observation.  The product of the local cohomology groups H^1(G_v, E[p]) is an infinite-dimensional F_p-vector space endowed with a bilinear form coming from cup product.  In here you have two subspaces:  the image of global cohomology, and the image of local Mordell-Weil.  Each one of these, it turns out, is maximal isotropic, and their intersection is exactly the Selmer group.  So the Selmer group can be seen as the intersection of two maximal isotropic subspaces in a very large quadratic space.

Heuristically, one might think of these two subspaces as being randomly selected among maximal isotropic subspaces.  This suggests a question:  if P_{r,N} is the probability that the intersection of two random maximal isotropics in F_p^{2N} has dimension r, does P_{r,N} approach a limit as N goes to infinity?  It does — and the Poonen-Rains heuristic then asks that the probability that dim Sel_d(E)  = r approaches the same limit.  This conjecture agrees with theorems of Heath-Brown, Swinnerton-Dyer, and Kane in the case p=2, and with results of Bhargava and Shankar when p <= 5 (Bhargava and Shankar work with a family of elliptic curves of bounded height, not a quadratic twist family, but it is not crazy to expect the behavior of Selmer to be the same.)  And in combination with Delaunay’s heuristics for variation of Sha, it recovers Goldfeld’s conjecture that elliptic curves are half rank 0 and half rank 1.
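To get a feel for the numbers, here is a minimal sketch that tabulates the limiting distribution.  It assumes the closed form stated by Poonen and Rains, namely P_r = prod_{j>=0} (1+p^{-j})^{-1} * prod_{j=1}^{r} p/(p^j - 1); the function name and the truncation parameters are my own choices for illustration.

```python
def selmer_rank_distribution(p, max_rank=30, terms=200):
    """Approximate the limiting probabilities P_0, ..., P_max_rank.

    Assumes the closed form
        P_r = prod_{j>=0} 1/(1 + p^-j) * prod_{j=1..r} p/(p^j - 1),
    with the infinite product truncated after `terms` factors.
    """
    p0 = 1.0
    for j in range(terms):
        p0 /= 1.0 + p ** (-j)
    probs = [p0]
    for r in range(1, max_rank + 1):
        # P_r / P_{r-1} = p / (p^r - 1)
        probs.append(probs[-1] * p / (p ** r - 1))
    return probs


def average_selmer_size(p):
    """Expected value of p^r under the distribution above."""
    return sum(prob * p ** r
               for r, prob in enumerate(selmer_rank_distribution(p)))
```

Under this formula the probabilities sum to 1 and the average size of the Selmer group comes out to p+1, which is a useful sanity check on the closed form.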

Johan de Jong wrote about a similar question, concentrating on the function field case, in his paper “Counting elliptic surfaces over finite fields.”  (This is the first place I know where the conjecture “Sel_p should have size 1+p on average” is formulated.)  He, too, models the Selmer group by a “random linear algebra” construction.  Let g be a random orthogonal matrix over F_p; then de Jong’s model for the Selmer group is coker(g-1).  This is a natural guess in the function field case:  if E is an elliptic curve over a curve C / F_q, then the Selmer group of E is a subquotient of the etale H^2 of an elliptic surface S over F_q; thus it is closely related to the coinvariants of Frobenius acting on the H^2 of S/F_qbar.  This H^2 carries a symmetric intersection pairing, so Frobenius (after scaling by q) is an orthogonal matrix, which we want to think of as “random.”  (As first observed by Friedman and Washington, the Cohen-Lenstra heuristics can be obtained in similar fashion, but the relevant cohomology is H^1 of a curve instead of H^2 of a surface; so the relevant pairing is alternating and the relevant statistics are those of symplectic rather than orthogonal matrices.)

But this presents a question:  why do these apparently different linear algebra constructions yield the same prediction for the distribution of Selmer ranks?

Here’s one answer, though I suspect there’s a slicker one.

A nice way to describe the distributions that arise in problems of Cohen-Lenstra type is by computing their moments.  But the usual moments (e.g. “expected kth power of dimension of Selmer” or “kth power of order of Selmer”) tend not to behave so well.  Better is to compute “expected number of injections from F_p^k into Selmer,” which has a cleaner answer in every case I know.  If the size of the Selmer group is X, this number is

(X-1)(X-p)…(X-p^{k-1}).

Evidently, if you know these “moments” for all k, you can compute the usual moments E(X^k) (which are indeed computed explicitly in Poonen-Rains) and vice versa.
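As a sanity check on the formula for the number of injections, here is a small brute-force sketch; the function names are hypothetical and it is only meant for tiny p, k, n.  It counts injective linear maps from F_p^k to F_p^n directly and compares with the product (X-1)(X-p)…(X-p^{k-1}), where X = p^n.

```python
from itertools import product


def count_injections(p, k, n):
    """Brute-force count of injective linear maps F_p^k -> F_p^n (k >= 1).

    A linear map is given by the images of the k basis vectors; it is
    injective exactly when no nonzero coefficient vector is sent to 0.
    """
    vectors = list(product(range(p), repeat=n))
    nonzero_coeffs = [c for c in product(range(p), repeat=k) if any(c)]
    count = 0
    for images in product(vectors, repeat=k):
        injective = all(
            any(sum(c[i] * images[i][j] for i in range(k)) % p
                for j in range(n))
            for c in nonzero_coeffs
        )
        if injective:
            count += 1
    return count


def injection_moment(p, k, X):
    """The product (X - 1)(X - p)...(X - p^(k-1))."""
    result = 1
    for i in range(k):
        result *= X - p ** i
    return result
```

For instance, with p = 2, k = 2, n = 3 both sides equal 7 * 6 = 42, and when k > n the product vanishes, as it should.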

Now:  let A be the random variable (valued in abelian groups!)  “intersection of two random maximal isotropics in a 2N-dimensional quadratic space V” and B be “coker(g-1) where g is a random orthogonal N x N matrix.”

The expected number of injections from F_p^k to B is just the expected number of injections from F_p^k to F_p^N which are fixed by g, since ker(g-1) and coker(g-1) have the same dimension.  By Burnside’s lemma, this is the number of orbits of the orthogonal group on Inj(F_p^k, F_p^N).  But by Witt’s Theorem, the orbit of an injection f: F_p^k -> F_p^N is precisely determined by the restriction of the orthogonal form to F_p^k; the number of symmetric bilinear forms on F_p^k is p^((1/2)k(k+1)) and so this is the expected value to be computed.
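The Burnside step can be checked by hand in the smallest interesting case.  The sketch below makes several assumptions of my own: p = 3, N = 3, k = 1, and the standard dot product as the orthogonal form.  It enumerates the orthogonal group by brute force and averages the number of fixed nonzero vectors, which should come out to p^{(1/2)k(k+1)} = 3.

```python
from itertools import product


def orthogonal_group(p, n):
    """All n x n matrices g over F_p with g^T g = I (standard dot product)."""
    group = []
    for entries in product(range(p), repeat=n * n):
        g = [entries[i * n:(i + 1) * n] for i in range(n)]
        # The columns of g must be orthonormal for the standard form.
        orthonormal = all(
            sum(g[r][i] * g[r][j] for r in range(n)) % p == (1 if i == j else 0)
            for i in range(n) for j in range(n)
        )
        if orthonormal:
            group.append(g)
    return group


def fixed_nonzero_vectors(g, p):
    """Number of nonzero v in F_p^n with g v = v (injections F_p -> F_p^n fixed by g)."""
    n = len(g)
    count = 0
    for v in product(range(p), repeat=n):
        if any(v) and all(
            sum(g[i][j] * v[j] for j in range(n)) % p == v[i]
            for i in range(n)
        ):
            count += 1
    return count


def fixed_point_totals(p, n):
    """Return (sum over g of fixed nonzero vectors, order of the group)."""
    group = orthogonal_group(p, n)
    total = sum(fixed_nonzero_vectors(g, p) for g in group)
    return total, len(group)
```

In this case the three orbits on nonzero vectors are exactly the three possible values of the form, which is what Witt's Theorem predicts; for general k the count p^{(1/2)k(k+1)} is only reached once N is large enough that every form on F_p^k actually occurs.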

What about the expected number of injections from F_p^k to A?  We can compute this as follows.  There are about p^{2Nk} injections from F_p^k to V.  Of these, about p^{2Nk - (1/2)k(k+1)} have isotropic image.  Call the image W;  we need to know how often W lies in the intersection of the two maximal isotropics V_1 and V_2.  The probability that W lies in V_1 is easily seen to be about p^{-Nk + (1/2)k(k+1)}, and the probability that W lies in V_2 is the same; these two events are independent, so the probability that W lies in A is about p^{-2Nk + k(k+1)}.  Summing over all isotropic injections gives an expected number of p^{(1/2)k(k+1)} injections from F_p^k to A.  Same answer!

(Note:  in the above paragraph, “about” always means “this is the limit as N gets large with k fixed.”)
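For bookkeeping, here are the exponents in the computation for A, written out with dim V = 2N and dim V_1 = dim V_2 = N.  The justification of the third line (V_1 contains about p^{Nk} injections, all of them isotropic, and by symmetry every isotropic W is equally likely to land in V_1) is my reading of “easily seen,” so treat this as a sketch.

```latex
\begin{align*}
\#\{\text{injections } \mathbf{F}_p^k \to V\} &\approx p^{2Nk}, \\
\#\{\text{isotropic injections}\} &\approx p^{2Nk - \frac{1}{2}k(k+1)}, \\
\Pr(W \subset V_1) = \Pr(W \subset V_2) &\approx \frac{p^{Nk}}{p^{2Nk - \frac{1}{2}k(k+1)}} = p^{-Nk + \frac{1}{2}k(k+1)}, \\
\Pr(W \subset V_1 \cap V_2) &\approx p^{-2Nk + k(k+1)}, \\
\mathbf{E}\,\#\mathrm{Inj}(\mathbf{F}_p^k, A) &\approx p^{2Nk - \frac{1}{2}k(k+1)} \cdot p^{-2Nk + k(k+1)} = p^{\frac{1}{2}k(k+1)}.
\end{align*}
```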

What’s the advantage of having two different “random matrix” formulations of the heuristic?  The value of the “maximal isotropic intersection” model is clear:  as Poonen and Rains show, the Selmer group really is an intersection of maximal isotropic subspaces in a quadratic space.  One value of the “orthogonal cokernel” model is that it’s clear what it says about the Selmer group mod p^k.

Question: What does the orthogonal cokernel model predict about the mod-4 Selmer group of a random elliptic curve?  Does this agree with the theorem of Bhargava and Shankar, which gives the first moment of Sel_4 in a family of elliptic curves ordered by height?