Yuri Bilu and Pierre Parent posted a beautiful paper on the arXiv last week, settling part of a very old problem about the mod-p Galois representations attached to elliptic curves over Q.

If E is an elliptic curve over Q, the action of Galois on the p-torsion points of E yields a Galois representation

rho_{E,p}: Gal(Q) -> GL_2(F_p).

A famous theorem of Serre tells us that if E does not have complex multiplication, then rho_{E,p} is surjective for p large enough. But what “large enough” means depends, a priori, on E.

In practice, one seldom comes across an elliptic curve without CM such that rho_{E,p} is non-surjective. Thus the conjecture, originally due to Serre and now very widely believed, that “large enough” *need not* depend on the elliptic curve; that is, there is some absolute constant P such that rho_{E,p} is surjective for all non-CM elliptic curves over Q and all p > P.


The first step in attacking this problem is to ask: what does rho_{E,p} look like when it isn’t surjective? Its image has to land in some maximal proper subgroup of GL_2(F_p), and these aren’t hard to classify. In fact, one of the following holds:

1. rho_{E,p} is surjective;
2. The image of rho_{E,p} is contained in a Borel;
3. The image of rho_{E,p} is contained in the normalizer of a split Cartan;
4. The image of rho_{E,p} is contained in the normalizer of a non-split Cartan;
5. The image of rho_{E,p} is contained in one of a finite list of “exceptional” subgroups.
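To get a feel for how small these subgroups are inside GL_2(F_p), here is a minimal brute-force sketch in Python. It is only an illustration, not anything from the paper: the function names are made up for this example, matrices are stored as tuples (a, b, c, d), the Borel is taken to be the upper triangular matrices, and the non-split Cartan is modelled as multiplication by elements of F_p[sqrt(eps)] for a chosen non-residue eps. It assumes p is a small odd prime.

```python
from itertools import product

def gl2(p):
    """All invertible 2x2 matrices over F_p, as tuples (a, b, c, d)."""
    return [m for m in product(range(p), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % p != 0]

def mul(m, n, p):
    """Matrix product mod p."""
    a, b, c, d = m
    e, f, g, h = n
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

def nonresidue(p):
    """Smallest quadratic non-residue mod the odd prime p."""
    squares = {(x * x) % p for x in range(1, p)}
    return next(n for n in range(2, p) if n not in squares)

def subgroups(p):
    """The model subgroups described in the lead-in, listed element by element."""
    eps = nonresidue(p)
    G = gl2(p)
    return {
        "GL_2(F_p)": G,
        # upper triangular matrices
        "Borel": [m for m in G if m[2] == 0],
        # diagonal or anti-diagonal matrices
        "N(split Cartan)": [m for m in G
                            if (m[1] == 0 and m[2] == 0)
                            or (m[0] == 0 and m[3] == 0)],
        # multiplication by a + b*sqrt(eps), optionally composed with conjugation
        "N(non-split Cartan)": [m for m in G
                                if (m[3] == m[0] and m[1] == eps * m[2] % p)
                                or (m[3] == -m[0] % p and m[1] == -eps * m[2] % p)],
    }

def expected_orders(p):
    """Standard closed formulas for the orders of these groups."""
    return {
        "GL_2(F_p)": (p * p - 1) * (p * p - p),
        "Borel": p * (p - 1) ** 2,
        "N(split Cartan)": 2 * (p - 1) ** 2,
        "N(non-split Cartan)": 2 * (p * p - 1),
    }

def is_closed(H, p):
    """A nonempty finite subset of a group that is closed under products is a subgroup."""
    S = set(H)
    return all(mul(m, n, p) in S for m in H for n in H)

if __name__ == "__main__":
    for p in (3, 5):
        for name, H in subgroups(p).items():
            print(p, name, len(H), expected_orders(p)[name])
```

The point of the comparison is that the Borel has index p + 1, while the two Cartan normalizers have index on the order of p^2, so landing in any of them is a strong constraint on E.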

Case 5 is easy to rule out for large p. Case 2 is the subject of one of Mazur’s most famous theorems: if p is greater than 163, the image of rho_{E,p} cannot be contained in a Borel.

This leaves cases 3 and 4, the normalizers of Cartan. Case 4 appears to be the hardest — to imitate Mazur’s argument on the Borel case, one needs the Jacobian of the relevant modular curve to have some quotients with Mordell-Weil rank 0. And when the modular curve in question is the one parametrizing elliptic curves whose Galois representations are in case 4 — a curve usually called “X_{non-split}(p)” — the Jacobian ought not to have any such quotients. Indeed, under Birch-Swinnerton-Dyer, all of its quotients will have odd Mordell-Weil rank.

That leaves case 3, the subject of Bilu and Parent’s theorem. One can go a certain distance down the path laid out by Mazur; a 1984 result of Momose shows that if E is in case 3, and p is larger than 13, then E has potentially good reduction at all primes not dividing 6. (I don’t have the paper in front of me as I write, so I only vouch that the theorem statement is “essentially” correct!) This fact is already quite useful for “Fermat-style” diophantine applications. But one really wants to know that, for p > P, the only elliptic curves in case 3 are the CM curves.

And at this point one gets stuck for 24 years.

I’ll explain a little about how Bilu and Parent’s theorem works; but the paper itself is so short that I feel somewhat freed from the responsibility to summarize it!

Suppose we have an elliptic curve E/Q whose mod-p Galois representation has image in the normalizer of a split Cartan. We are going to derive a contradiction by ruling out two classes of curves: those whose height is large compared to p, and those whose height is small compared to p. If “large” and “small” are defined suitably, these two classes together will account for all the curves.

If you don’t like the word “height”, just keep in mind that j(E) is an integer by Momose’s theorem — from now on we’ll think of “height” and “log j(E)” as more or less synonymous.

The small-height curves are ruled out by a theorem of Masser and Wüstholz, which tells you that if two elliptic curves are isogenous, there is an isogeny between them of degree at most O(height(E)^2). Now our curve E, after passing to a quadratic extension K/Q, is isogenous to some other curve E’/K via an isogeny of degree p. This is the unique isogeny (up to sign) from E to E’, because E is not CM. (We had to use this hypothesis somewhere!) So the bound of Masser and Wüstholz has to apply to this very isogeny: p is at most O(height(E)^2), which means height(E) must be pretty big; on the order of sqrt(p), at least.

On the other hand, Bilu and Parent observe, one can construct a very useful “modular unit” which they denote w_a: this is a function on X_split(p) which has all its zeroes and poles supported on cusps. (Why should such a function exist? Because the group generated by cusps in the Jacobian of X_split(p) is torsion, which means there are tons of relations between the cusps, i.e. tons of principal divisors supported on cusps.)

Now our elliptic curve E yields a point P on X_split(p)(Q) which, by Momose, doesn’t reduce to a cusp mod ell for any prime ell. But this means in particular that w_a(P) doesn’t reduce to either 0 or infinity mod ell for any ell; in other words, it’s a unit in Z, that is, either 1 or -1. Well, not quite! My ultra-loose description hides that there are of course some problems at p, and what’s in fact the case is that w_a(P) is, up to sign, a power of p, which can be explicitly bounded.

Finally, Bilu and Parent use the asymptotic behavior of w_a near the cusps to show that the size of j(E) can be bounded in terms of the size of w_a(P); in particular, they find that log j(E) is at most on the order of 12 log p. But the Masser-Wüstholz argument tells us that log j(E) is already at least on the order of sqrt(p). Since sqrt(p) eventually outgrows any multiple of log p, the two bounds are incompatible once p is large. In other words, we’ve ruled out large-height points as well as small-height points; and thus case 3 cannot arise for p large enough.
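Just to see the squeeze numerically, here is a toy Python sketch that finds where sqrt(n) overtakes 12 log n for good. Big caveat: it pretends every implied constant is exactly 1, which is an assumption made purely for illustration (the real constants in the paper and in Masser-Wüstholz are different), so the number it produces says nothing about the actual bound on p. The function name is made up for this example.

```python
import math

def crossover(coeff=12.0):
    """Smallest n > 4*coeff^2 with sqrt(n) > coeff*log(n).

    The difference sqrt(n) - coeff*log(n) is increasing once n > 4*coeff^2
    (its derivative is 1/(2*sqrt(n)) - coeff/n), so from the returned value
    onward sqrt(n) stays ahead.
    """
    n = int(4 * coeff ** 2) + 1
    while math.sqrt(n) <= coeff * math.log(n):
        n += 1
    return n

if __name__ == "__main__":
    c = crossover()
    print("sqrt(n) > 12 log(n) for every n >=", c)
```

With these toy constants the crossover lands at roughly 13,000; what matters is only that some crossover exists, which is all the argument needs.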

Sweet!

Now assuming GRH