How many points does the average curve have?

Felipe Voloch complained that I didn’t list Ruby’s BBQ in my last post as one of the charms of visiting UT. I’ll make it up to him by observing that one of the charms of visiting UT is talking math with Felipe! He asked me an interesting question, about which we had different intuitions — I’ll present the question here and those readers who have an opinion are encouraged to voice it. (Math below the fold to avoid shocking the modesty of non-mathy readers.)

Let X be a curve of genus g over the field F_q. We know by the Weil conjectures (actually, the Weil theorems, since he proved them in the case of curves) that

$q+1 - 2g\sqrt{q} \leq |X(F_q)| \leq q + 1 + 2g\sqrt{q}.$

If q is very large compared to g, one sees that

$|X(F_q)| \sim q+1;$

and, in particular, the number of points on X is not sensitive to the genus.

When q is fixed and g grows, the story is quite different. The Weil bounds show that

$|X(F_q)|/g \ll 2\sqrt{q}$

but this bound isn’t sharp: Drinfeld and Vladuts showed the better bound

$\limsup |X(F_q)|/g \leq \sqrt{q}-1.$

Let A_q be the lim sup, over all curves X/F_q of all genera, of |X(F_q)|/g(X). The Drinfeld-Vladuts bound tells us that

$A_q \leq \sqrt{q}-1$

and in fact this inequality is an equality when q is a square; various modular curves provide examples of curves meeting the bound. When q is not a square, I don’t think very much is known.
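To get a feel for how far apart the two bounds are, here is a minimal Python sketch that evaluates them for a few values of q. The function names are my own, and the lower end of the Weil interval is clamped at 0 since a point count cannot be negative; nothing here goes beyond the inequalities stated above.

```python
import math


def weil_interval(q: int, g: int) -> tuple[float, float]:
    """Weil bounds q + 1 -/+ 2g*sqrt(q) on #X(F_q), lower end clamped at 0."""
    spread = 2 * g * math.sqrt(q)
    return (max(0.0, q + 1 - spread), q + 1 + spread)


def weil_ratio_limit(q: int) -> float:
    """Limit of (Weil upper bound) / g as g grows with q fixed."""
    return 2 * math.sqrt(q)


def drinfeld_vladut_limit(q: int) -> float:
    """Drinfeld-Vladuts bound on limsup #X(F_q) / g."""
    return math.sqrt(q) - 1


if __name__ == "__main__":
    for q in (2, 4, 9, 25):
        print(q, weil_ratio_limit(q), drinfeld_vladut_limit(q))
```

For square q both limits are exact numbers (for q = 4 they are 4 and 1), which shows that the Drinfeld-Vladuts bound improves on the Weil bound by more than a factor of 2.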

But this is not Felipe’s question. He asked: what can be said about the number of points on an average curve of genus g over F_q, when g is large compared to q? In other words, define

$B_q = \limsup_g (1/g) \frac{\sum_X |X(F_q)|}{\sum_X 1}$

where the sum is over all isomorphism classes of curves X/F_q of genus g. In particular, is B_q equal to 0? Felipe guesses it isn’t. I guess it is. What do you think?

(Note that $\sum_X 1$, the size of the set we are averaging over, is itself a pretty mysterious quantity! It’s otherwise known as the number of F_q-rational points on M_g. A preprint of de Jong and Katz, available on de Jong’s home page if you can read .dvi, gives an upper bound, which is presumably far from the truth for most g and q.)

Perhaps a relevant question is the following. When g is very large, |X(F_q)| is essentially the trace of Frob_q acting on the etale H^1 of X; we can think of Frob_q concretely as a 2gx2g matrix in the generalized symplectic group GSp_2g. More precisely, it is in the coset of Sp_2g which multiplies the alternating form by q. Denote this coset by C_q.

It is pleasant and increasingly customary to guess that Frob_q behaves like a random element of C_q. And if we let q get large with g fixed, this kind of guess can be proven correct by the Weil conjectures.

When q is fixed and g grows, the story is different. Indeed, a random element of C_q might well have trace less than -q-1; this isn’t possible for Frob_q, since X can’t have a negative number of points!

So here’s a question for random matrix lovers:

QUESTION: Let g be very large relative to q, and let M be a random element of C_q, conditional on the fact that

$Tr(M^k) \geq -q^k-1$

for all k > 0. What is the expected value of Tr(M)/g? Especially: is it 0?

16 thoughts on “How many points does the average curve have?”

1. There’s a recent preprint of Kurlberg and Rudnick (arXiv:0804.0808) where they consider the fluctuations of the number of points of hyperelliptic curves of genus g over a fixed finite field F_q; they find that for g large, it is q+1+S, where S is distributed, as g grows, like a sum of q independent identically distributed random variables taking values 0 with probability 1/(q+1), and +/- 1 each with probability 1/(2(1+1/q)). This has mean zero, giving an expected number of points equal to q+1. This doesn’t depend on the genus, so the analogue of your B_q is indeed zero in that case, but on the other hand, the q+1 means that the genus does not come to dominate the number of points. (So it’s not clear for _what_ it is evidence, in the general case…)

For the RMT question, I don’t know the answer, but it is definitely known that the trace of unitary symplectic matrices behaves like a _standard_ gaussian (with variance 1) as g grows (this is surprising from the eigenvalue point of view, where one sums 2g numbers of modulus one; as explained by Diaconis in
http://www.ams.org/bull/2003-40-02/S0273-0979-03-00975-3/home.html
it is better here to see the trace as the sum of diagonal entries, each of which is of magnitude roughly 1/sqrt(g) by unitarity, assuming all coefficients of the matrix have comparable size; so square-root cancellation in the sum of diagonal coefficients suggests a variance 1).

De-unitarizing, this gives a trace over C_q which is gaussian with variance q (so standard deviation sqrt(q)), and one can expect that at least the trace be >-q-1 with very high probability.

However, when taking very high traces, things are different: Rains has shown that the trace of M^k is exactly that of the sum of g independent variables uniformly distributed on the unit circle for a unitary matrix of size g and k>g. This is going to be close to gaussian when g is large, but now with variance g. This means the “conditioning” you mention selects a probability zero event, I think…

But still, it is also known that the traces of various powers tend to be independent, so one may still think that the first result describes what would be the reasonable limit. Adding the q+1, this suggests a typical number of points like q+1+sqrt(q)*(gaussian), and the limit divided by g would be zero.
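As a sanity check on the hyperelliptic model quoted at the start of this comment, here is a short Python sketch that computes the mean and variance of S exactly, using only the probabilities stated there (0 with probability 1/(q+1), and +1 and -1 each with probability 1/(2(1+1/q))). The function names are illustrative.

```python
from fractions import Fraction


def step_distribution(q):
    """Values and probabilities of one summand in the model quoted above."""
    p_zero = Fraction(1, q + 1)
    p_pm = Fraction(q, 2 * (q + 1))  # equals 1 / (2 * (1 + 1/q))
    return {0: p_zero, 1: p_pm, -1: p_pm}


def mean_and_variance_of_S(q):
    """Exact mean and variance of a sum of q independent copies of the summand."""
    dist = step_distribution(q)
    assert sum(dist.values()) == 1  # the three probabilities add up to 1
    mean = sum(v * p for v, p in dist.items())
    var = sum(v * v * p for v, p in dist.items()) - mean ** 2
    return q * mean, q * var
```

The mean is 0 and the variance works out to q^2/(q+1). That is bounded independently of g, consistent with the remark that the genus does not come to dominate the number of points in this family.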

2. P.S. The smileys are entirely accidental…

3. JSE says:

But I think the condition is vacuous once k >> log_q g, since the kth power eigenvalues are of size q^{k/2} and once k is this big they can’t add up to something as large as q^k. So do these “small traces” all behave like Tr(M) itself?
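A rough way to quantify this: since the eigenvalues of M^k have absolute value q^{k/2}, we have |Tr(M^k)| ≤ 2g q^{k/2}, so the condition at level k is automatic once 2g q^{k/2} ≤ q^k + 1. The Python sketch below (the function name is mine) finds the first such k, comparing squares so that everything stays in exact integer arithmetic.

```python
def first_vacuous_power(q: int, g: int) -> int:
    """Smallest k >= 1 with 2g * q^(k/2) <= q^k + 1.

    From that k on, |Tr(M^k)| can never exceed q^k + 1, so a bound of that
    size on the trace holds for every matrix with eigenvalues of absolute
    value sqrt(q).
    """
    k = 1
    # Compare squares of both sides to avoid floating-point square roots.
    while 4 * g * g * q ** k > (q ** k + 1) ** 2:
        k += 1
    return k
```

The threshold grows like 2 log_q(2g); for q = 5 and g = 49 it is k = 6.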

4. Right, I forgot that. It is true that for fixed k, the traces of M, M^2, …, M^k behave, as g –> infinity, like independent gaussians, with Tr M^j of variance j (this is again when M is unitarily normalized). I don’t know the error terms to check what happens when k is allowed to grow with g (slowly, like log g). That does suggest that the condition is true with high probability, and then that the trace, with this condition, is indeed o(g).

5. JSE says:

By the way, the paper of Kurlberg and Rudnick isn’t so relevant to this question, because hyperelliptic curves have at most 2q+2 points! So in this family, the average is evidently zero. That’s one thing that makes this problem difficult from my point of view — it’s hard to do any numerical experimentation, because it’s hard to think of families of curves over F_q that a) you can write down, and b) don’t have obvious bounds, independent of g, on the number of points! Low-gonality curves, plane curves or for that matter curves on any fixed variety…. none of these do the trick.
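The 2q+2 bound is easy to see in a brute-force count. The sketch below is a minimal illustration for a prime p (not a general prime power q) and a curve y^2 = f(x) with f assumed squarefree of degree at least 3; the function names, and the convention of counting the points at infinity on the smooth model, are my assumptions.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def hyperelliptic_point_count(coeffs, p):
    """Number of F_p-points on y^2 = f(x), including points at infinity.

    coeffs lists the coefficients of f from the constant term upward; p is an
    odd prime, and f is assumed squarefree of degree >= 3 with leading
    coefficient nonzero mod p.
    """
    def f(x):
        return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p

    # Each x contributes 0, 1 or 2 affine points, i.e. 1 + (f(x)/p).
    affine = sum(1 + legendre(f(x), p) for x in range(p))
    degree = len(coeffs) - 1
    if degree % 2 == 1:
        at_infinity = 1  # a single rational point at infinity
    else:
        at_infinity = 1 + legendre(coeffs[-1], p)  # 2 or 0 points
    return affine + at_infinity
```

Each of the p values of x contributes at most 2 points and infinity contributes at most 2, so the count never exceeds 2p + 2 whatever the degree of f. For example, `hyperelliptic_point_count([0, 1, 0, 1], 5)` counts the points of y^2 = x^3 + x over F_5.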

6. That’s true, but on the other hand, RMT arguments are not going to distinguish hyperelliptic curves from general curves either, since the monodromy groups are the same.

7. JSE says:

Touche.

8. All this certainly indicates that the question is quite difficult indeed…

9. JSE says:

But I will say that it doesn’t seem crazy to imagine that the random matrix model is powerless to detect really fine-grained phenomena like the maximum A_q, while it might be better equipped to say something about the average B_q. We know that the value of A_q really depends on whether we quantify over hyperelliptic curves or all curves — but B_q, maybe not.

10. I don’t have the Katz-Sarnak book handy to check whether or not they explicitly conjecture that the limit as g goes to infinity, for the moduli space of curves, should give the RMT limit (instead of the proved “q goes to infinity, then g goes to infinity”; I remember they discuss a few examples where they expect this to hold).

Certainly, it is probabilistically very reasonable to expect that lots of phenomena will not be detectable at the RMT limit, since this happens quite often (indeed, many questions do not make sense in the limit, for instance anything that involves algebraic properties of the eigenvalues).

11. Jason Starr says:

What about curves with huge automorphism group, like fiber products of Artin-Schreier covers of P^1, i.e., y^q + y = f(x)? You can write those down and they have lots of points.
Or are there too few of them compared to the genus (as compared to the number of hyperelliptic curves of the same genus) to significantly contribute to the average?

12. Checking Katz-Sarnak, they explicitly conjecture (Conjecture, page 12, examples 1, 2, 2bis, page 13) that (at least as far as eigenvalue-location statistics are concerned) both the moduli space of curves (incarnated as moduli of curves with 3K-structure, more precisely) and the space of hyperelliptic curves satisfy the same limit theorem towards RMT as g grows.

13. Thanks, Jordan!

Since I’m already risking being very wrong, I’d strengthen my conjecture to the lim-inf being positive.

Jason: It is known that there are curves with cg points for every genus, and taking double covers of these ramified at O(g) places one gets about q^{2g} curves with cg points (different c of course), so a little less than, but about the same number as, hyperelliptic curves. Still, we don’t even know if the number of curves is q^{3g-3} (de Jong-Katz only prove an upper bound of q^{g log g}), so anything we can produce explicitly will have zero influence on the average, including, I suspect, your suggestion.

14. I’d be wary of RMT arguments for curves of large genus. There is this paper of Buser and Sarnak (Inv. Math. 117) where they show that the period matrix of a complex curve does not look like the period matrix of a random complex abelian variety and in some sense the image of the Torelli map is near the boundary of the moduli space. I wonder if there is an analogue of this for the matrix of Frobenius over finite fields.

15. Although RMT arguments are definitely in a very unclear situation for growing genus for all curves, there is starting to be some amount of evidence at least for hyperelliptic ones: there’s the paper of Kurlberg and Rudnick already mentioned, and another one by Faifman and Rudnick (also very recent, about the number of zeros, i.e. eigenangles, in an interval, although since the corresponding analogue is known for zeta(s), the RMT nature of the argument is not very strong).

Numerically, it’s also starting to be doable to look at curves of fairly large genus using Magma, for instance. As an example, I computed at some point the L-functions of 1000 “random” hyperelliptic curves of genus 49 over F_5 (Magma took about one hour for each curve). See the last picture in the link below for the graph of the empirical distribution of the eigenangle closest to 1, which is one statistic known to distinguish between the various families of compact Lie groups:

http://www.math.u-bordeaux1.fr/~kowalski/arithmatrics/zeta-functions.html

the picture just above is the corresponding one for about 120 000 curves of genus 3 over F_{5^8}; it is known that the RMT picture holds when the genus is fixed and the field grows, and indeed this second picture is close to the theoretical curve. The genus 49 case, considering the small sample, is really not that far either. (I should probably work at extending the computations, actually).