## Naser Talebizadeh Sardari, Hecke eigenvalues, and Chabauty in the deformation space

Naser Sardari is finishing a postdoc at Wisconsin this year and just gave a beautiful talk about his new paper.  Now Naser thinks of this as a paper about automorphic forms — and it is — but I want to argue that it is also a paper which develops an unexpected new form of the Chabauty method!  As I will now explain.  Tell me if you buy it.

First of all, what does Naser prove?  As the title might suggest, it’s a statement about the multiplicity of Hecke eigenvalues a_p; in this post, we’re just going to talk about the eigenvalue zero.  The Hecke operator T_p acts on the space of weight-k modular forms on Gamma_0(N); how many zero eigenvectors can it have, as k goes to infinity with N,p fixed?  If you believe conjectures of Maeda type, you might expect that the Hecke algebra acts irreducibly on the space S_k(Gamma_0(N)); of course this doesn’t rule out that one particular Hecke operator might have some zeroes, but it should make it seem pretty unlikely.

And indeed, Naser proves that the number of zero eigenvectors is bounded independently of k, and even gives an explicit upper bound. (When the desired value of a_p is nonzero, T_p has finite slope and we can reduce to a problem about modular forms in a single p-adic family; in that context, a uniform bound is easy, and one can even show that the number of such forms of weight <k grows very very very very slowly with k, where each "very" is a log; this is worked out on Frank Calegari’s blog. On the other hand, as Naser points out below in comments, if you ask about the “Hecke angle” a_p/p^{(k-1)/2}, we don’t know how to get any really good bound in the nonzero case. I think the conjecture is that you always expect finite multiplicity in either setting even if you range over all k.)

What I find most striking is the method of proof and its similarity to the Chabauty method!  Let me explain.  The basic idea of Naser’s paper is to set this up in the language of deformation theory, with the goal of bounding the number of weight-k p-adic Galois representations rho which could be the representations attached to weight-k forms with T_p = 0.

We can pin down the possible reductions mod p of such a form to a finite number of possibilities, and this number is independent of k, so let’s fix a residual representation rhobar once and for all.

The argument takes place in R_loc, the ring of deformations of rhobar|G_{Q_p}.  And when I say “the ring of deformations” I mean “the ring of deformations subject to whatever conditions are important,” I’m just drawing a cartoon here.  Anyway, R_loc is some big p-adic power series ring; or we can think of the p-adic affine space Spec R_loc, whose Z_p-points we can think of as the space of deformations of rhobar to p-adic local representations.  This turns out to be 5-dimensional in Naser’s case.

Inside Spec R_loc, we have the space of local representations which extend to global ones; let’s call this locus Spec R_glob.  This is still a p-adic manifold but it’s cut out by global arithmetic conditions and its dimension will be given by some computation in Galois cohomology over Q; it turns out to be 3.

But also inside Spec R_loc, we have a submanifold Z cut out by the condition that a_p is not just 0 mod p, it is 0 on the nose, and that the determinant is the power of the cyclotomic character determined by the particular weight k you have in mind.  This manifold, which is 2-dimensional, is something you could define without ever knowing there was such a thing as Q; it’s just some closed locus in the deformation space of rhobar|Gal(Q_p).

But the restriction of rho to Gal(Q_p) is a point psi of R_loc which has to lie in both these two spaces, the local one which expresses the condition “psi looks like the representation of Gal(Q_p) attached to a weight-k modular form with a_p = 0” and the global one which expresses the condition “psi is the restriction to Gal(Q_p) of a representation of Gal(Q) unramified away from some specified set of primes.”  So psi lies in the intersection of the 3-dimensional locus and the 2-dimensional locus in 5-space, and the miracle is that you can prove this intersection is transverse, which means it consists of a finite set of points, and what’s more, it is a set of points whose cardinality you can explicitly bound!

If this sounds familiar, it’s because it’s just like Chabauty.  There, you have a curve C and its Jacobian J.  The analogue of R_loc is J(Q_p), or rather let’s say a neighborhood of the identity in J(Q_p) which looks like affine space Q_p^g.

The analogue of R_glob is (the p-adic closure of) J(Q), which is a proper subspace of dimension r, where r is the rank of J(Q), something you can compute or at least bound by Galois cohomology over Q.  (Of course it can’t be a proper subspace of dimension r if r >= g, which is why Chabauty doesn’t work in that case!)

The analogue of Z is C(Q_p); this is something defined purely p-adically, a locus you could talk about even if you had no idea your C/Q_p were secretly the local manifestation of a curve over Q.

And any rational point of C(Q), considered as a point in J(Q_p), has to lie in both C(Q_p) and J(Q), whose dimensions are 1 and at most g-1, and once again the key technical tool is that this intersection can be shown to be transverse, whence finite, so C(Q) is finite and you have Mordell’s conjecture in the case r < g.  And, as Coleman observed decades after Chabauty, this method even allows you to get an explicit bound on the number of points of C(Q), though not an effective way to compute them.
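To put the two pictures side by side, here is the dimension count in each setting.  This is only a heuristic sketch: a transverse intersection of submanifolds of dimensions a and b inside an n-dimensional space has dimension a+b-n, and the count by itself proves nothing; the real work in both cases is showing the intersection is actually finite.  The notation $\overline{J(\mathbf{Q})}$ for the p-adic closure is mine.

$\dim(\mathrm{Spec}\, R_{glob} \cap Z) = 3 + 2 - 5 = 0$

$\dim(\overline{J(\mathbf{Q})} \cap C(\mathbf{Q}_p)) \leq r + 1 - g \leq 0 \quad \text{when } r < g$

In both lines the expected dimension is at most zero, which is what you want if you are hoping for a finite set of points.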

I think this is a very cool confluence indeed!  In the last ten years we've seen a huge amount of work refining Chabauty; Matt Baker discusses some of it on his blog, and then there’s the whole nonabelian Chabauty direction launched by Minhyong Kim and pushed forward by Jen Balakrishnan and Netan Dogra and many others.  Are there other situations in which we can get meaningful results from “deformation-theoretic Chabauty,” and are the new technical advances in Chabauty methods relevant in this context?

## A type of unproductivity

There is a certain very special type of unproductivity that I have experienced only in math.  You are working on something and you feel almost certain your strategy is not going to work.  In fact, it is more likely than not that your strategy is not even a new strategy, but a variant on something you’ve already tried unsuccessfully — or maybe not even actually a variant, but just a rephrasing you’ve fooled yourself into thinking is a variant.  So you are not sure whether you are actually working on something at all.  In fact, you doubt it.

And yet you keep going!  Because what if it works?


## Heights on stacks and heights on vector bundles over stacks

I’ve been giving a bunch of talks about work with Matt Satriano and David Zureick-Brown on the problem of defining the “height” of a rational point on a stack.  The abstract usually looks something like this:

Here are two popular questions in number theory:

1.  How many degree-d number fields are there with discriminant at most X?
2.  How many rational points are there on a cubic surface with height at most X?

Our expectations about the first question are governed by Malle’s conjecture; about the second, by the Batyrev-Manin conjecture.  The forms of the conjectures are very similar, predicting in both cases an asymptotic of the form c X^a (log X)^b, and this is no coincidence: I will explain how to think of both questions in a common framework, that of counting points of bounded height on an algebraic stack.  A serious obstacle is that there is no definition of the height of a rational point on a stack.  I will propose a definition and try to convince you it’s the right one.  If there’s time, I’ll also argue that when we talk about heights with respect to a line bundle we have always secretly meant “vector bundle,” or should have.

(joint work with Matt Satriano and David Zureick-Brown)

Frank Calegari asked a good question after I talked about this at Mazur’s birthday conference.  And other people have asked me the same question!  So I thought I’d write about it here on the blog.

An actual (somewhat tangential) math question about your talk: when it comes (going back to the original problem) of extensions with Galois group G, there is (as you well know) a natural cover $\mathbf{A}^n/G \rightarrow \cdot/G,$ and the source has a nice smooth unirational open subscheme which is much less stacky object and could possibly still be used to count G-extensions (or rather, to count G-polynomials). How does this picture interact (if at all) with your talk or the Malle conjecture more generally?

Here’s an answer.  Classically, how do we count degree-n extensions of Q?  We count monic degree-n polynomials with bounded coefficients; that is, we count integral points of bounded height on A^n / S_n, which is isomorphic to A^n, the space of monic degree-n polynomials.

Now A^n / S_n is the total space of a vector bundle over the stack B(S_n).  So you might say that what we’re doing is using “points on the total space of a vector bundle E/X as a proxy for points on X.”  And when you put it that way, you see that it’s what people who work on rational points do all the time!  What do we do when we count rational points on P^1?  We count pairs of coprime integers in a box; in other words, we count integral points on A^2 - 0, which is the total space (sans zero section) of a line bundle on P^1.  More generally, in many cases where people can prove the Batyrev-Manin conjecture for a variety X, it’s precisely by means of passing to a “universal torsor”: the total space of a vector bundle (or a torus bundle sitting in a vector bundle) over X.  Sometimes you can use this technique to get actual asymptotics for rational points on X; other times you just get bounds; if you can prove that, for any x in X(Q), there is a point on the fiber E_x whose height is at most F(height(x)) for some reasonable function F, you can parlay upper bounds for points on E into upper bounds for points on X.  In the classical case, this is the part where we argue that (by Minkowski) a number field with discriminant D contains an algebraic integer whose characteristic polynomial has coefficients bounded in terms of D.
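The P^1 case is easy to try by hand.  Here is a minimal Python sketch, not taken from the paper, which counts points of P^1(Q) of height at most B by counting coprime integer pairs in a box and dividing by 2 for the sign ambiguity.  The function name `count_p1_points` is made up for illustration; the comparison constant 12/pi^2 is the classical density of coprime pairs (4 B^2 lattice points, times 6/pi^2, divided by 2).

```python
from math import gcd, pi


def count_p1_points(bound):
    """Count points [a:b] of P^1(Q) with max(|a|, |b|) <= bound.

    Every rational point has exactly two coprime integer representatives,
    (a, b) and (-a, -b), so we count coprime pairs in the box and halve.
    """
    pairs = 0
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            # gcd(0, 0) == 0, so the origin is excluded automatically.
            if gcd(a, b) == 1:
                pairs += 1
    return pairs // 2


for bound in (1, 2, 10, 100):
    n = count_p1_points(bound)
    # Compare the normalized count with the expected leading constant.
    print(bound, n, n / bound**2, 12 / pi**2)
```

The point of the exercise is that the loop never mentions P^1 at all; it only ever sees integral points on A^2 minus the origin.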

So coming back to the original question:  how do you know which vector bundle on BG is a good one to think about?  Actually, this is far from clear!  The very first thing I ever wrote about counting number fields, my first paper with Akshay, gave new upper bounds for the number of degree-n extensions, by counting points on

$(\mathbf{A}^n)^m / S_n$

where S_n acts diagonally.  In other words, we used a different vector bundle on B(S_n) than the “standard” one, and showed that by optimizing m (and being careful about stripping out loci playing the role of accumulating subvarieties) we could get better upper bounds than the ones coming from counting polynomials.

So apparently I’ve been counting points on vector bundles on stacks all along…!

## The Lovasz number of the plane is about 3.48287

As seen in this comment on Polymath and explicated further in Fernando de Oliveira Filho’s thesis, section 4.4.

I actually spent much of today thinking about this so let me try to explain it in a down-to-earth way, because it involved me thinking about Bessel functions for the first time ever, surely a life event worthy of recording.

So here’s what we’re going to do.  As I mentioned last week, you can express this problem as follows:  suppose you have a map h: R^2 -> V, for some normed vector space V, which is a unit-distance embedding; that is, if |x-x’|_{R^2} = 1, then |h(x)-h(x’)|_V = 1.  (We don’t ask that h is an isometry, only that it preserves the distance-1 set.)

Then let t be the squared radius of the smallest hypersphere in V containing h(R^2).

Then any graph embeddable in R^2 with all edges of length 1 is sent to a unit-distance graph in V contained in the hypersphere of squared radius t; this turns out to be equivalent to saying the Lovasz number of G (ok, really I mean the Lovasz number of the complement of G) is at most 1/(1-2t).  So we want to show that t is bounded below 1/2, is the point.  Or rather:  we can find a V and a map from R^2 to V to make this the case.

So here’s one!  Let V be the space of L^2 functions on R^2 with the usual inner product.  Choose a square-integrable function F on R^2 — in fact let’s normalize to make F^2 integrate to 1 — and for each a in R^2 we let h(a) be the function F(x-a).

We want the distance between F(x-a) and F(x-b) to be the same for every pair of points at distance 1 from each other; the easiest way to arrange that is to insist that F(x) be a radially symmetric function F(x) = f(|x|); then it’s easy to see that the distance between F(x-a) and F(x-b) in V is a function G(a-b) which depends only on |a-b|.  We write

$g(r) = \int_{\mathbf{R}^2} F(x)F(x-r) dx$

so that the squared distance between F(x) and F(x-r) is

$\int F(x)^2 dx - 2 \int F(x)F(x-r) dx + \int F(x-r)^2 dx = 2(1-g(r))$.

In particular, if two points in R^2 are at distance 1, the squared distance between their images in V is 2(1-g(1)).  Note also that g(0) is the square integral of F, which is 1.

What kind of hypersphere encloses all the points F(x-a) in V?  We can just go ahead and take the “center” of our hypersphere to be 0; since |F| = 1, every point in h(R^2) lies in (indeed, lies on) the sphere of radius 1 around the origin.

Hey but remember:  we want to study a unit-distance embedding of R^2 in V.  Right now, h sends unit distances to the squared distance 2(1-g(1)), whatever that is.  We can fix that by dividing h by the square root of that number.  So now h sends unit distances to unit distances, and its image is enclosed in a hypersphere of squared radius

$[2(1-g(1))]^{-1}$

The more negative g(1) is, the smaller this sphere is, which means the more we can “fold” R^2 into a small space.  Remember, the relationship between hypersphere number and Lovasz theta is

$2t + 1/\theta = 1$

and plugging in the above bound for the hypersphere number, we find that the Lovasz theta number of R^2, and thus the Lovasz theta number of any unit-distance graph in R^2, is at most

1-1/g(1).
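For the record, here is the algebra behind that last step, written out as a sketch.  It assumes t denotes the squared radius of the enclosing hypersphere (the convention under which the relation between t and theta above holds) and that g(1) is negative.

$t = \frac{1}{2(1-g(1))}$

$\frac{1}{\theta} = 1 - 2t = 1 - \frac{1}{1-g(1)} = \frac{-g(1)}{1-g(1)}$

$\theta = \frac{1-g(1)}{-g(1)} = 1 - \frac{1}{g(1)}$

Sanity check: if you could get g(1) = -1/2, this would give theta = 3.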

So the only question is — what is g(1)?

Well, that depends on what g is.

Which depends on what F is.

Which depends on what f is.

And of course we get to choose what f is, in order to make g(1) as negative as possible.

How do we do this?  Well, here’s the trick.  The function G is not arbitrary; if it were, we could make g(1) whatever we wanted.  It’s not hard to see that G is what’s called a positive definite function on R^2.  And moreover, if G is positive definite, there exists some f giving rise to it.  (Roughly speaking, this is the fact that a positive definite symmetric matrix has a square root.)  So we ask:  if G is a positive definite (radially symmetric) function on R^2, and g(0) = 1, how small can g(1) be?

And now there’s an old theorem of (Wisconsin’s own!) Isaac Schoenberg which helpfully classifies the positive definite functions on R^2; they are precisely the functions G(x) = g(|x|) where g is a mixture of scalings of the Bessel function $J_0$:

$g(r) = \int_0^\infty J_0(ur) A(u) \, du$

for some everywhere nonnegative A(u).  (Actually it’s more correct to say that A is a distribution and we are integrating J_0(ur) against a non-decreasing measure.)

So g(1) can be no smaller than the minimum value of J_0 on [0,infty), and in fact can be exactly that small if you let A become narrowly supported around the minimum argument.  This is basically just taking g to be a rescaled version of J_0 which achieves its minimum at 1.  That minimum value is about -0.4, and so the Lovasz theta for any unit-distance subgraph on the plane is bounded above by a number that’s about 1 + 1/0.4 = 3.5.
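If you want more digits than "about 3.5," the computation is short.  Below is a hedged Python sketch using only the standard library: it evaluates J_0 by its power series, locates the first local minimum by golden-section search on [2, 6] (where J_0 is decreasing and then increasing), and plugs the result into 1 - 1/g(1).  It relies on the standard fact that the first local minimum of J_0 is its global minimum on the half-line; the helper names `bessel_j0` and `golden_section_min` are mine, not from any library.

```python
import math


def bessel_j0(x, terms=60):
    """J_0(x) via its power series sum_m (-1)^m (x/2)^(2m) / (m!)^2.

    Accurate for moderate x (we only need x up to about 6).
    """
    q = (x / 2.0) ** 2
    total, term = 0.0, 1.0  # term starts at the m = 0 summand
    for m in range(terms):
        total += term
        term *= -q / ((m + 1) ** 2)
    return total


def golden_section_min(f, lo, hi, tol=1e-9):
    """Locate the minimizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        x1 = hi - inv_phi * (hi - lo)
        x2 = lo + inv_phi * (hi - lo)
        if f(x1) < f(x2):
            hi = x2  # the minimum lies in [lo, x2]
        else:
            lo = x1  # the minimum lies in [x1, hi]
    return (lo + hi) / 2.0


u_min = golden_section_min(bessel_j0, 2.0, 6.0)  # argument of the minimum
g_min = bessel_j0(u_min)                         # best possible g(1)
theta_bound = 1.0 - 1.0 / g_min                  # bound on the Lovasz number

print(u_min, g_min, theta_bound)
```

Rescaling so that the minimum lands at r = 1 means taking g(r) = J_0(u_min * r), which is the "narrowly supported A" from the paragraph above.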

To sum up:  I give you a set of points in the plane, I connect every pair that’s at distance 1, and I ask how you can embed that graph in a small hypersphere keeping all the distances 1.  And you say:  “Oh, I know what to do, just assign to each point a the radially symmetrized Bessel function J_0(|x-a|) on R^2, the embedding of your graph in the finite-dimensional space of functions spanned by those Bessel translates will do the trick!”

That is cool!

Remark: Oliveira’s thesis does this for Euclidean space of every dimension (it gets more complicated.)  And I think (using analysis I haven’t really tried to understand) he doesn’t just give an upper bound for the Lovasz number of the plane as I do in this post, he really computes that number on the nose.

Update:  DeCorte, Oliveira, and Vallentin just posted a relevant paper on the arXiv this morning!


## What is the Lovasz number of the plane?

There are lots of interesting invariants of a graph which bound its chromatic number!  Most famous is the Lovász number, which asks, roughly:  I attach vectors v_x to each vertex x such that v_x and v_y are orthogonal whenever x and y are adjacent, I try to stuff all those vectors into a small cone, the half-angle of the cone tells you the Lovász number, which is bigger and bigger as the smallest cone gets closer and closer to a hemisphere.

Here’s an equivalent formulation:  If G is a graph and V(G) its vertex set, I try to find a function f: V(G) -> R^d, for some d, such that

|f(x) – f(y)| = 1 whenever x and y are adjacent.

This is called a unit distance embedding, for obvious reasons.

The hypersphere number t(G) of the graph is the radius of the smallest sphere containing a unit distance embedding of G.  Computing t(G) is equivalent to computing the Lovász number, but let’s not worry about that now.  I want to generalize it a bit.  We say a finite sequence (t_1, t_2, t_3, … ,t_d) is big enough for G if there’s a unit-distance embedding of G contained in an ellipsoid with major radii t_1^{1/2}, t_2^{1/2}, …, t_d^{1/2}.  (We could also just consider infinite sequences with all but finitely many terms zero, that would be a little cleaner.)

Physically I think of it like this:  the graph is trying to fold itself into Euclidean space and fit into a small region, with the constraint that the edges are rigid and have to stay length 1.

Sometimes it can fold a lot!  Like if it’s bipartite.  Then the graph can totally fold itself down to a line segment of length 1, with all the black vertices going to one end and the white vertices going to the other.  And the big enough sequences are just those with some entry bigger than 1.

On the other hand, if G is a complete graph on k vertices, a unit-distance embedding has to be a simplex, so certainly anything with k of the t_i of size at least 1-1/k is big enough.   (Is that an if and only if?  To know this I’d have to know whether an ellipse containing an equilateral triangle can have a radius shorter than that of the circumcircle.)

Let’s face it, it’s confusing to think about ellipsoids circumscribing embedded graphs, so what about instead we define t(p,G) to be the minimum value of the L^p norm of (t_1, t_2, …) over ellipsoids enclosing a unit-distance embedding of G.

Then a graph has a unit-distance embedding in the plane iff t(0,G) <= 2.  And t(oo,G) is just the hypersphere number again, right?  If G has a k-clique then t(p,G) >= t(p,K_k) for any p, while if G has a k-coloring (i.e. a map to K_k) then t(p,G) <= t(p,K_k) for any p.  In particular, a regular k-simplex with unit edges fits into a sphere of squared radius 1-1/k, so t(oo,G) <= 1-1/k.

So… what’s the relation between these invariants?  Is there a graph with t(0,G) = 2 and t(oo,G) > 4/5?  If so, there would be a non-5-colorable unit distance graph in the plane.  But I guess the relationship between these various “norms” feels interesting to me irrespective of any relation to plane-coloring.  What is the max of t(oo,G) with t(0,G)=2?

The intermediate t(p,G) all give functions which upper-bound clique number and lower-bound chromatic number; are any of them interesting?  Are any of them easily calculable, like the Lovász number?

Remarks:

1.  I called this post “What is the Lovász number of the plane?” but the question of “how big can t(oo,G) be if t(0,G)=2”? is more a question about finite subgraphs of the plane and their Lovász numbers.  Another way to ask “What is the Lovász number of the plane” would be to adopt the point of view that the Lovász number of a graph has to do with extremizers on the set of positive semidefinite matrices whose (i,j) entry is nonzero only when i and j are adjacent vertices or i=j.  So there must be some question one could ask about the space of positive semidefinite symmetric kernels K(x,y) on R^2  x R^2 which are supported on the locus ||x-y||=1 and the diagonal, which question would rightly be called “What is the Lovász number of the plane?” But I’m not sure what it is.
2. Having written this, I wonder whether it might be better, rather than thinking about enclosing ellipsoids of a set of points in R^d, just to think of the n points as an nxd matrix X and compute the singular values of X^T X, which would be kind of an “approximating ellipsoid” to the points.  Maybe later I’ll think about what that would measure.  Or you can!

## The chromatic number of the plane is at least 5

That is:  any coloring of the plane with four colors has two points of the same color at distance 1 from each other.  So says a paper just posted by Aubrey de Grey.

The idea:  given a set S of points in the plane, its unit distance graph G_S is the graph whose vertices are S and where two points are adjacent if they’re at distance 1 in the plane.  If you can find S such that G_S has chromatic number k, then the chromatic number of the plane is at least k.  And de Grey finds a set of 1,567 points whose unit distance graph can’t be 4-colored.
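To see the mechanism on a small scale, here is a Python sketch that builds the unit distance graph of a point set and brute-forces its chromatic number.  The example set is the classical Moser spindle (7 points, 11 unit distances), which gives the older lower bound of 4; it is not de Grey's construction, and the function names and the floating-point tolerance are my own choices for illustration.

```python
import math
from itertools import combinations


def unit_distance_edges(points, tol=1e-9):
    """Edges of G_S: pairs of points at distance 1, up to rounding error."""
    return [(i, j) for i, j in combinations(range(len(points)), 2)
            if abs(math.dist(points[i], points[j]) - 1.0) < tol]


def is_k_colorable(n, edges, k):
    """Backtracking search for a proper coloring with k colors."""
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    colors = [None] * n

    def extend(v):
        if v == n:
            return True
        for c in range(k):
            if all(colors[u] != c for u in adj[v]):
                colors[v] = c
                if extend(v + 1):
                    return True
        colors[v] = None  # undo before backtracking
        return False

    return extend(0)


def chromatic_number(n, edges):
    k = 1
    while not is_k_colorable(n, edges, k):
        k += 1
    return k


def rotate(p, angle):
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


# A rhombus made of two unit equilateral triangles, with one tip at the origin.
rhombus = [(1.0, 0.0), (0.5, math.sqrt(3) / 2), (1.5, math.sqrt(3) / 2)]
# Rotate a copy about the origin so the two far tips end up at distance 1.
angle = 2 * math.asin(1 / (2 * math.sqrt(3)))
spindle = [(0.0, 0.0)] + rhombus + [rotate(p, angle) for p in rhombus]

edges = unit_distance_edges(spindle)
print(len(edges), chromatic_number(len(spindle), edges))
```

Brute force like this is hopeless for anything the size of de Grey's graph, of course; the sketch is only meant to make the definition of G_S concrete.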

It’s known that the chromatic number of the plane is at most 7.  Idle question:  is there any chance of a “polynomial method”-style proof that there is no subset S of the plane whose unit distance graph has chromatic number 7?  Such a graph would have a lot of unit distances, and ruling out lots of repetitions of the same distance is something the polynomial method can in principle do.

Though be warned:  as far as I know the polynomial method has generated no improvement so far on older bounds on the unit distance problem (“how many unit distances can there be among pairs drawn from S?”) while it has essentially solved the distinct distance problem (“how few distinct distances can there be among pairs drawn from S?”)

## Which pictures do children draw with straight lines?

Edray Goins gave a great colloquium today about his work on dessins d’enfants.  And in this talk there was a picture that surprised me.  It was one of the ones on the lower right of this poster.  Here, I’ll put in a screen shot:

Let me tell you what you’re looking at.  You are looking at elliptic curves E admitting a Belyi map f: E -> P^1, which is to say a map ramified only over 0, 1, and infinity.  For each such map, the blue graph is f^{-1}([0,1]), the preimage of the line segment joining 0 and 1 in P^1(R).

In four of these cases, the graph is piecewise linear!  I didn’t know there were examples like this.  Don’t know if this is easy, but:  for which Belyi maps (of any genus, not just genus 1) is f^{-1}([0,1]) a union of geodesics?


## Suriya Gunasekar, optimization geometry, loss minimization as dynamical system

Awesome SILO seminar this week by Suriya Gunasekar of TTI Chicago.  Here’s the idea, as I understand it.  In a classical optimization problem, like linear regression, you are trying to solve a problem which typically has no solution (draw a line that passes through every point in this cloud!) and the challenge is to find the best approximate solution.  Algebraically speaking:  you might be asked to solve

$Ax = b$

for x; but since b may not be in the image of the linear transformation A, you settle for minimizing

$||Ax-b||$

in whatever norm you like (L^2 for standard linear regression.)

In many modern optimization problems, on the other hand, the problem you’re trying to solve may have a lot more degrees of freedom.  Maybe you’re setting up an RNN with lots and lots and lots of parameters.  Or maybe, to bring this down to earth, you’re trying to pass a curve through lots of points but the curve is allowed to have very high degree.  This has the advantage that you can definitely find a curve that passes through all the points.  But it also has the disadvantage that you can definitely find a curve that passes through all the points.  You are likely to overfit!  Your wildly wiggly curve, engineered to exactly fit the data you trained on, is unlikely to generalize well to future data.

Everybody knows about this problem, everybody knows to worry about it.  But here’s the thing.  A lot of modern problems are of this form, and yet the optima we find on training data often do generalize pretty well to test data!  Why?

Make this more formal.  Let’s say for the sake of argument you’re trying to learn a real-valued function F, which you hypothesize is drawn from some giant space X.  (Not necessarily a vector space, just any old space.)  You have N training pairs (x_i, y_i), and a good choice for F might be one such that F(x_i) = y_i.  So you might try to find F such that

$F(x_i) = y_i$

for all i.  But if X is big enough, there will be a whole space of functions F which do the trick!  The solution set to

$F(\mathbf{x}) = \mathbf{y}$

will be some big subspace F_{x,y} of X.  How do you know which of these F’s to pick?

One popular way is to regularize; you decide that some elements of X are just better than others, and choose the point of F_{x,y} that optimizes that objective.  For instance, if you’re curve-fitting, you might try to find, among those curves passing through your N points, the least wiggly one (e.g. the one with the least total curvature.)  Or you might optimize for some combination of hitting the points and non-wiggliness, arriving at a compromise curve that wiggles only mildly and still passes near most of the points.  (The ultimate version of this strategy would be to retreat all the way back to linear regression.)

But it’s not obvious what regularization objective to choose, and maybe trying to optimize that objective is yet another hard computational problem, and so on and so on.  What’s really surprising is that something much simpler often works pretty well.  Namely:  how would you find F such that F(x) = y in the first place?  You would choose some random F in X, then do some version of gradient descent.  Find the direction in the tangent space to X at F that decreases $||F(\mathbf{x})-\mathbf{y}||$ most steeply, perturb F a bit in that direction, lather, rinse, repeat.

If this process converges, it ought to get you somewhere on the solution space F_{x,y}. But where?  And this is really what Gunasekar’s work is about.  Even if your starting F is distributed broadly, the distribution of the spot where gradient descent “lands” on F_{x,y} can be much more sharply focused.  In some cases, it’s concentrated on a single point!  The “likely targets of gradient descent” seem to generalize better to test data, and in some cases Gunasekar et al can prove gradient descent likes to find the points on F_{x,y} which optimize some regularizer.
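The simplest toy case I know where you can watch this happen is an underdetermined linear problem: one equation a.x = b in two unknowns, so the solution set is a whole line.  The Python sketch below is a standard textbook illustration, not something taken from the talk or the papers.  Plain gradient descent on the squared residual only ever moves x in the direction of a, so from the origin it lands on the minimum-norm solution, and from any other start it lands on the solution closest to the start.  The step size and iteration count are arbitrary choices.

```python
def gradient_descent(a, b, x0, step=0.05, iters=2000):
    """Minimize 0.5 * (a.x - b)^2 over x by plain gradient descent."""
    x = list(x0)
    for _ in range(iters):
        residual = sum(ai * xi for ai, xi in zip(a, x)) - b
        # The gradient is residual * a, so every update is a multiple of a.
        x = [xi - step * residual * ai for ai, xi in zip(a, x)]
    return x


a, b = [1.0, 2.0], 5.0
norm_sq = sum(ai * ai for ai in a)

# Starting from the origin: converges to the minimum-norm solution.
x_from_zero = gradient_descent(a, b, [0.0, 0.0])
min_norm = [ai * b / norm_sq for ai in a]

# Starting elsewhere: converges to the solution nearest the starting point.
start = [3.0, 0.0]
x_from_other = gradient_descent(a, b, start)
r0 = sum(ai * si for ai, si in zip(a, start)) - b
nearest = [si - ai * r0 / norm_sq for ai, si in zip(a, start)]

print(x_from_zero, min_norm)
print(x_from_other, nearest)
```

So here the "implicit regularizer" is just Euclidean distance from the initialization; the interesting cases are the ones where the geometry of X or the flavor of descent changes which point on the solution set gets picked.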

I was really struck by this outlook.  I have tended to think of function learning as a problem of optimization; how can you effectively minimize the training loss ||F(x)  – y||?  But Gunasekar asks us instead to think about the much richer mathematical structure of the dynamical system of gradient descent on X guided by the loss function.  (Or I should say dynamical systems; gradient descent comes in many flavors.)

The dynamical system has a lot more stuff in it!  Think about iterating a function; knowing the fixed points is one thing, but knowing which fixed points are stable and which aren’t, and knowing which stable points have big basins of attraction, tells you way more.

What’s more, the dynamical system formulation is much more natural for learning problems as they are so often encountered in life, with streaming rather than static training data.  If you are constantly observing more pairs (x_i,y_i), you don’t want to have to start over every second and optimize a new loss function!  But if you take the primary object of study to be, not the loss function, but the dynamical system on the hypothesis space X, new data is no problem; your gradient is just a longer and longer sum with each timestep (or you exponentially deweight the older data, whatever you want my friend, the world is yours.)

Anyway.  Loved this talk.  Maybe this dynamical framework is the way other people are already accustomed to think of it but it was news to me.

Slides for a talk of Gunasekar’s similar to the one she gave here

“Characterizing Implicit Bias in terms of Optimization Geometry” (2018)

“Convergence of Gradient Descent on Separable Data” (2018)

A little googling for gradient descent and dynamical systems shows me that, unsurprisingly, Ben Recht is on this train.

## Farblandia

The job fell to me of giving an overview talk about Benson Farb’s entire career at his birthday conference last fall.  Hard task!  I didn’t cover nearly everything but I think I gave a decent impression of what Farbisme is all about.

Update:  For WordPress reasons, you can’t watch the video within this page, but if you click through to Vimeo you can watch it!


## Wanlin Li, “Vanishing of hyperelliptic L-functions at the central point”

My Ph.D. student Wanlin Li has posted her first paper!  And it’s very cool.  Here’s the idea.  If chi is a real quadratic Dirichlet character, there’s no reason the special value L(1/2,chi) should vanish; the functional equation doesn’t enforce it, there’s no group whose rank is supposed to be the order of vanishing, etc.  And there’s an old conjecture of Chowla which says the special value never vanishes.  On the very useful principle that what needn’t happen doesn’t happen.

Alexandra Florea (last seen on the blog here)  gave a great seminar here last year about quadratic L-functions over function fields, which gave Wanlin the idea of thinking about Chowla’s conjecture in that setting.  And something interesting developed: it turns out that Chowla’s conjecture is totally false!  OK, well, maybe not totally false.  Let’s put it this way.  If you count quadratic extensions of F_q(t) up to conductor N, Wanlin shows that at least c N^a of the corresponding L-functions vanish at the center of the critical strip.  The exponent a is either 1/2, 1/3, or 1/5, depending on q.  But it is never 1.  Which is to say that Wanlin’s theorem leaves open the possibility that o(N) of the first N hyperelliptic L-functions vanish at the critical point.  In other words, a density form of Chowla’s conjecture over function fields might still be true; in fact, I’d guess it probably is.

The main idea is to use some algebraic geometry.  To say an L-function vanishes at 1/2 is to say some Frobenius eigenvalue which has to have absolute value q^{1/2} is actually equal to q^{1/2}.  In turn, this is telling you that the hyperelliptic curve over F_q whose L-function you’re studying has a map to some fixed elliptic curve.  Well, that’s something you can make happen by physically writing down equations!  Of course you also need a lower bound for the number of distinct quadratic extensions of F_q(t) that arise this way; this is the most delicate part.

I think it’s very interesting to wonder what the truth of the matter is.  I hope I’ll be back in a few months to tell you what new things Wanlin has discovered about it!