Category Archives: math

Soumya Sankar: Proportion of ordinarity in families of curves over finite fields

What’s the chance that a random curve has ordinary Jacobian? You might instinctively say “It must be probability 1” because the non-ordinary locus is a proper closed subvariety of M_g. (This is not obvious by pure thought, at least to me, and I don’t know who first proved it! I imagine you can check it by explicitly exhibiting a curve of each genus with ordinary Jacobian, but I’m not sure this is the best way.)

Anyway, the point is, this instinctive response is wrong! At least it’s wrong if you interpret the question the way I have in mind, which is to ask: given a random curve X of genus g over F_q, with g growing as q stays fixed, is there a limiting probability that X has ordinary Jacobian? And this might not be 1, in the same way that the probability that a random polynomial over F_q is squarefree is not 1, but 1-1/q.
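Here, as a toy sanity check of my own (not from the post), is a brute-force verification of that 1 - 1/q over F_2:

```python
# Toy check of the squarefree analogy (my own sketch, not from the post): over F_2,
# the proportion of squarefree monic polynomials of degree n >= 2 should be 1 - 1/q = 1/2.
# Polynomials are encoded as Python ints, with bit i holding the coefficient of x^i.

def pmod(a, b):
    # remainder of a modulo b in F_2[x]
    while b and a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def pgcd(a, b):
    while b:
        a, b = b, pmod(a, b)
    return a

def deriv(f):
    # formal derivative in characteristic 2: only odd-degree terms survive
    return sum(((f >> i) & 1) << (i - 1) for i in range(1, f.bit_length(), 2))

def squarefree(f):
    # f is squarefree exactly when gcd(f, f') = 1
    return pgcd(f, deriv(f)) == 1

for n in range(2, 11):
    count = sum(squarefree((1 << n) | t) for t in range(1 << n))
    print(n, count / 2 ** n)  # prints 0.5 every time
```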

Bryden Cais, David Zureick-Brown and I worked out some heuristic guesses for this problem several years ago, based on the idea that the Dieudonne module for a random curve might be a random Dieudonne module, and then working out in some detail what in the Sam Hill one might mean by “random Dieudonne module.” Then we did some numerical experiments which showed that our heuristic looked basically OK for plane curves of high degree, but pretty badly wrong for hyperelliptic curves of high genus. But there was no family of curves for which one could prove either that our heuristic was right or that it was wrong.

Now there is, thanks to my Ph.D. student Soumya Sankar. Unfortunately, there are still no families of curves for which our heuristics are provably right. But there are now several for which it is provably wrong!

15.7% of Artin-Schreier curves over F_2 (that is: Z/2Z-covers of P^1/F_2) are ordinary. (The heuristic proportion given in my paper with Cais and DZB is about 42%, which matches data drawn from plane curves reasonably well.) The reason Sankar can prove this is because, for Artin-Schreier curves, you can test ordinarity (or, more generally, compute the a-number) in terms of the numerical invariants of the ramification points; the a-number doesn’t care where the ramification points are, which would be a more difficult question.

On the other hand, 0% of Artin-Schreier curves over F_q are ordinary for any finite field F_q of odd characteristic! What’s going on? It turns out that it’s only in characteristic 2 that the Artin-Schreier locus is irreducible; in odd characteristic, the locus has irreducible components whose number grows with the genus, and the ordinary curves live on only one of these components. This “explains” the rarity of ordinarity (though this fact alone doesn’t prove that the proportion of ordinary curves goes to 0; Sankar does that another way). Natural question: if you just look at the ordinary component, does the proportion of ordinary curves approach a limit? Sankar shows this proportion is bounded away from 0 in characteristic 3, but in larger characteristics the combinatorics get complicated! (All this stuff, you won’t be surprised to hear, relies on Rachel Pries’s work on the interaction of special loci in M_g with the Newton stratification.)

Sankar also treats the case of superelliptic curves y^n = f(x) in characteristic 2, which turns out to be like that of Artin-Schreier in odd characteristics; a lot of components, only one with ordinary points, probability of ordinarity going to zero.

Really nice paper which raises lots of questions! What about more refined invariants, like the shape of the Newton polygon? What about other families of curves? I’d be particularly interested to know what happens with trigonal curves which (at least in characteristic not 2 or 3, and maybe even then) feel more “generic” to me than curves with extra endomorphisms. Is there any hope for our poor suffering heuristics in a family like that?


Large-scale Pareto-optimal topologies, or: how to describe a hexahedron

I got to meet Karen Caswelch, the CEO of Madison startup SciArtSoft last week. The company is based on tech developed by my colleague Krishnan Suresh. When I looked at one of his papers about this stuff I was happy to find there was a lovely piece of classical solid geometry hidden in it!

Here’s the deal. You want to build some component out of metal, which metal is to be contained in a solid block. So you can think of the problem as: you start with a region V in R^3, and your component is going to be some subregion W in R^3. For each choice of W there’s some measure of “compliance” which you want to minimize; maybe it’s fragility, maybe it’s flexibility, I dunno, depends on the problem. (Sidenote: I think lay English speakers would want “compliance” to refer to something you’d like to maximize, but I’m told this usage is standard in engineering.) (Subsidenote: I looked into this and now I get it — compliance literally refers to flexibility; it is the inverse of stiffness, just like in the lay sense. If you’re a doctor you want your patient to comply with their medication schedule, thus bending to outside pressure, but bending to outside pressure is precisely what you do not want your metal widget to do.)

So you want to minimize compliance, but you also want to minimize the weight of your component, which means you want vol(W) to be as small as possible. These goals are in conflict. Little lacy structures are highly compliant.

It turns out you can estimate compliance by breaking W up into a bunch of little hexahedral regions, computing compliance on each one, and summing. For reasons beyond my knowledge you definitely don’t want to restrict to chopping uniformly into cubes. So a priori you have millions and millions of differently shaped hexahedra. And part of the source of Suresh’s speedup is to gather these into approximate congruence classes so you can do a compliance computation for a whole bunch of nearly congruent hexahedra at once. And here’s where the solid geometry comes in; an old theorem of Cauchy tells you that if you know what a convex polyhedron’s 1-skeleton looks like as a graph, and you know the congruence classes of all the faces, you know the polyhedron up to rigid motion. In particular, you can just triangulate each face of the hexahedron with a diagonal, and record the congruence class by 18 numbers, which you can then record in a hash table. You sort the hashes and then you can instantly see your equivalence classes of hexahedra.
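To make the bookkeeping concrete, here is a sketch of the hash-the-18-lengths idea (my own reconstruction, not code from the paper, and with a vertex labeling I chose arbitrarily):

```python
# Sketch: record each hexahedron by its 12 edge lengths plus one chosen diagonal per face
# (18 numbers total); by Cauchy's rigidity theorem these pin down a convex hexahedron up to
# rigid motion. Rounding to a tolerance buckets nearly congruent hexahedra together.
from collections import defaultdict
from math import dist

# vertex indices of the 12 edges and 6 face diagonals, in a fixed labeling of the 8 vertices
# (vertices 0-3 on the bottom face, 4-7 on the top; the labeling itself is an assumption)
EDGES = [(0,1),(1,2),(2,3),(3,0),(4,5),(5,6),(6,7),(7,4),(0,4),(1,5),(2,6),(3,7)]
DIAGONALS = [(0,2),(4,6),(0,5),(1,6),(2,7),(3,4)]

def signature(vertices, tol=1e-3):
    # 18 lengths, rounded to tolerance, as a hashable key
    lengths = [dist(vertices[i], vertices[j]) for i, j in EDGES + DIAGONALS]
    return tuple(round(L / tol) for L in lengths)

def congruence_classes(hexahedra):
    classes = defaultdict(list)
    for idx, verts in enumerate(hexahedra):   # verts: list of eight (x, y, z) points
        classes[signature(verts)].append(idx)
    return classes
```

The rounding tolerance is what makes the classes approximate: two hexahedra whose 18 lengths agree to within tol land in the same bucket, and the expensive compliance computation then happens once per bucket rather than once per hexahedron.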

(Related: the edge lengths of a tetrahedron determine its volume but the areas of the faces don’t.)


Small baseball triangles

This all started when CJ asked which three baseball stadiums formed the smallest triangle.  And we agreed it had to be the Brewers, the White Sox, and the Cubs, because Milwaukee and Chicago are really close together.

But it seems like cheating to use two teams in the same city.  The most elegant way to forbid that is to ask the question one league at a time.  Which three American League parks form the smallest triangle?  And what about the National League?

First of all, what does “smallest” mean?  There are lots of choices, but (perhaps inspired by the summer we played a lot of Ingress) we asked for the triangle with the smallest area.  Which means you don’t just want the parks to be close together, you want them to be almost collinear!

I asked on Twitter and got lots of proposed answers.  But it wasn’t obvious to me which, if any, were right, so I worked it out myself!  Seamheads has the longitude and latitude of every major league ballpark past and present in a nice .csv file.  How do you compute the area of a spherical triangle given longitudes and latitudes?  You probably already know that the area is given by the excess over pi of the sum of the angles.  But then you gotta look up a formula for the angles.  Or another way:  Distance on the sphere is standard, and then it turns out that there’s a spherical Heron formula for the area of a spherical triangle given its edge lengths!  I guess it’s clear there’s some formula like that, but it’s cool how Heron-like it looks.  Fifteen lines of Python and you’re ready to go!
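Here is roughly what those fifteen lines might look like (a sketch of my own, using L’Huilier’s form of the spherical Heron formula; the coordinates at the end are rough stand-ins for the Milwaukee and Chicago parks, just for illustration):

```python
# Great-circle side lengths from (lat, lon) pairs in degrees, then L'Huilier's
# "spherical Heron" formula for the spherical excess; area = excess * R^2.
from math import radians, sin, cos, asin, atan, tan, sqrt

R_EARTH = 6371.0  # km

def central_angle(p, q):
    (lat1, lon1), (lat2, lon2) = map(lambda t: (radians(t[0]), radians(t[1])), (p, q))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * asin(sqrt(h))

def spherical_triangle_area(p, q, r):
    a, b, c = central_angle(q, r), central_angle(p, r), central_angle(p, q)
    s = (a + b + c) / 2
    # L'Huilier: tan(E/4) = sqrt(tan(s/2) tan((s-a)/2) tan((s-b)/2) tan((s-c)/2))
    E = 4 * atan(sqrt(tan(s / 2) * tan((s - a) / 2) * tan((s - b) / 2) * tan((s - c) / 2)))
    return E * R_EARTH ** 2  # area in km^2

# approximate coordinates, illustration only
print(spherical_triangle_area((43.028, -87.971), (41.830, -87.634), (41.948, -87.656)))
```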

So what are the answers?

We were right that Brewers-White Sox-Cubs form the smallest major league triangle.  And the smallest American League triangle is not so surprising:  Red Sox, Yankees, Orioles, forming a shortish line up the Eastern Seaboard.  But for the National League, the smallest triangle isn’t what you might expect!  A good guess, following what happened in the AL, is Mets-Phillies-Nationals.  And that’s actually the second-smallest.  But the smallest National League triangle is formed by the Phillies, the Nationals, and the Atlanta Braves!  Here’s a piece of the geodesic path from SunTrust Park in Atlanta to Citizen’s Bank Park in Philly, courtesy of GPSVisualizer:

Not only does it go right through DC, it passes about a mile and a half from Nationals Park!

Another fun surprise is the second-smallest major league triangle:  you’d think it would be another triangle with two teams in the same city, but no!  It’s Baltimore-Cincinnati-St. Louis.  Here’s the geodesic path from Oriole Park at Camden Yards to Busch Stadium:

And here’s a closeup:

The geodesic path runs through the Ohio River, about 300m from the uppermost bleachers at Great American Ball Park.  Wow!

Now here’s a question:  should we find it surprising that the smallest triangles involve teams that are pretty far from each other?  If points are placed at random in a circle (which baseball teams are definitely not) do we expect the smallest-area triangles to have small diameter, or do we expect them to be long and skinny?  It’s the latter!  See this paper: “On Smallest Triangles,” by Grimmett and Janson.  Put down n points at random in the unit circle; the smallest-area triangle will typically have area on order 1/n^3, but will have diameter on order 1.  Should that have been what I expected?
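If you want to see this for yourself, here is a quick simulation (my own sketch, not from the paper):

```python
# Drop n random points in the unit disc and look at the smallest-area triangle:
# its area is tiny (order 1/n^3) but its diameter is typically order 1, i.e. long and skinny.
import random
from itertools import combinations
from math import sqrt, pi, cos, sin, dist

def random_disc_point():
    r, theta = sqrt(random.random()), random.uniform(0, 2 * pi)
    return (r * cos(theta), r * sin(theta))

def tri_area(p, q, r):
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2

n = 100
pts = [random_disc_point() for _ in range(n)]
area, tri = min((tri_area(p, q, r), (p, q, r)) for p, q, r in combinations(pts, 3))
print(area)                                              # typically of order 1/n^3
print(max(dist(a, b) for a, b in combinations(tri, 2)))  # typically order 1
```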

PS:  the largest-area major league triangle is Boston-Miami-SF.  Until MLB expands to Mexico City, that is!


Is the Dedekind sum really a function on SL_2?

Here’s an idle thought I wanted to record before I forgot it.

The Dedekind sum comes up in a bunch of disparate places; it’s how you keep track of the way half-integral weight forms like the eta function (aka the discriminant to the one-twelfth) transform under SL_2, it shows up in the topology of modular knots, in the alternating sum of continued fraction coefficients, etc.  It has a weird definition which I find hard to get a feel for.  The Dedekind sum also satisfies Rademacher reciprocity:

D(a,b;c) + D(b,c;a) + D(c,a;b) = \frac{1}{12abc}(a^2 + b^2 + c^2 - 3abc)

If that right-hand side looks familiar, it’s because it’s the very same cubic form whose vanishing defines the Markoff numbers!  Here’s a nice way to interpret it.  Suppose A,B,C are matrices with ABC = 1 and

(1/3)Tr A = a

(1/3)Tr B = b

(1/3)Tr C = c

(Why 1/3?  See this post from last month.)

Then

a^2 + b^2 + c^2 - 3abc = (1/9)(2 + \mathbf{Tr}([A,B]))

(see e.g. this paper of Bowditch.)
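For convenience, here is how the displayed formula falls out of the standard Fricke trace identity (this is just unpacking the reference above, in my own words).  For any A, B in SL_2 one has

\mathbf{Tr}([A,B]) = (\mathrm{Tr}\, A)^2 + (\mathrm{Tr}\, B)^2 + (\mathrm{Tr}\, AB)^2 - \mathrm{Tr}\, A \cdot \mathrm{Tr}\, B \cdot \mathrm{Tr}\, AB - 2

and ABC = 1 forces Tr AB = Tr C^{-1} = Tr C; so substituting (Tr A, Tr B, Tr C) = (3a, 3b, 3c) gives

\mathbf{Tr}([A,B]) = 9(a^2 + b^2 + c^2) - 27abc - 2 = 9(a^2 + b^2 + c^2 - 3abc) - 2

which rearranges to the formula above.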

The well-known invariance of the Markoff form under moves like (a,b,c) -> (a,b,ab-c) now “lifts” to the fact that (the conjugacy class of) [A,B] is unchanged by the action of the mapping class group Gamma(0,4) on the set of triples (A,B,C) with ABC=1.

The Dedekind sum can be thought of as a function on such triples:

D(A,B,C) = D((1/3)Tr A, (1/3) Tr B; (1/3) Tr C).

Is there an alternate definition or characterization of D(A,B,C) which makes Rademacher reciprocity

D(A,B,C) + D(B,C,A) + D(C,A,B) = (1/9)(2 +  \mathbf{Tr}([A,B]))

more manifest?


Why is a Markoff number a third of a trace?

I fell down a rabbit hole this week and found myself thinking about Markoff numbers again.  I blogged about this before when Sarnak lectured here about them.  But I understood one minor point this week that I hadn’t understood then.  Or maybe I understood it then but I forgot.  Which is why I’m blogging this now, so I don’t forget again, or for the first time, as the case may be.

Remember from the last post:  a Markoff number is (1/3)Tr(A), where A is an element of SL_2(Z) obtained by a certain construction.  But why is this an integer?  Isn’t it a weird condition on a matrix to ask that its trace be a multiple of 3?  Where is this congruence coming from?

OK, here’s the idea.  The Markoff story has to do with triples of matrices (A,B,C) in SL_2(Z) with ABC = identity and which generate H, the commutator subgroup of SL_2(Z).  I claim that A, B, and C all have to have trace a multiple of 3!  Why?  Well, this is of course just a statement about triples (A,B,C) of matrices in SL_2(F_3).  But they actually can’t be arbitrary in SL_2(F_3); they lie in the commutator subgroup.  SL_2(F_3) is a double cover of A_4 so it has a map to Z/3Z, which is in fact the full abelianization; so the commutator subgroup has order 8 and in fact you can check it’s a quaternion group.  What’s more, if A is central, then A, B, and C = B^{-1}A^{-1} generate a group which is cyclic mod its center, so they can’t generate all of H.  We conclude that A, B, and C are all non-central elements of the quaternion group.  Thus they have exact order 4, so their eigenvalues are +-i and their trace is 0 in F_3; in other words, the traces of A, B, and C are all multiples of 3.
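If you want to see the quaternion group with your own eyes, here is a little brute-force verification (a sketch of my own):

```python
# Enumerate SL_2(F_3), form its commutator subgroup, and check it has order 8
# with every non-central element of trace 0 mod 3.
from itertools import product

p = 3

def mul(A, B):
    a, b, c, d = A
    e, f, g, h = B
    return ((a*e + b*g) % p, (a*f + b*h) % p, (c*e + d*g) % p, (c*f + d*h) % p)

def inv(A):
    a, b, c, d = A          # det A = 1, so the inverse is the adjugate
    return (d % p, -b % p, -c % p, a % p)

# all 24 elements of SL_2(F_3), as tuples (a, b, c, d)
SL2 = [M for M in product(range(p), repeat=4) if (M[0]*M[3] - M[1]*M[2]) % p == 1]

# commutators [g, h] = g h g^{-1} h^{-1}, then close up under multiplication
H = {mul(mul(g, h), mul(inv(g), inv(h))) for g in SL2 for h in SL2}
while True:
    new = {mul(x, y) for x in H for y in H} - H
    if not new:
        break
    H |= new

center = {(1, 0, 0, 1), (p - 1, 0, 0, p - 1)}    # +I and -I
print(len(H))                                    # 8 (the quaternion group)
print({(M[0] + M[3]) % p for M in H - center})   # {0}: all non-central traces vanish mod 3
```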

In other words:  any minimal generating set for the commutator subgroup of SL_2(Z) consists of two matrices whose traces are both multiples of 3.


On not staying in your lane

This week I’ve been thinking about some problems outside my usual zone of expertise — namely, questions about the mapping class group and the Johnson kernel.  This has been my week:

  • Three days of trying to prove a cohomology class is nonzero;
  • Then over Thanksgiving I worked out an argument that it was zero and was confused about that for a couple of days because I feel quite deeply that it shouldn’t be zero;
  • This morning I managed to make a kind of philosophical peace with the class being zero and was working out which refined version of the class might not be zero;
  • This afternoon I was able to find the mistake in my argument that the class was zero so now I hope it’s not zero again.
  • But I still don’t know.

There’s a certain frustration, knowing that I’ve spent a week trying to compute something which some decently large number of mathematicians could probably sit down and just do, because they know their way around this landscape.  But on the other hand, I would never want to give up the part of math research that involves learning new things as if I were a grad student.  It is not the most efficient way, in the short term, to figure out whether this class is zero or not, but I think it probably helps me do math better in a global sense that I spend some of my weeks stumbling around unfamiliar rooms in the dark.  Of course I might just be rationalizing something I enjoy doing.  Even if it’s frustrating.  Man, I hope that class isn’t zero.

Naser Talebizadeh Sardari, Hecke eigenvalues, and Chabauty in the deformation space

Naser Sardari is finishing a postdoc at Wisconsin this year and just gave a beautiful talk about his new paper.  Now Naser thinks of this as a paper about automorphic forms — and it is — but I want to argue that it is also a paper which develops an unexpected new form of the Chabauty method!  As I will now explain.  Tell me if you buy it.

First of all, what does Naser prove?  As the title might suggest, it’s a statement about the multiplicity of Hecke eigenvalues a_p; in this post, we’re just going to talk about the eigenvalue zero.  The Hecke operator T_p acts on the space of weight-k modular forms on Gamma_0(N); how many zero eigenvectors can it have, as k goes to infinity with N,p fixed?  If you believe conjectures of Maeda type, you might expect that the Hecke algebra acts irreducibly on the space S_k(Gamma_0(N)); of course this doesn’t rule out that one particular Hecke operator might have some zeroes, but it should make it seem pretty unlikely.

And indeed, Naser proves that the number of zero eigenvectors is bounded independently of k, and even gives an explicit upper bound. (When the desired value of a_p is nonzero, T_p has finite slope and we can reduce to a problem about modular forms in a single p-adic family; in that context, a uniform bound is easy, and one can even show that the number of such forms of weight <k grows very very very very slowly with k, where each “very” is a log; this is worked out on Frank Calegari’s blog. On the other hand, as Naser points out below in comments, if you ask about the “Hecke angle” a_p/p^{(k-1)/2}, we don’t know how to get any really good bound in the nonzero case. I think the conjecture is that you always expect finite multiplicity in either setting even if you range over all k.)

What I find most striking is the method of proof and its similarity to the Chabauty method!  Let me explain.  The basic idea of Naser’s paper is to set this up in the language of deformation theory, with the goal of bounding the number of weight-k p-adic Galois representations rho which could be the representations attached to weight-k forms with T_p = 0.

We can pin down the possible reductions mod p of such a form to a finite number of possibilities, and this number is independent of k, so let’s fix a residual representation rhobar once and for all.

The argument takes place in R_loc, the ring of deformations of rhobar|G_{Q_p}.  And when I say “the ring of deformations” I mean “the ring of deformations subject to whatever conditions are important,” I’m just drawing a cartoon here.  Anyway, R_loc is some big p-adic power series ring; or we can think of the p-adic affine space Spec R_loc, whose Z_p-points we can think of as the space of deformations of rhobar to p-adic local representations.  This turns out to be 5-dimensional in Naser’s case.

Inside Spec R_loc, we have the space of local representations which extend to global ones; let’s call this locus Spec R_glob.  This is still a p-adic manifold but it’s cut out by global arithmetic conditions and its dimension will be given by some computation in Galois cohomology over Q; it turns out to be 3.

But also inside Spec R_loc, we have a submanifold Z cut out by the condition that a_p is not just 0 mod p, it is 0 on the nose, and that the determinant is the k-th power of the cyclotomic character for the particular k you have in mind.  This manifold, which is 2-dimensional, is something you could define without ever knowing there was such a thing as Q; it’s just some closed locus in the deformation space of rhobar|Gal(Q_p).

But the restriction of rho to Gal(Q_p) is a point psi of R_loc which has to lie in both these two spaces, the local one which expresses the condition “psi looks like the representation of Gal(Q_p) attached to a weight-k modular form with a_p = 0” and the global one which expresses the condition “psi is the restriction to Gal(Q_p) of a representation of Gal(Q) unramified away from some specified set of primes.”  So psi lies in the intersection of the 3-dimensional locus and the 2-dimensional locus in 5-space, and the miracle is that you can prove this intersection is transverse, which means it consists of a finite set of points, and what’s more, it is a set of points whose cardinality you can explicitly bound!
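Just to make the dimension count explicit (this is only arithmetic on the numbers from the last few paragraphs):

\dim Z + \dim \mathrm{Spec}\, R_{\mathrm{glob}} - \dim \mathrm{Spec}\, R_{\mathrm{loc}} = 2 + 3 - 5 = 0

so the expected dimension of the intersection is zero, and a transverse intersection really is a finite set of points.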

If this sounds familiar, it’s because it’s just like Chabauty.  There, you have a curve C and its Jacobian J.  The analogue of R_loc is J(Q_p), or rather let’s say a neighborhood of the identity in J(Q_p) which looks like affine space Q_p^g.

The analogue of R_glob is (the p-adic closure of) J(Q), which is a proper subspace of dimension r, where r is the rank of J(Q), something you can compute or at least bound by Galois cohomology over Q.  (Of course it can’t be a proper subspace of dimension r if r >= g, which is why Chabauty doesn’t work in that case!)

The analogue of Z is C(Q_p); this is something defined purely p-adically, a locus you could talk about even if you had no idea your C/Q_p were secretly the local manifestation of a curve over Q.

And any rational point of C(Q), considered as a point in J(Q_p), has to lie in both C(Q_p) and J(Q), whose dimensions are 1 and at most g-1, and once again the key technical tool is that this intersection can be shown to be transverse, whence finite, so C(Q) is finite and you have Mordell’s conjecture in the case r < g.  And, as Coleman observed decades after Chabauty, this method even allows you to get an explicit bound on the number of points of C(Q), though not an effective way to compute them.
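The same dimension count on the Chabauty side, assuming r < g (that is, r \leq g - 1):

\dim C(\mathbf{Q}_p) + \dim \overline{J(\mathbf{Q})} - \dim J(\mathbf{Q}_p) \leq 1 + (g-1) - g = 0

so once more a transverse intersection is a finite set of points.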

I think this is a very cool confluence indeed!  In the last ten years we've seen a huge amount of work refining Chabauty; Matt Baker discusses some of it on his blog, and then there’s the whole nonabelian Chabauty direction launched by Minhyong Kim and pushed forward by Jen Balakrishnan and Netan Dogra and many others.  Are there other situations in which we can get meaningful results from “deformation-theoretic Chabauty,” and are the new technical advances in Chabauty methods relevant in this context?


A type of unproductivity

There is a certain very special type of unproductivity that I have experienced only in math.  You are working on something and you feel almost certain your strategy is not going to work.  In fact, it is more likely than not that your strategy is not even a new strategy, but a variant on something you’ve already tried unsuccessfully — or maybe not even actually a variant, but just a rephrasing you’ve fooled yourself into thinking is a variant.  So you are not sure whether you are actually working on something at all.  In fact, you doubt it.

And yet you keep going!  Because what if it works?

 


Heights on stacks and heights on vector bundles over stacks

I’ve been giving a bunch of talks about work with Matt Satriano and David Zureick-Brown on the problem of defining the “height” of a rational point on a stack.  The abstract usually looks something like this:

Here are two popular questions in number theory:

1.  How many degree-d number fields are there with discriminant at most X?
2.  How many rational points are there on a cubic surface with height at most X?

Our expectations about the first question are governed by Malle’s conjecture; about the second, by the Batyrev-Manin conjecture.  The forms of the conjectures are very similar, predicting in both cases an asymptotic of the form c X^a (log X)^b, and this is no coincidence: I will explain how to think of both questions in a common framework, that of counting points of bounded height on an algebraic stack.  A serious obstacle is that there is no definition of the height of a rational point on a stack.  I will propose a definition and try to convince you it’s the right one.  If there’s time, I’ll also argue that when we talk about heights with respect to a line bundle we have always secretly meant “vector bundle,” or should have.

(joint work with Matt Satriano and David Zureick-Brown)

Frank Calegari asked a good question after I talked about this at Mazur’s birthday conference.  And other people have asked me the same question!  So I thought I’d write about it here on the blog.

An actual (somewhat tangential) math question about your talk: when it comes (going back to the original problem) of extensions with Galois group G, there is (as you well know) a natural cover \mathbf{A}^n/G \rightarrow \cdot/G, and the source has a nice smooth unirational open subscheme which is much less stacky object and could possibly still be used to count G-extensions (or rather, to count G-polynomials). How does this picture interact (if at all) with your talk or the Malle conjecture more generally?

Here’s an answer.  Classically, how do we count degree-n extensions of Q?  We count monic degree-n polynomials with bounded coefficients; that is, we count integral points of bounded height on A^n / S_n, which is isomorphic to A^n, the space of monic degree-n polynomials.

Now A^n / S_n is the total space of a vector bundle over the stack B(S_n).  So you might say that what we’re doing is using “points on the total space of a vector bundle E/X as a proxy for points on X.”  And when you put it that way, you see that it’s what people who work on rational points do all the time!  What do we do when we count rational points on P^1?  We count pairs of coprime integers in a box; in other words, we count integral points on A^2 – 0, which is the total space (sans zero section) of a line bundle on P^1.  More generally, in many cases where people can prove the Batyrev-Manin conjecture for a variety X, it’s precisely by means of passing to a “universal torsor” — the total space of a vector bundle (or a torus bundle sitting in a vector bundle) over X.  Sometimes you can use this technique to get actual asymptotics for rational points on X; other times you just get bounds; if you can prove that, for any x in X(Q), there is a point on the fiber E_x whose height is at most F(height(x)) for some reasonable function F, you can parlay upper bounds for points on E into upper bounds for points on X.  In the classical case, this is the part where we argue that (by Minkowski) a number field with discriminant D contains an algebraic integer whose characteristic polynomial has coefficients bounded in terms of D.
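As a toy illustration of that proxy idea (my own sketch; the height convention max(|a|,|b|) on P^1 is just the standard one):

```python
# Count points of P^1(Q) of height <= B by counting coprime integer pairs in a box,
# i.e. integral points on A^2 minus the origin, up to the sign ambiguity (a, b) ~ (-a, -b).
from math import gcd

def count_P1(B):
    count = 0
    for a in range(-B, B + 1):
        for b in range(-B, B + 1):
            if (a, b) != (0, 0) and gcd(a, b) == 1:
                count += 1
    return count // 2   # (a, b) and (-a, -b) give the same point of P^1

for B in (10, 100, 1000):
    N = count_P1(B)
    print(B, N, N / B**2)   # the ratio tends to 12/pi^2, about 1.216
```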

So coming back to the original question:  how do you know which vector bundle on BG is a good one to think about?  Actually, this is far from clear!  The very first thing I ever wrote about counting number fields, my first paper with Akshay, gave new upper bounds for the number of degree-n extensions, by counting points on

(\mathbf{A}^n)^m / S_n

where S_n acts diagonally.  In other words, we used a different vector bundle on B(S_n) than the “standard” one, and showed that by optimizing m (and being careful about stripping out loci playing the role of accumulating subvarieties) we could get better upper bounds than the ones coming from counting polynomials.

So apparently I’ve been counting points on vector bundles on stacks all along…!


The Lovasz number of the plane is about 3.48287

As seen in this comment on Polymath and explicated further in Fernando de Oliveira Filho’s thesis, section 4.4.

I actually spent much of today thinking about this so let me try to explain it in a down-to-earth way, because it involved me thinking about Bessel functions for the first time ever, surely a life event worthy of recording.

So here’s what we’re going to do.  As I mentioned last week, you can express this problem as follows:  suppose you have a map h: R^2 -> V, for some normed vector space V, which is a unit-distance embedding; that is, if |x-x’|_{R^2} = 1, then |h(x)-h(x’)|_V = 1.  (We don’t ask that h is an isometry, only that it preserves the distance-1 set.)

Then let t be the squared radius of the smallest hypersphere in V containing h(R^2).

Then any graph G embeddable in R^2 with all edges of length 1 is sent to a unit-distance graph in V contained in the hypersphere of squared radius t; this turns out to be equivalent to saying the Lovasz number of G (ok, really I mean the Lovasz number of the complement of G) is at most 1/(1-2t).  So the point is that we want to show t is less than 1/2.  Or rather:  we can find a V and a map from R^2 to V to make this the case.

So here’s one!  Let V be the space of L^2 functions on R^2 with the usual inner product.  Choose a square-integrable function F on R^2 — in fact let’s normalize to make F^2 integrate to 1 — and for each a in R^2 we let h(a) be the function F(x-a).

We want the distance between F(x-a) and F(x-b) to be the same for every pair of points at distance 1 from each other; the easiest way to arrange that is to insist that F(x) be a radially symmetric function F(x) = f(|x|); then it’s easy to see that the distance between F(x-a) and F(x-b) in V is a function G(a-b) which depends only on |a-b|.  We write

g(r) = \int_{\mathbf{R}^2} F(x)F(x-r) dx

so that the squared distance between F(x) and F(x-r) is

\int F(x)^2 dx - 2 \int F(x)F(x-r) dx + \int F(x-r)^2 dx = 2(1-g(r)).

In particular, if two points in R^2 are at distance 1, the squared distance between their images in V is 2(1-g(1)).  Note also that g(0) is the square integral of F, which is 1.

What kind of hypersphere encloses all the points F(x-a) in V?  We can just go ahead and take the “center” of our hypersphere to be 0; since |F| = 1, every point in h(R^2) lies in (indeed, lies on) the sphere of radius 1 around the origin.

Hey but remember:  we want to study a unit-distance embedding of R^2 in V.  Right now, h sends points at distance 1 to points whose squared distance is 2(1-g(1)), whatever that is.  We can fix that by dividing h by the square root of that number.  So now h sends unit distances to unit distances, and its image is enclosed in a hypersphere of squared radius

\frac{1}{2(1-g(1))}

The more negative g(1) is, the smaller this sphere is, which means the more we can “fold” R^2 into a small space.  Remember, the relationship between hypersphere number and Lovasz theta is

2t + 1/\theta = 1

and plugging in the above bound for the hypersphere number, we find that the Lovasz theta number of R^2, and thus the Lovasz theta number of any unit-distance graph in R^2, is at most

1-1/g(1).

So the only question is — what is g(1)?

Well, that depends on what g is.

Which depends on what F is.

Which depends on what f is.

And of course we get to choose what f is, in order to make g(1) as negative as possible.

How do we do this?  Well, here’s the trick.  The function G is not arbitrary; if it were, we could make g(1) whatever we wanted.  It’s not hard to see that G is what’s called a positive definite function on R^2.  And moreover, if G is positive definite, there exists some f giving rise to it.  (Roughly speaking, this is the fact that a positive definite symmetric matrix has a square root.)  So we ask:  if G is a positive definite (radially symmetric) function on R^2, and g(0) = 1, how small can g(1) be?

And now there’s an old theorem of (Wisconsin’s own!) Isaac Schoenberg which helpfully classifies the radially symmetric positive definite functions on R^2; they are precisely the functions G(x) = g(|x|) where g is a mixture of scalings of the Bessel function J_0:

g(r) = \int_0^\infty J_0(ur) A(u) \, du

for some everywhere nonnegative A(u).  (Actually it’s more correct to say that A is a distribution and we are integrating J_0(ur) against a non-decreasing measure.)

So g(1) can be no smaller than the minimum value of J_0 on [0,\infty), and in fact can be exactly that small if you let A become narrowly supported around the minimum argument.  This is basically just taking g to be a rescaled version of J_0 which achieves its minimum at 1.  That minimum value is about -0.403, and so the Lovasz theta for any unit-distance subgraph of the plane is bounded above by a number that’s about 1 + 1/0.403 ≈ 3.48.
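For what it’s worth, the headline number in the title is just this computation (a sketch assuming scipy is available; the only input is the bound 1 - 1/g(1) with g(1) the minimum of J_0):

```python
# The minimum of J_0 on [0, infinity) is its first local minimum,
# which sits between its first two zeros (roughly 2.4 and 5.5).
from scipy.optimize import minimize_scalar
from scipy.special import j0

res = minimize_scalar(j0, bounds=(2.0, 6.0), method="bounded")
g1 = res.fun              # about -0.402759, the most negative value g(1) can take
print(1 - 1 / g1)         # about 3.48287
```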

To sum up:  I give you a set of points in the plane, I connect every pair that’s at distance 1, and I ask how you can embed that graph in a small hypersphere keeping all the distances 1.  And you say:  “Oh, I know what to do, just assign to each point a the radially symmetrized Bessel function J_0(|x-a|) on R^2, the embedding of your graph in the finite-dimensional space of functions spanned by those Bessel translates will do the trick!”

That is cool!

Remark: Oliveira’s thesis does this for Euclidean space of every dimension (it gets more complicated.)  And I think (using analysis I haven’t really tried to understand) he doesn’t just give an upper bound for the Lovasz number of the plane as I do in this post, he really computes that number on the nose.

Update:  DeCorte, Oliveira, and Vallentin just posted a relevant paper on the arXiv this morning!
