## Large-scale Pareto-optimal topologies, or: how to describe a hexahedron

I got to meet Karen Caswelch, the CEO of Madison startup SciArtSoft last week. The company is based on tech developed by my colleague Krishnan Suresh. When I looked at one of his papers about this stuff I was happy to find there was a lovely piece of classical solid geometry hidden in it!

Here’s the deal. You want to build some component out of metal, which is to be carved out of a solid block. So you can think of the problem as: you start with a region V in R^3, and your component is going to be some subregion W of V. For each choice of W there’s some measure of “compliance” which you want to minimize; maybe it’s fragility, maybe it’s flexibility, I dunno, depends on the problem. (Sidenote: I think lay English speakers would want “compliance” to refer to something you’d like to maximize, but I’m told this usage is standard in engineering.) (Subsidenote: I looked into this and now I get it — compliance literally refers to flexibility; it is the inverse of stiffness, just like in the lay sense. If you’re a doctor you want your patient to comply with their medication schedule, thus bending to outside pressure, but bending to outside pressure is precisely what you do not want your metal widget to do.)

So you want to minimize compliance, but you also want to minimize the weight of your component, which means you want vol(W) to be as small as possible. These goals are in conflict. Little lacy structures are highly compliant.

It turns out you can estimate compliance by breaking W up into a bunch of little hexahedral regions, computing compliance on each one, and summing. For reasons beyond my knowledge you definitely don’t want to restrict to chopping uniformly into cubes. So a priori you have millions and millions of differently shaped hexahedra. And part of the source of Suresh’s speedup is to gather these into approximate congruence classes so you can do a compliance computation for a whole bunch of nearly congruent hexahedra at once. And here’s where the solid geometry comes in; an old theorem of Cauchy tells you that if you know what a convex polyhedron’s 1-skeleton looks like as a graph, and you know the congruence classes of all the faces, you know the polyhedron up to rigid motion. In particular, you can just triangulate each face of the hexahedron with a diagonal, and record the congruence class by 18 numbers (the 12 edge lengths plus one diagonal for each of the 6 faces), which you can then record in a hash table. You sort the hashes and then you can instantly see your equivalence classes of hexahedra.
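
If you want to see the bucketing idea in code, here’s a minimal Python sketch. To be clear, this is my own toy version, not Suresh’s actual algorithm: the vertex ordering, the choice of face diagonals, and the rounding tolerance are all made up for illustration.

```python
import math

# Fixed vertex ordering for a hexahedron with unit-cube topology:
# vertices 0-3 are the bottom face (counterclockwise), 4-7 the top.
EDGES = [(0,1),(1,2),(2,3),(3,0),(4,5),(5,6),(6,7),(7,4),(0,4),(1,5),(2,6),(3,7)]
FACES = [(0,1,2,3),(4,5,6,7),(0,1,5,4),(1,2,6,5),(2,3,7,6),(3,0,4,7)]

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def congruence_key(verts, decimals=6):
    """The 18 numbers determining the hexahedron up to rigid motion:
    12 edge lengths plus one chosen diagonal per face, rounded so that
    nearly congruent hexahedra usually land in the same bucket.
    (A real implementation would quantize more robustly than round().)"""
    lengths = [dist(verts[i], verts[j]) for i, j in EDGES]
    lengths += [dist(verts[f[0]], verts[f[2]]) for f in FACES]  # face diagonals
    return tuple(round(x, decimals) for x in lengths)

def bucket_by_congruence(hexahedra):
    """Group a list of hexahedra (each a list of 8 points) by congruence class."""
    buckets = {}
    for h in hexahedra:
        buckets.setdefault(congruence_key(h), []).append(h)
    return buckets
```

A rigidly moved copy of a hexahedron gets the same key, while a squashed copy gets a different one, so sorting the keys groups the (near-)congruence classes exactly as described above.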

## Small baseball triangles

This all started when CJ asked which three baseball stadiums formed the smallest triangle.  And we agreed it had to be the Brewers, the White Sox, and the Cubs, because Milwaukee and Chicago are really close together.

But it seems like cheating to use two teams in the same city.  The most elegant way to forbid that is to ask the question one league at a time.  Which three American League parks form the smallest triangle?  And what about the National League?

First of all, what does “smallest” mean?  There are lots of choices, but (perhaps inspired by the summer we played a lot of Ingress) we asked for the triangle with the smallest area.  Which means you don’t just want the parks to be close together, you want them to be almost collinear!

I asked on Twitter and got lots of proposed answers.  But it wasn’t obvious to me which, if any, were right, so I worked it out myself!  Seamheads has the longitude and latitude of every major league ballpark past and present in a nice .csv file.  How do you compute the area of a spherical triangle given longitudes and latitudes?  You probably already know that the area is given by the excess over pi of the sum of the angles.  But then you gotta look up a formula for the angles.  Or another way:  Distance on the sphere is standard, and then it turns out that there’s a spherical Heron formula for the area of a spherical triangle given its edge lengths!  I guess it’s clear there’s some formula like that, but it’s cool how Heron-like it looks.  Fifteen lines of Python and you’re ready to go!
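
Here’s roughly what those fifteen lines look like, in case you want to play along; this is my own sketch, using the haversine formula for great-circle distances and l’Huilier’s spherical Heron formula for the excess:

```python
import math

def central_angle(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points (degrees), in radians,
    via the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    h = math.sin(dlat/2)**2 + math.cos(p1)*math.cos(p2)*math.sin(dlon/2)**2
    return 2 * math.asin(math.sqrt(h))

def spherical_triangle_area(pts, radius=1.0):
    """Area of the spherical triangle with vertices pts = [(lat, lon)]*3.
    l'Huilier's formula gives the spherical excess E via
    tan(E/4) = sqrt(tan(s/2) tan((s-a)/2) tan((s-b)/2) tan((s-c)/2))."""
    A, B, C = pts
    a = central_angle(*B, *C)
    b = central_angle(*A, *C)
    c = central_angle(*A, *B)
    s = (a + b + c) / 2
    t = math.tan(s/2) * math.tan((s-a)/2) * math.tan((s-b)/2) * math.tan((s-c)/2)
    E = 4 * math.atan(math.sqrt(max(t, 0.0)))
    return E * radius**2
```

Feed it three park coordinates with `radius=6371.0` (kilometers) and you get areas you can sort; the octant triangle with vertices at (0°, 0°), (0°, 90°), and the north pole comes out to π/2 steradians, as it should.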

We were right that Brewers-White Sox-Cubs form the smallest major league triangle.  And the smallest American League triangle is not so surprising:  Red Sox, Yankees, Orioles, forming a shortish line up the Eastern Seaboard.  But for the National League, the smallest triangle isn’t what you might expect!  A good guess, following what happened in the AL, is Mets-Phillies-Nationals.  And that’s actually the second-smallest.  But the smallest National League triangle is formed by the Phillies, the Nationals, and the Atlanta Braves!  Here’s a piece of the geodesic path from SunTrust Park in Atlanta to Citizens Bank Park in Philly, courtesy of GPSVisualizer:

Not only does it go right through DC, it passes about a mile and a half from Nationals Park!

Another fun surprise is the second-smallest major league triangle:  you’d think it would be another triangle with two teams in the same city, but no!  It’s Baltimore-Cincinnati-St. Louis.  Here’s the geodesic path from Oriole Park at Camden Yards to Busch Stadium:

And here’s a closeup:

The geodesic path runs through the Ohio River, about 300m from the uppermost bleachers at Great American Ball Park.  Wow!

Now here’s a question:  should we find it surprising that the smallest triangles involve teams that are pretty far from each other?  If points are placed at random in a circle (which baseball teams are definitely not) do we expect the smallest-area triangles to have small diameter, or do we expect them to be long and skinny?  It’s the latter!  See this paper: “On Smallest Triangles,” by Grimmett and Janson.  Put down n points at random in the unit circle; the smallest-area triangle will typically have area on order 1/n^3, but will have diameter on order 1.  Should that have been what I expected?

PS:  the largest-area major league triangle is Boston-Miami-SF.  Until MLB expands to Mexico City, that is!

## Is the Dedekind sum really a function on SL_2?

Here’s an idle thought I wanted to record before I forgot it.

The Dedekind sum comes up in a bunch of disparate places: it’s how you keep track of the way half-integral weight forms like the eta function (aka the discriminant to the one-twelfth power) transform under SL_2; it shows up in the topology of modular knots, in the alternating sum of continued fraction coefficients, etc.  It has a weird definition which I find hard to get a feel for.  The Dedekind sum also satisfies Rademacher reciprocity:

$D(a,b;c) + D(b,c;a) + D(c,a;b) = \frac{1}{12abc}(a^2 + b^2 + c^2 - 3abc)$
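
Reciprocity is easy to check by machine in exact arithmetic. Here’s a sketch assuming the normalization $D(a,b;c) = \sum_{k \bmod c} ((ak/c))((bk/c))$, where $((x))$ is the usual sawtooth function, and assuming a, b, c pairwise coprime (the setting in which Rademacher’s identity holds):

```python
from fractions import Fraction
from math import gcd

def saw(x):
    """Sawtooth ((x)): x - floor(x) - 1/2 for non-integral x, and 0 at integers."""
    if x.denominator == 1:
        return Fraction(0)
    return x - (x.numerator // x.denominator) - Fraction(1, 2)

def D(a, b, c):
    """Generalized Dedekind sum D(a,b;c) = sum over k mod c of ((ak/c))((bk/c))."""
    return sum(saw(Fraction(a * k, c)) * saw(Fraction(b * k, c)) for k in range(c))

for (a, b, c) in [(1, 1, 3), (2, 3, 5), (3, 5, 7), (4, 9, 25)]:
    assert gcd(a, b) == gcd(b, c) == gcd(a, c) == 1
    lhs = D(a, b, c) + D(b, c, a) + D(c, a, b)
    rhs = Fraction(a*a + b*b + c*c - 3*a*b*c, 12*a*b*c)
    assert lhs == rhs  # Rademacher reciprocity, exactly
```

(Note that the −3abc term in the numerator is just the constant −1/4 in disguise: (a² + b² + c² − 3abc)/(12abc) = (a² + b² + c²)/(12abc) − 1/4.)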

If that right-hand side looks familiar, it’s because it’s the very same cubic form whose vanishing defines the Markoff numbers!  Here’s a nice way to interpret it.  Suppose A,B,C are matrices with ABC = 1 and

(1/3)Tr A = a

(1/3)Tr B = b

(1/3)Tr C = c

(Why 1/3?  See this post from last month.)

Then

$a^2 + b^2 + c^2 - 3abc = (1/9)(2 + \mathbf{Tr}([A,B]))$

(see e.g. this paper of Bowditch.)
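
This is the classical Fricke trace identity $\mathrm{Tr}([A,B]) = x^2 + y^2 + z^2 - xyz - 2$ with $x = \mathrm{Tr}\,A$, $y = \mathrm{Tr}\,B$, $z = \mathrm{Tr}\,AB$, divided through by 9; it’s easy to confirm by machine on random words in SL_2(Z):

```python
from fractions import Fraction
import random

def mul(M, N):
    return [[M[0][0]*N[0][0] + M[0][1]*N[1][0], M[0][0]*N[0][1] + M[0][1]*N[1][1]],
            [M[1][0]*N[0][0] + M[1][1]*N[1][0], M[1][0]*N[0][1] + M[1][1]*N[1][1]]]

def inv(M):  # inverse in SL_2 (uses det = 1)
    return [[M[1][1], -M[0][1]], [-M[1][0], M[0][0]]]

def tr(M):
    return M[0][0] + M[1][1]

S = [[0, -1], [1, 0]]
T = [[1, 1], [0, 1]]

random.seed(1)
for _ in range(20):
    A = [[1, 0], [0, 1]]
    B = [[1, 0], [0, 1]]
    for _ in range(8):  # random words in the generators S, T of SL_2(Z)
        A = mul(A, random.choice([S, T, inv(S), inv(T)]))
        B = mul(B, random.choice([S, T, inv(S), inv(T)]))
    C = inv(mul(A, B))  # so that ABC = 1
    a, b, c = (Fraction(tr(M), 3) for M in (A, B, C))
    comm = mul(mul(A, B), mul(inv(A), inv(B)))
    assert a*a + b*b + c*c - 3*a*b*c == Fraction(2 + tr(comm), 9)
```

Note that the 1/3’s don’t need the traces to be multiples of 3 here; the identity is a polynomial identity in the traces, so it holds with exact rational arithmetic for any A, B in SL_2.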

The well-known invariance of the Markoff form under moves like (a,b,c) -> (a,b,ab-c) now “lifts” to the fact that (the conjugacy class of) [A,B] is unchanged by the action of the mapping class group Gamma(0,4) on the set of triples (A,B,C) with ABC=1.

The Dedekind sum can be thought of as a function on such triples:

D(A,B,C) = D((1/3)Tr A, (1/3) Tr B; (1/3) Tr C).

Is there an alternate definition or characterization of D(A,B,C) which makes Rademacher reciprocity

$D(A,B,C) + D(B,C,A) + D(C,A,B) = (1/9)(2 + \mathbf{Tr}([A,B]))$

more manifest?

## Why is a Markoff number a third of a trace?

I fell down a rabbit hole this week and found myself thinking about Markoff numbers again.  I blogged about this before when Sarnak lectured here about them.  But I understood one minor point this week that I hadn’t understood then.  Or maybe I understood it then but I forgot.  Which is why I’m blogging this now, so I don’t forget again, or for the first time, as the case may be.

Remember from the last post:  a Markoff number is (1/3)Tr(A), where A is an element of SL_2(Z) obtained by a certain construction.  But why is this an integer?  Isn’t it a weird condition on a matrix to ask that its trace be a multiple of 3?  Where is this congruence coming from?

OK, here’s the idea.  The Markoff story has to do with triples of matrices (A,B,C) in SL_2(Z) with ABC = identity and which generate H, the commutator subgroup of SL_2(Z).  I claim that A, B, and C all have to have trace a multiple of 3!  Why?  Well, this is of course just a statement about the images of (A,B,C) in SL_2(F_3).  But those images can’t be arbitrary; they lie in the commutator subgroup of SL_2(F_3).  SL_2(F_3) is a double cover of A_4, so it has a map to Z/3Z, which is in fact the full abelianization; so the commutator subgroup has order 8, and in fact you can check it’s a quaternion group.  What’s more, if A is central, then A, B, and C = B^{-1}A^{-1} generate a group which is cyclic mod its center, so they can’t generate all of H.  We conclude that A, B, and C are all non-central elements of the quaternion group.  Thus they have exact order 4, so their eigenvalues are +-i and their trace is 0 in F_3; that is, the trace of the original integer matrix is a multiple of 3.

In other words:  any minimal generating set for the commutator subgroup of SL_2(Z) consists of two matrices whose traces are both multiples of 3.
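
This is a small enough computation to verify by brute force; here’s a sketch that enumerates SL_2(F_3), closes up the set of commutators, and checks the order, the traces, and the quaternion structure:

```python
from itertools import product

P = 3  # work in SL_2(F_3)

def mul(M, N):
    return (((M[0][0]*N[0][0] + M[0][1]*N[1][0]) % P,
             (M[0][0]*N[0][1] + M[0][1]*N[1][1]) % P),
            ((M[1][0]*N[0][0] + M[1][1]*N[1][0]) % P,
             (M[1][0]*N[0][1] + M[1][1]*N[1][1]) % P))

def inv(M):  # valid since det M = 1
    return ((M[1][1] % P, -M[0][1] % P), (-M[1][0] % P, M[0][0] % P))

I = ((1, 0), (0, 1))
minusI = ((2, 0), (0, 2))

# all 24 elements of SL_2(F_3)
G = [((a, b), (c, d)) for a, b, c, d in product(range(P), repeat=4)
     if (a*d - b*c) % P == 1]
assert len(G) == 24

# commutator subgroup: start from all commutators, close under multiplication
H = {mul(mul(A, B), mul(inv(A), inv(B))) for A in G for B in G}
while True:
    new = {mul(X, Y) for X in H for Y in H} - H
    if not new:
        break
    H |= new

assert len(H) == 8  # the quaternion group
for M in H:
    if M not in (I, minusI):
        assert (M[0][0] + M[1][1]) % P == 0  # noncentral => trace 0 mod 3
        assert mul(M, M) == minusI           # and exact order 4
```

Six noncentral elements, each squaring to −I, each of trace 0 mod 3: exactly the quaternion picture described above.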


## On not staying in your lane

This week I’ve been thinking about some problems outside my usual zone of expertise — namely, questions about the mapping class group and the Johnson kernel.  This has been my week:

• Three days of trying to prove a cohomology class is nonzero;
• Then over Thanksgiving I worked out an argument that it was zero and was confused about that for a couple of days because I feel quite deeply that it shouldn’t be zero;
• This morning I was able to make a kind of philosophical peace with the class being zero and was working out which refined version of the class might not be zero;
• This afternoon I was able to find the mistake in my argument that the class was zero so now I hope it’s not zero again.
• But I still don’t know.

There’s a certain frustration, knowing that I’ve spent a week trying to compute something which some decently large number of mathematicians could probably sit down and just do, because they know their way around this landscape.  But on the other hand, I would never want to give up the part of math research that involves learning new things as if I were a grad student.  It is not the most efficient way, in the short term, to figure out whether this class is zero or not, but I think it probably helps me do math better in a global sense that I spend some of my weeks stumbling around unfamiliar rooms in the dark.  Of course I might just be rationalizing something I enjoy doing.  Even if it’s frustrating.  Man, I hope that class isn’t zero.

## Naser Talebizadeh Sardari, Hecke eigenvalues, and Chabauty in the deformation space

Naser Sardari is finishing a postdoc at Wisconsin this year and just gave a beautiful talk about his new paper.  Now Naser thinks of this as a paper about automorphic forms — and it is — but I want to argue that it is also a paper which develops an unexpected new form of the Chabauty method!  As I will now explain.  Tell me if you buy it.

First of all, what does Naser prove?  As the title might suggest, it’s a statement about the multiplicity of Hecke eigenvalues a_p; in this post, we’re just going to talk about the eigenvalue zero.  The Hecke operator T_p acts on the space of weight-k modular forms on Gamma_0(N); how many zero eigenvectors can it have, as k goes to infinity with N,p fixed?  If you believe conjectures of Maeda type, you might expect that the Hecke algebra acts irreducibly on the space S_k(Gamma_0(N)); of course this doesn’t rule out that one particular Hecke operator might have some zeroes, but it should make it seem pretty unlikely.

And indeed, Naser proves that the number of zero eigenvectors is bounded independently of k, and even gives an explicit upper bound. (When the desired value of a_p is nonzero, T_p has finite slope and we can reduce to a problem about modular forms in a single p-adic family; in that context, a uniform bound is easy, and one can even show that the number of such forms of weight <k grows very very very very slowly with k, where each "very" is a log; this is worked out on Frank Calegari’s blog. On the other hand, as Naser points out below in comments, if you ask about the “Hecke angle” a_p/p^{(k-1)/2}, we don’t know how to get any really good bound in the nonzero case. I think the conjecture is that you always expect finite multiplicity in either setting even if you range over all k.)

What I find most striking is the method of proof and its similarity to the Chabauty method!  Let me explain.  The basic idea of Naser’s paper is to set this up in the language of deformation theory, with the goal of bounding the number of weight-k p-adic Galois representations rho which could be the representations attached to weight-k forms with T_p = 0.

We can pin down the possible reductions mod p of such a form to a finite number of possibilities, and this number is independent of k, so let’s fix a residual representation rhobar once and for all.

The argument takes place in R_loc, the ring of deformations of rhobar|G_{Q_p}.  And when I say “the ring of deformations” I mean “the ring of deformations subject to whatever conditions are important,” I’m just drawing a cartoon here.  Anyway, R_loc is some big p-adic power series ring; or we can think of the p-adic affine space Spec R_loc, whose Z_p-points we can think of as the space of deformations of rhobar to p-adic local representations.  This turns out to be 5-dimensional in Naser’s case.

Inside Spec R_loc, we have the space of local representations which extend to global ones; let’s call this locus Spec R_glob.  This is still a p-adic manifold but it’s cut out by global arithmetic conditions and its dimension will be given by some computation in Galois cohomology over Q; it turns out to be 3.

But also inside Spec R_loc, we have a submanifold Z cut out by the condition that a_p is not just 0 mod p, it is 0 on the nose, and that the determinant is the kth power of cyclotomic for the particular k-th power you have in mind.  This manifold, which is 2-dimensional, is something you could define without ever knowing there was such a thing as Q; it’s just some closed locus in the deformation space of rhobar|Gal(Q_p).

But the restriction of rho to Gal(Q_p) is a point psi of Spec R_loc which has to lie in both of these spaces: the local one, which expresses the condition “psi looks like the representation of Gal(Q_p) attached to a weight-k modular form with a_p = 0,” and the global one, which expresses the condition “psi is the restriction to Gal(Q_p) of a representation of Gal(Q) unramified away from some specified set of primes.”  So psi lies in the intersection of the 3-dimensional locus and the 2-dimensional locus in 5-space, and the miracle is that you can prove this intersection is transverse, which means it consists of a finite set of points; what’s more, it is a set of points whose cardinality you can explicitly bound!

If this sounds familiar, it’s because it’s just like Chabauty.  There, you have a curve C and its Jacobian J.  The analogue of R_loc is J(Q_p), or rather let’s say a neighborhood of the identity in J(Q_p) which looks like affine space Q_p^g.

The analogue of R_glob is (the p-adic closure of) J(Q), which is a proper subspace of dimension r, where r is the rank of J(Q), something you can compute or at least bound by Galois cohomology over Q.  (Of course it can’t be a proper subspace of dimension r if r >= g, which is why Chabauty doesn’t work in that case!)

The analogue of Z is C(Q_p); this is something defined purely p-adically, a locus you could talk about even if you had no idea your C/Q_p were secretly the local manifestation of a curve over Q.

And any rational point of C(Q), considered as a point in J(Q_p), has to lie in both C(Q_p) and J(Q), whose dimensions are 1 and at most g-1, and once again the key technical tool is that this intersection can be shown to be transverse, whence finite, so C(Q) is finite and you have Mordell’s conjecture in the case r < g.  And, as Coleman observed decades after Chabauty, this method even allows you to get an explicit bound on the number of points of C(Q), though not an effective way to compute them.

I think this is a very cool confluence indeed!  In the last ten years we've seen a huge amount of work refining Chabauty; Matt Baker discusses some of it on his blog, and then there’s the whole nonabelian Chabauty direction launched by Minhyong Kim and pushed forward by Jen Balakrishnan and Netan Dogra and many others.  Are there other situations in which we can get meaningful results from “deformation-theoretic Chabauty,” and are the new technical advances in Chabauty methods relevant in this context?

## A type of unproductivity

There is a certain very special type of unproductivity that I have experienced only in math.  You are working on something and you feel almost certain your strategy is not going to work.  In fact, it is more likely than not that your strategy is not even a new strategy, but a variant on something you’ve already tried unsuccessfully — or maybe not even actually a variant, but just a rephrasing you’ve fooled yourself into thinking is a variant.  So you are not sure whether you are actually working on something at all.  In fact, you doubt it.

And yet you keep going!  Because what if it works?


## Heights on stacks and heights on vector bundles over stacks

I’ve been giving a bunch of talks about work with Matt Satriano and David Zureick-Brown on the problem of defining the “height” of a rational point on a stack.  The abstract usually looks something like this:

Here are two popular questions in number theory:

1.  How many degree-d number fields are there with discriminant at most X?
2.  How many rational points are there on a cubic surface with height at most X?

Our expectations about the first question are governed by Malle’s conjecture; about the second, by the Batyrev-Manin conjecture.  The forms of the conjectures are very similar, predicting in both cases an asymptotic of the form c X^a (log X)^b, and this is no coincidence: I will explain how to think of both questions in a common framework, that of counting points of bounded height on an algebraic stack.  A serious obstacle is that there is no definition of the height of a rational point on a stack.  I will propose a definition and try to convince you it’s the right one.  If there’s time, I’ll also argue that when we talk about heights with respect to a line bundle we have always secretly meant “vector bundle,” or should have.

(joint work with Matt Satriano and David Zureick-Brown)

Frank Calegari asked a good question after I talked about this at Mazur’s birthday conference.  And other people have asked me the same question!  So I thought I’d write about it here on the blog.

An actual (somewhat tangential) math question about your talk: when it comes (going back to the original problem) to extensions with Galois group G, there is (as you well know) a natural cover $\mathbf{A}^n/G \rightarrow \cdot/G,$ and the source has a nice smooth unirational open subscheme which is a much less stacky object and could possibly still be used to count G-extensions (or rather, to count G-polynomials). How does this picture interact (if at all) with your talk or the Malle conjecture more generally?

Here’s an answer.  Classically, how do we count degree-n extensions of Q?  We count monic degree-n polynomials with bounded coefficients; that is, we count integral points of bounded height on A^n / S_n, which is isomorphic to A^n, the space of monic degree-n polynomials.

Now A^n / S_n is the total space of a vector bundle over the stack B(S_n).  So you might say that what we’re doing is using “points on the total space of a vector bundle E/X as a proxy for points on X.”  And when you put it that way, you see that it’s what people who work on rational points do all the time!  What do we do when we count rational points on P^1?  We count pairs of coprime integers in a box; in other words, we count integral points on A^2 – 0, which is the total space (sans zero section) of a line bundle on P^1.  More generally, in many cases where people can prove the Batyrev-Manin conjecture for a variety X, it’s precisely by means of passing to a “universal torsor” — the total space of a vector bundle (or a torus bundle sitting in a vector bundle) over X.  Sometimes you can use this technique to get actual asymptotics for rational points on X; other times you just get bounds; if you can prove that, for any x in X(Q), there is a point on the fiber E_x whose height is at most F(height(x)) for some reasonable function F, you can parlay upper bounds for points on E into upper bounds for points on X.  In the classical case, this is the part where we argue that (by Minkowski) a number field with discriminant D contains an algebraic integer whose characteristic polynomial has coefficients bounded in terms of D.
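
Here’s the P^1 example made concrete, as a toy Python count: points of height at most H correspond to coprime integer pairs up to sign, i.e. integral points on A^2 − 0. (The asymptotic constant 12/π², with its 1/ζ(2) coming from the coprimality condition, isn’t needed for anything above; it’s just fun to watch it converge.)

```python
import math

def count_P1_points(H):
    """Count rational points on P^1(Q) of height at most H by counting
    integral points on the universal torsor A^2 - {0}: coprime integer
    pairs (a, b) with max(|a|, |b|) <= H, taken up to sign."""
    seen = set()
    for a in range(-H, H + 1):
        for b in range(-H, H + 1):
            if (a, b) != (0, 0) and math.gcd(abs(a), abs(b)) == 1:
                seen.add(max((a, b), (-a, -b)))  # one representative per +-pair
    return len(seen)

for H in (10, 50, 100):
    print(H, count_P1_points(H) / H**2)  # ratio tends to 12/pi^2, about 1.216
```

For H = 1 this counts exactly the four points 0, 1, −1, ∞, and as H grows the H² growth with its zeta-value constant emerges.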

So coming back to the original question:  how do you know which vector bundle on BG is a good one to think about?  Actually, this is far from clear!  The very first thing I ever wrote about counting number fields, my first paper with Akshay, gave new upper bounds for the number of degree-n extensions, by counting points on

$(\mathbf{A}^n)^m / S_n$

where S_n acts diagonally.  In other words, we used a different vector bundle on B(S_n) than the “standard” one, and showed that by optimizing m (and being careful about stripping out loci playing the role of accumulating subvarieties) we could get better upper bounds than the ones coming from counting polynomials.

So apparently I’ve been counting points on vector bundles on stacks all along…!

## The Lovasz number of the plane is about 3.48287

As seen in this comment on Polymath and explicated further in Fernando de Oliveira Filho’s thesis, section 4.4.

I actually spent much of today thinking about this so let me try to explain it in a down-to-earth way, because it involved me thinking about Bessel functions for the first time ever, surely a life event worthy of recording.

So here’s what we’re going to do.  As I mentioned last week, you can express this problem as follows:  suppose you have a map h: R^2 -> V, for some normed vector space V, which is a unit-distance embedding; that is, if |x-x’|_{R^2} = 1, then |h(x)-h(x’)|_V = 1.  (We don’t ask that h is an isometry, only that it preserves the distance-1 set.)

Then let t be the squared radius of the smallest hypersphere in V containing h(R^2).  (Squared radius, not radius, turns out to be the right quantity to track here.)

Then any graph G embeddable in R^2 with all edges of length 1 is sent to a unit-distance graph in V contained in the hypersphere of squared radius t; this turns out to be equivalent to saying the Lovasz number of G (ok, really I mean the Lovasz number of the complement of G) is at most 1/(1-2t).  So we want to show that t is bounded away from 1/2, is the point.  Or rather:  we can find a V and a map from R^2 to V that make this the case.

So here’s one!  Let V be the space of L^2 functions on R^2 with the usual inner product.  Choose a square-integrable function F on R^2 — in fact let’s normalize to make F^2 integrate to 1 — and for each a in R^2 we let h(a) be the function F(x-a).

We want the distance between F(x-a) and F(x-b) to be the same for every pair of points at distance 1 from each other; the easiest way to arrange that is to insist that F(x) be a radially symmetric function F(x) = f(|x|); then it’s easy to see that the distance between F(x-a) and F(x-b) in V is a function G(a-b) which depends only on |a-b|.  We write

$g(r) = \int_{\mathbf{R}^2} F(x)F(x-r) dx$

so that the squared distance between F(x) and F(x-r) is

$\int F(x)^2 dx - 2 \int F(x)F(x-r) dx + \int F(x-r)^2 dx = 2(1-g(r))$.

In particular, if two points in R^2 are at distance 1, the squared distance between their images in V is 2(1-g(1)).  Note also that g(0) is the square integral of F, which is 1.
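
You can test that computation numerically with a concrete (hypothetical) choice of profile: take F(x) = (2/π)^{1/2} e^{−|x|²}, which has the square integral of F equal to 1 and g(r) = e^{−r²/2}. A Gaussian has g > 0 everywhere, so it’s useless for the bound we’re actually after, but it’s handy for checking the identity:

```python
import math

def F(x, y):
    """Normalized Gaussian profile: integral of F^2 over R^2 is 1."""
    return math.sqrt(2 / math.pi) * math.exp(-(x*x + y*y))

def grid_integral(f, L=6.0, h=0.05):
    """Midpoint-rule integral of f over the square [-L, L]^2; the Gaussian
    decay makes the truncation and quadrature errors negligible."""
    n = int(2 * L / h)
    return sum(f(-L + (i + 0.5) * h, -L + (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h

r = 1.0  # translate F by the vector (r, 0)
sq_dist = grid_integral(lambda x, y: (F(x, y) - F(x - r, y)) ** 2)
assert abs(sq_dist - 2 * (1 - math.exp(-r**2 / 2))) < 1e-6  # equals 2(1 - g(r))
```

The squared distance between F and its unit translate comes out to 2(1 − g(1)) on the nose, as the expansion above says it must.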

What kind of hypersphere encloses all the points F(x-a) in V?  We can just go ahead and take the “center” of our hypersphere to be 0; since |F| = 1, every point in h(R^2) lies in (indeed, lies on) the sphere of radius 1 around the origin.

Hey but remember:  we want to study a unit-distance embedding of R^2 in V.  Right now, h sends unit distances to the distance (2(1-g(1)))^{1/2}, whatever that is.  We can fix that by dividing h by that number.  So now h sends unit distances to unit distances, and its image is enclosed in a hypersphere of radius

$(2(1-g(1)))^{-1/2}$,

that is, of squared radius (hypersphere number) $(2(1-g(1)))^{-1}$.

The more negative g(1) is, the smaller this sphere is, which means the more we can “fold” R^2 into a small space.  Remember, the relationship between hypersphere number and Lovasz theta is

$2t + 1/\theta = 1$

and plugging in the above bound for the hypersphere number, we find that the Lovasz theta number of R^2, and thus the Lovasz theta number of any unit-distance graph in R^2, is at most

1-1/g(1).

So the only question is — what is g(1)?

Well, that depends on what g is.

Which depends on what F is.

Which depends on what f is.

And of course we get to choose what f is, in order to make g(1) as negative as possible.

How do we do this?  Well, here’s the trick.  The function G is not arbitrary; if it were, we could make g(1) whatever we wanted.  It’s not hard to see that G is what’s called a positive definite function on R^2.  And moreover, if G is positive definite, there exists some F giving rise to it.  (Roughly speaking, this is the fact that a positive definite symmetric matrix has a square root.)  So we ask:  if G is a positive definite (radially symmetric) function on R^2, and g(0) = 1, how small can g(1) be?

And now there’s an old theorem of (Wisconsin’s own!) Isaac Schoenberg which helpfully classifies the positive definite functions on R^2; they are precisely the functions G(x) = g(|x|) where g is a mixture of scalings of the Bessel function $J_0$:

$g(r) = \int_0^\infty J_0(ur) A(u) \, du$

for some everywhere nonnegative A(u).  (Actually it’s more correct to say that A is a distribution and we are integrating J_0(ur) against a non-decreasing measure.)

So g(1) can be no smaller than the minimum value of J_0 on $[0,\infty)$, and can in fact be exactly that small if you let A become narrowly supported around the minimizing argument.  This is basically just taking g to be a rescaled version of J_0 which achieves its minimum at 1.  That minimum value is about -0.4, and so the Lovasz theta for any unit-distance subgraph of the plane is bounded above by a number that’s about 1 + 1/0.4 = 3.5.
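
If you want to see the number in the title come out, here’s a sketch that computes J_0 from its integral representation and grid-searches for the minimum (scipy’s `special.j0` and an optimizer would do this in two lines; I’m spelling it out in pure Python):

```python
import math

def j0(x, n=1000):
    """Bessel function J_0 via the integral representation
    J_0(x) = (1/pi) * integral over [0, pi] of cos(x sin t) dt (midpoint rule)."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(n)) * h / math.pi

# J_0 attains its global minimum near x = 3.83; locate it by a coarse
# grid search over [3.3, 4.3], then one refinement pass.
x0 = min((3.3 + i * 2e-3 for i in range(500)), key=j0)
x_min = min((x0 + i * 1e-5 for i in range(-200, 201)), key=j0)
m = j0(x_min)
theta_bound = 1 - 1 / m        # m < 0, so this equals 1 + 1/|m|
print(x_min, m, theta_bound)   # roughly 3.8317, -0.40276, 3.48287
```

The minimum is about −0.40276, attained near x = 3.8317, and 1 − 1/g(1) with g(1) at that minimum is the 3.48287 of the title.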

To sum up:  I give you a set of points in the plane, I connect every pair that’s at distance 1, and I ask how you can embed that graph in a small hypersphere keeping all the distances 1.  And you say:  “Oh, I know what to do, just assign to each point a the radially symmetrized Bessel function J_0(|x-a|) on R^2, the embedding of your graph in the finite-dimensional space of functions spanned by those Bessel translates will do the trick!”

That is cool!

Remark: Oliveira’s thesis does this for Euclidean space of every dimension (it gets more complicated.)  And I think (using analysis I haven’t really tried to understand) he doesn’t just give an upper bound for the Lovasz number of the plane as I do in this post, he really computes that number on the nose.

Update:  DeCorte, Oliveira, and Vallentin just posted a relevant paper on the arXiv this morning!


## What is the Lovasz number of the plane?

There are lots of interesting invariants of a graph which bound its chromatic number!  Most famous is the Lovász number, which asks, roughly:  if I attach vectors v_x to each vertex x such that v_x and v_y are orthogonal whenever x and y are adjacent, how small a cone can I stuff all those vectors into?  The half-angle of that smallest cone tells you the Lovász number, which gets bigger and bigger as the smallest cone gets closer and closer to a hemisphere.

Here’s an equivalent formulation:  If G is a graph and V(G) its vertex set, I try to find a function f: V(G) -> R^d, for some d, such that

|f(x) - f(y)| = 1 whenever x and y are adjacent.

This is called a unit distance embedding, for obvious reasons.

The hypersphere number t(G) of the graph is the squared radius of the smallest sphere containing a unit distance embedding of G.  Computing t(G) is equivalent to computing the Lovász number, but let’s not worry about that now.  I want to generalize it a bit.  We say a finite sequence (t_1, t_2, t_3, … ,t_d) is big enough for G if there’s a unit-distance embedding of G contained in an ellipsoid with major radii t_1^{1/2}, t_2^{1/2}, …, t_d^{1/2}.  (We could also just consider infinite sequences with only finitely many nonzero terms; that would be a little cleaner.)

Physically I think of it like this:  the graph is trying to fold itself into Euclidean space and fit into a small region, with the constraint that the edges are rigid and have to stay length 1.

Sometimes it can fold a lot!  Like if it’s bipartite.  Then the graph can totally fold itself down to a line segment of length 1, with all the black vertices going to one end and the white vertices going to the other.  And the big enough sequences are just those with some entry at least 1/4 (the segment sits in a sphere of radius 1/2, so squared radius 1/4).

On the other hand, if G is a complete graph on k vertices, a unit-distance embedding has to be a regular simplex, so certainly anything with k-1 of the t_i of size at least (1/2)(1-1/k) is big enough.   (Is that an if and only if?  To know this I’d have to know whether an ellipse containing an equilateral triangle can have its major radius shorter than the radius of the circumcircle.)

Let’s face it, it’s confusing to think about ellipsoids circumscribing embedded graphs, so what about instead we define t(p,G) to be the minimum value of the L^p norm of (t_1, t_2, …) over ellipsoids enclosing a unit-distance embedding of G.

Then a graph has a unit-distance embedding in the plane iff t(0,G) <= 2.  And t(oo,G) is just the hypersphere number again, right?  If G has a k-clique then t(p,G) >= t(p,K_k) for any p, while if G has a k-coloring (i.e. a map to K_k) then t(p,G) <= t(p,K_k) for any p.  In particular, a regular simplex on k vertices with unit edges fits into a sphere of squared radius (1/2)(1-1/k), so t(oo,G) <= (1/2)(1-1/k) whenever G is k-colorable.
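
(The simplex circumradius is easy to sanity-check numerically; here’s a sketch placing the unit-edge simplex on k vertices as the scaled standard basis vectors e_i/sqrt(2) in R^k, which are at pairwise distance 1.)

```python
import math

def simplex_circumradius_sq(k):
    """Squared circumradius of a regular simplex on k vertices with unit edges,
    realized as the points e_i / sqrt(2) in R^k."""
    verts = [[(1 / math.sqrt(2) if i == j else 0.0) for j in range(k)]
             for i in range(k)]
    centroid = [sum(col) / k for col in zip(*verts)]
    return sum((a - b) ** 2 for a, b in zip(verts[0], centroid))

for k in range(2, 8):
    # matches (1/2)(1 - 1/k); in particular 1/4 for K_2 and 2/5 for K_5
    assert abs(simplex_circumradius_sq(k) - 0.5 * (1 - 1 / k)) < 1e-12
```
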

So… what’s the relation between these invariants?  Is there a graph with t(0,G) = 2 and t(oo,G) > 2/5?  If so, there would be a non-5-colorable unit distance graph in the plane.  But I guess the relationship between these various “norms” feels interesting to me irrespective of any relation to plane-coloring.  What is the max of t(oo,G) with t(0,G)=2?

The intermediate t(p,G) all give functions which upper-bound clique number and lower-bound chromatic number; are any of them interesting?  Are any of them easily calculable, like the Lovász number?

Remarks:

1.  I called this post “What is the Lovász number of the plane?” but the question of “how big can t(oo,G) be if t(0,G)=2”? is more a question about finite subgraphs of the plane and their Lovász numbers.  Another way to ask “What is the Lovász number of the plane” would be to adopt the point of view that the Lovász number of a graph has to do with extremizers on the set of positive semidefinite matrices whose (i,j) entry is nonzero only when i and j are adjacent vertices or i=j.  So there must be some question one could ask about the space of positive semidefinite symmetric kernels K(x,y) on R^2  x R^2 which are supported on the locus ||x-y||=1 and the diagonal, which question would rightly be called “What is the Lovász number of the plane?” But I’m not sure what it is.
2. Having written this, I wonder whether it might be better, rather than thinking about enclosing ellipsoids of a set of points in R^d, just to think of the n points as an nxd matrix X and compute the singular values of X^T X, which would be kind of an “approximating ellipsoid” to the points.  Maybe later I’ll think about what that would measure.  Or you can!