## Smectic crystals, fingerprints, Klein bottles

Amazing colloquium this week by Randall Kamien, who talked about this paper with Chen and Alexander, this one with Liarte, Bierbaum, Mosna, and Sethna, and other stuff besides.

I’ve been thinking about his talk all weekend and I’m just going to write down a bit about what I learned.  In a liquid crystal, the molecules are like little rods; they have an orientation and nearby molecules want to have nearby orientations.  In a nematic crystal, that’s all that’s going on — the state of the crystal in some region B is given by a line field on B.   A smectic crystal has a little more to it — here, the rods are aligned into layers

(image via this handy guide to liquid crystal phases)

separated by — OK, I’m not totally clear on whether they’re separated by a sheet of something else or whether that’s just empty space.  Doesn’t matter.  The point is, this allows you to tell a really interesting topological story.  Let’s focus on a smectic crystal in a simply connected planar region B.   At every point of B, you have, locally, a structure that looks like a family of parallel lines in the plane, each pair of lines separated by a unit distance.  (The unit is the length of the molecule, I think.)

Alternatively, you can think of such a “local smectic structure” as a line in the plane, where we consider two lines equivalent if they are parallel and the distance between them is an integer.  What’s the moduli space M — the “ground state manifold” — of such structures?    Well, the line family has a direction, so you get a map from M to S^1.  The lines in a given direction are parametrized by a line, and the equivalence relation mods out by the action of a lattice, so the fiber of M -> S^1 is a circle; in fact, it’s not hard to see that this surface M is a Klein bottle.

Of course this map might be pretty simple.  If B is the whole plane, you can just choose a family of parallel lines on B, which corresponds to the constant map.  Or you can cover the plane with concentric circles; the common center doesn’t have a smectic structure, and is a defect, but you can map B = R^2 – 0 to M.  Homotopically, this just gives you a path in M, i.e. an element of pi_1(M), which is a semidirect product of Z by Z, with presentation

$\langle S,F: FSF^{-1} = S^{-1} \rangle$

The concentric circle smectic corresponds the map which sends the generator of pi_1(B) to F.

So already this gives you a nice topological invariant of a plane smectic with k defects; you get a map from pi_1(B), which is a free group on k generators, to pi_1(M).  Note also that there’s a natural notion of equivalence on these maps; you can “stir” the smectic, which is to say, you can apply a diffeomorphism of the punctured surface, which acts by precomposition on pi_1(B).  The action of (the connected components of) Diff(B) on Hom(pi_1(B), pi_1(M)) is my favorite thing; the Hurwitz action of a mapping class group on the space of covers of a Riemann surface!  In particular I think the goal expressed in Chen et al’s paper of “extending our work to the topology of such patterns on surfaces of nontrivial topology (rather than just the plane)” will certainly involve this story.  I think in this case the Hurwitz orbits are pretty big; i.e. if what you know is the local appearance of the defects (i.e. the image in pi_1(M) of the conjugacy class in pi_1(B) corresponding to the puncture) you should almost be able to reconstruct the homotopy type of the map (up to stirring.)  If I understood Randy correctly, those conjugacy classes are precisely what you can actually measure in an experiment.

There’s more, though — a lot more!  You can’t just choose a map from B to M and make a smectic out of it.  The layers won’t line up!  There’s a differential criterion.  This isn’t quite the way they express it, but I think it amounts to the following:  the tangent bundle of M has a natural line bundle L sitting inside it, consisting of those directions of motion that move a line parallel to itself.  I think you want to consider only those maps from B to M such that the induced map on tangent bundles TB -> TM takes image in L.  More concretely, in coordinates, I think this means the following:  if you think of the local smectic structure at p as the preimage of Z under some real-valued function f in the neighborhood of p, then f should satisfy

$(df/dx)^2 + (df/dy)^2 = 1.$

This restricts your maps a lot, and it accounts for all kinds of remarkable behavior.  For one thing, it forbids certain conjugacy classes in pi_1(M) from appearing as local monodromy; i.e. the set of possible defect types is strictly smaller than the set of conjugacy classes in pi_1(M).  Moreover, it forbids certain kinds of defects from colliding and coalescing — for algebraic geometers, this naturally makes you feel like there’s a question about boundaries of Hurwitz spaces floating around.

Best of all, the differential equation forces the appearance of families of parallel ellipses, involute spirals, and other plane curves of an 18th century flavor.  The cyclides of Dupin put in an appearance.  Not just in the abstract — in actual liquid crystals!  There are pictures!  This is great stuff.

Update:  Wait a minute — I forgot to say anything about fingerprints!  Maybe because I don’t have anything to say at the moment.  Except that the lines of a fingerprint are formally a lot like the lines of a smectic crystal, the defects can be analyzed in roughly the same way, etc.  Whether the diffeomorphism type of a fingerprint is an interesting forensic invariant I don’t rightly know.  I’ll bet whoever made my iPhone home button knows, though.

## New bounds on curve tangencies and orthogonalities (with Solymosi and Zahl)

New paper up on the arXiv, with Jozsef Solymosi and Josh Zahl.  Suppose you have n plane curves of bounded degree.  There ought to be about n^2 intersections between them.  But there are intersections and there are intersections!  Generically, an intersection between two curves is a node.  But maybe the curves are mutually tangent at a point — that’s a more intense kind of singularity called a tacnode.  You might think, well, OK, a tacnode is just some singularity of bounded multiplicity, so maybe there could still be a constant multiple of n^2 mutual tangencies.

No!  In fact, we show there are O(n^{3/2}).  (Megyesi and Szabo had previously given an upper bound of the form n^{2-delta} in the case where the curves are all conics.)

Is n^{3/2} best possible?  Good question.  The best known lower bound is given by a configuration of n circles with about n^{4/3} mutual tangencies.

Here’s the main idea.  If a curve C starts life in A^2, you can lift it to a curve C’ in A^3 by sending each point (x,y) to (x,y,z) where z is the slope of C at (x,y); of course, if multiple branches of the curve go through (x,y), you are going to have multiple points in C’ over (x,y).  So C’ is isomorphic to C at the smooth points of C, but something’s happening at the singularities of C; basically, you’ve blown up!  And when you blow up a tacnode, you get a regular node — the two branches of C through (x,y) have the same slope there, so they remain in contact even in C’.

Now you have a bunch of bounded degree curves in A^3 which have an unexpectedly large amount of intersection; at this point you’re right in the mainstream of incidence geometry, where incidences between points and curves in 3-space are exactly the kind of thing people are now pretty good at bounding.  And bound them we do.

Interesting to let one’s mind wander over this stuff.  Say you have n curves of bounded degree.  So yes, there are roughly n^2 intersection points — generically, these will be distinct nodes, but you can ask how non-generic can the intersection be?  You have a partition of const*n^2 coming from the multiplicity of intersection points, and you can ask what that partition is allowed to look like.  For instance, how much of the “mass” can come  from points where the multiplicity of intersection is at least r?  Things like that.

## Dissecting squares into equal-area triangles: idle questions

Love this post from Matt Baker in which he explains the tropical / 2-adic proof (in fact the only proof!) that you can’t dissect a square into an odd number of triangles of equal area.  In fact, his argument proves more, I think — you can’t dissect a square into triangles whose areas are all rational numbers with odd denominator!

• The space of quadrilaterals in R^2, up to the action of affine linear transformations, is basically just R^2, right?  Because you can move three vertices to (0,0), (0,1), (1,0) and then you’re basically out of linear transformations.   And the property “can be decomposed into n triangles of equal area” is invariant under those transformations.  OK, so — for which choices of the “fourth vertex” do you get a quadrilateral that has a decomposition into an odd number of equal-area triangles? (I think once you’re not a parallelogram you lose the easy decomposition into 2 equal area triangles, so I suppose generically maybe there’s NO equal-area decomposition?)  When do you have a decomposition into triangles whose area has odd denominator?
• What if you replace the square with the torus R^2 / Z^2; for which n can you decompose the torus into equal-area triangles?  What about a Riemann surface with constant negative curvature?  (Now a “triangle” is understood to be a geodesic triangle.)  If I have this right, there are plenty of examples of such surfaces with equal-area triangulations — for instance, Voight gives lots of examples of Shimura curves corresponding to cocompact arithmetic subgroups which are finite index in triangle groups; I think that lets you decompose the Riemann surface into a union of fundamental domains each of which are geodesic triangles of the same area.

## Math bracket 2016

It’s that time again — March Math Madness, where we fill out an NCAA men’s tournament bracket with the best math department winning every game.  As always, this bracket was filled out by a highly trained team consisting of myself and a group of procrastinating grad students, making decisions by voice vote, and if you disapprove of one of our choices, I’m sure it’s somebody else’s fault.  This is Berkeley’s first championship after falling to Harvard in 2012; meanwhile, Michigan sees its second final in three years but falls short again…

Update: In the 34th percentile at ESPN after one day of play — thanks, Yale!

Update:  Down to the 5th percentile and only Duke and UVa are left out of my final 8 picks.  Not gonna be the math bracket’s finest year.

Tagged , ,

## New SIAM journal on applied algebra

As you know I’m a fan of “applied pure math.”  You can be an applied mathematician and not do PDEs or numerical analysis now!  And to reflect this new reality, there’s a new journal, SIAM Journal on Applied Algebra and Geometry (SIAGA).  They’re accepting submissions starting March 23; I actually have a little something that might be finished by then that would be perfect!

There are a bunch of cool pictures on the SIAGA flyer:  UC-Berkeley grad student Anna Seigal is explaining them on her blog, one by one.  The first entry depicts the algebraic variety you encounter when you try to find a point minimizing the summed distances to k other points in the plane.  Handsome!

Tagged ,

## Metric chromatic numbers and Lovasz numbers

In the first post of this series I asked whether there was a way to see the Lovasz number of a graph as a chromatic number.  Yes!  I learned it from these notes, written by Big L himself.

Let M be a metric space, and let’s assume for simplicity that M has a transitive group of isometries.  Now write r_M(n) for the radius of the smallest ball containing n points whose pairwise distances are all at least 1.  (So this function is controlling how sphere-packing works in M.)

Let Γ be a graph.  By an M-coloring of Γ we now mean a map from v(Γ) to M such that adjacent vertices are at distance at least 1.  Write χ_Γ(M) for the radius of the smallest disc containing an M-coloring of Γ.  Then we can think of r^{-1}(χ_Γ(M)) as a kind of “M-chromatic number of Γ.”  Scare quotes are because r isn’t necessarily going to be an analytic function or anything; if I wanted to say something literally correct I guess I would say the smallest integer n such that r_M(n) >= χ_Γ(M).

The M-chromatic number is less than the usual chromatic number χ_Γ;  more precisely,

χ_Γ(M) <= r_M(χ_Γ)

Easy:  if there’s an n-coloring of Γ, just compose it with the map from [n] to M of radius r_M(n).  Similary, if ω_Γ is the clique number of Γ, we have

r_M(ω_Γ) <= χ_Γ(M)

because a k-clique can’t be embedded in a ball of radius smaller than r_M(k).

So this M-chromatic number gives a lower bound for the chromatic number and an upper bound for the clique number, just as the Lovasz number does, and just as the fractional chromatic number does.

Example 1:  Lovasz number.  Let M be the sphere in infinite-dimensional Euclidean space.  (Or |Γ|-dimensional Euclidean space, doesn’t matter.)  For our metric use (1/sqrt(2)) Euclidean distance, so that orthogonal vectors are at distance 1 from each other.  If n points are required at pairwise distance at least 1, the closest way to pack them is to make them orthonormal (I didn’t check this, surely easy) and in this case they sit in a ball of radius 1-sqrt(1/2n) around their center of mass.  So r_M(n) = 1 – sqrt(1/2n).  Define t(Γ) to be the real number such that

$1 - \sqrt{1/2t(\Gamma)} = \chi_\Gamma(M)$.

Now I was going to say that t(Γ) is the Lovasz theta number of Γ, but that’s not exactly the definition; that would be the definition if I required the embedding to send adjacent vertices to points at distance exactly 1.  The answer to this MO question suggests that an example of Schrijver might actually separate these invariants, but I haven’t checked.

So I guess let’s say t(Γ) is a “Lovasz-like number” which is between the clique number and the chromatic number.  And like the Lovasz number, but unlike clique and chromatic numbers, it’s super-easy to compute.  An embedding of v(Γ) in the sphere, up to rotation, is specified by the pairwise distance matrix, which can be an arbitrary postive definite symmetric nxn matrix A with 1’s on the diagonal.  Each edge of Γ now gives an inequality a_{ij} > 1.  When you’re optimizing over a space cut out by linear inequalities in the space of psd matrices, you’re just doing semidefinite programming.  (I am punting a little about how to optimize “radius” but hopefully maximum distance of any vector from center of mass is good enough?)

(Note:  you know what, I’ll bet you can take an embedding like this, subtract a small multiple of the center of mass from all the vectors, and get an embedding of v(Γ) in n-space with all angles between adjacent vectors slightly obtuse, and probably this ends up being exactly the same thing as the vector chromatic number defined in the paper I linked to earlier.)

Where is example 2?  It was supposed to be about the fractional chromatic number but then I realized the way I was setting this up wasn’t correct.  The idea is to let M_b be the space of infinite bit strings with exactly b 1’s and use (1/2b) Hamming distance, so that the distance-1 requirement becomes a requirement that two b-element subsets be disjoint.  But I don’t think this quite fits into the framework I adopted at the top of the post.  I’ll circle back to this if I end up having what to say.

## Coloring graphs with polynomials

More chromatic hoonja-doonja!

Suppose you have a bunch of tokens of different colors and weights.  X_1 colors of weight 1 tokens, X_2 colors of weight 2 tokens, etc.

Let Γ be a graph.  A (weighted) b-coloring of Γ is an assignment to each vertex of a set of tokens with total weight b, such that adjacent vertices have no tokens in common.  Let χ_Γ(X_1, … X_b) be the number of b-colorings of Γ.  I made up this definition but I assume it’s in the literature somewhere.

First of all, χ_Γ(X_1, … X_b) is a polynomial.

Is this multivariable “chromatic polynomial” of any interest?  Well, here’s one place it comes up.  By a degree-b polynomial coloring of Γ we mean an assignment of a monic squarefree degree d polynomial in R[x] to each vertex of Γ, so that adjacent vertices are labeled with coprime polynomials.   Now let U_b(Γ) be the manifold parametrizing degree-b colorings of Γ.

Then the Euler characteristic of U_b(Γ) is χ_Γ(-1,1,0,…0).

I worked this out via the same kind of Lefschetz computation as in the previous post, but once you get the answer, you can actually derive this as a corollary of Stanley’s theorem.  And it was presumably already known.

By the way:  let V_n be the free vector space spanned by the b-colorings of Γ where all the tokens have weight 1; these are called fractional colorings sometimes.  Then S_n acts on V_n by permutation of colors.  The character of this action is a class function on S_n.  More precisely, it is

χ_Γ(X_1, … X_b)

where X_i is now interpreted as a class function on S_n, sending a permutation to the number of i-cycles in its cycle decomposition.  Of course the real thing going on behind the scenes is that the V_n form a finitely generated FI-module.

## Counting acyclic orientations with topology

Still thinking about chromatic polynomials.   Recall: if Γ is a graph, the chromatic polynomial χ_Γ(n) is the number of ways to color the vertices of Γ in which no two adjacent vertices have the same color.

Fact:  χ_Γ(-1) is the number of acyclic orientations of Γ.

This is a theorem of Richard Stanley from 1973.

Here’s a sketch of a weird proof of that fact, which I think can be made into an actual weird proof.  Let U be the hyperplane complement

$\mathbf{A}^|\Gamma| - \bigcup_{ij \in e(\Gamma)} (z_i = z_j)$

Note that |U(F_q)| is just the number of colorings of Γ by elements of F_q; that is,  χ_Γ(q).  More importantly, the Poincare polynomial of the manifold U(C) is (up to powers of -1 and t) χ_Γ(-1/t).  The reason |U(F_q)| is  χ_Γ(q) is that Frobenius acts on H^i(U) by q^{-i}.  (OK, I switched to etale cohomology but for hyperplane complements everything’s fine.)  So what should  χ_Γ(-1) mean?  Well, the Lefschetz trace formula suggests you look for an operator on U(C) which acts as -1 on the H^1, whence as (-1)^i on the H^i.  Hey, I can think of one — complex conjugation!  Call that c.

Then Lefchetz says χ_Γ(-1) should be the number of fixed points of c, perhaps counted with some index.  But careful — the fixed point locus of c isn’t a bunch of isolated points, as it would be for a generic diffeo; it’s U(R), which has positive dimension!  But that’s OK; in cases like this we can just replace cardinality with Euler characteristic.  (This is the part that’s folkloric and sketchy.)  So

χ(U(R)) = χ_Γ(-1)

at least up to sign.  But U(R) is just a real hyperplane complement, which means all its components are contractible, so the Euler characteristic is just the number of components.  What’s more:  if (x_1, … x_|Γ|) is a point of U(R), then x_i – x_j is nonzero for every edge ij; that means that the sign of x_i – x_j is constant on every component of U(R).  That sign is equivalent to an orientation of the edge!  And this orientation is obviously acyclic.  Furthermore, every acyclic orientation can evidently be realized by a point of U(R).

To sum up:  acyclic orientations are in bijection with the connected components of U(R), which by Lefschetz are χ_Γ(-1) in number.

## Metric chromatic numbers

Idle thought.  Let G be a (loopless) graph, let M be a metric space, and let b be a nonnegative real number.  Then let Conf(G,M,b) be the space of maps from the vertices of the graph to M such that no two adjacent vertices are within b of each other.

When b=0 and G is the complete graph K_n, this is the usual ordered configuration space of n points on M.  When G is the empty graph on n vertices, it’s just M^n.  When M is a set of N points with the discrete metric, Conf(G,M,b) is the same thing for every b;  a set of points whose cardinality is χ_G(N), the chromatic polynomial of G evaluated at N.  When M is a box, Conf(G,M,b) is the “discs in a box” space I blogged about here.  If M is (Z/2Z)^k with Hamming distance, you are asking about how many ways you can supply G with k 2-colorings so that no edge is monochromatic in more than k-b-1 of them.

Update:  Ian Agol links in the comments to this paper about Conf(G,M,0) by Eastwood and Huggett; the paper points out that the Euler characteristic of Conf(G,M,0) computes χ_G(N) whenever M has Euler characteristic N; so M being N points works, but so does M = CP^{N-1}, and that’s the case they study.

When b=0 and G is the complex plane, Conf(G,C,0) is the complement of the graphic arrangement of G; its Poincare polynomial is  t^-{|G|} χ_G(-1/t).  Lots of graphs have the same chromatic polynomial (e.g. all trees do) but do they have homeomorphic Conf(G,C,0)?

This is fun to think about!  Is Conf(G,M,0), for various manifolds other than discrete sets of points, an interesting invariant of a graph?

You can think of

vol(Conf(G,M,b)) vol(M)^{-n}

as a sort of analogue of the chromatic polynomial, especially when b is small; when M = C, for instance, Conf(G,M,b) is just the complement of tubular neighborhoods around the hyperplanes in the graphical arrangement, and its volume should be roughly b^|G|χ_G(1/b) I think.  When b gets big, this function deviates from the chromatic polynomial, and in particular you can ask when it hits 0.

In other words:  you could define an M-chromatic number χ(G,M) to be the smallest B such that Conf(G,M,1/B) is nonempty.  When M is a circle S^1 with circumference 1, you can check that χ(G,M) is at least the clique number of G and at most the chromatic number.  If G is a (2n+1)-cycle, the clique number is 2, the chromatic number is 3, and the S^1-chromatic number is 2+1/n, if I did this right.  Does it have anything to do with the Lovasz number, which is also wedged between clique number and chromatic number?  Relevant here:  the vector chromatic number, which is determined by χ(G,S^{v(G)-1}), and which is in fact a lower bound for the Lovasz number.

## Ranking mathematicians by hinge loss

As I mentioned, I’m reading Ph.D. admission files.  Each file is read by two committee members and thus each file has two numerical scores.

How to put all this information together into a preliminary ranking?

The traditional way is to assign to each applicant their mean score.  But there’s a problem: different raters have different scales.  My 7 might be your 5.

You could just normalize the scores by subtracting that rater’s overall mean.  But that’s problematic too.  What if one rater actually happens to have looked at stronger files?  Or even if not:  what if the relation between rater A’s scale and rater B’s scale isn’t linear?  Maybe, for instance, rater A gives everyone she doesn’t think should get in a 0, while rater A uses a range of low scores to express the same opinion, depending on just how unsuitable the candidate seems.

Here’s what I did last year.  If (r,a,a’) is a triple with r is a rater and a and a’ are two applicants, such that r rated a higher than a’, you can think of that as a judgment that a is more admittable than a’.  And you can put all those judgments from all the raters in a big bag, and then see if you can find a ranking of the applicants (or, if you like, a real-valued function f on the applicants) such that, for every judgment a > a’, we have f(a) > f(a’).

Of course, this might not be possible — two raters might disagree!  Or there might be more complicated incompatibilities generated by multiple raters.  Still, you can ask:  what if I tried to minimize the number of “mistakes”, i.e. the number of judgments in your bag that your choice of ranking contradicts?

Well, you can ask that, but you may not get an answer, because that’s a highly non-convex minimization problem, and is as far as we know completely intractable.

But here’s a way out, or at least a way part of the way out — we can use a convex relaxation.  Set it up this way.  Let V be the space of real-valued functions on applicants.  For each judgment j, let mistake_j(f) be the step function

mistake_j(f) = 1 if f(a) < f(a’) + 1

mistake_j(f) = 0 if f(a) >= f(a’) + 1

Then “minimize total number of mistakes” is the problem of minimizing

M = sum_j mistake_j(f)

over V.  And M is terribly nonconvex.  If you try to gradient-descend (e.g. start with a random ranking and then switch two adjacent applicants whenever doing so reduces the total number of mistakes) you are likely to get caught in a local minimum that’s far from optimal.  (Or at least that can happen; whether this typically actually happens in practice, I haven’t checked!)

So here’s the move:  replace mistake_j(f) with a function that’s “close enough,” but is convex.  It acts as a sort of tractable proxy for the optimization you’re actually after.  The customary choice here is the hinge loss:

hinge_j(f) = min(0, f(a)-f(a’) -1).

Then H := sum_j hinge_j(f) is a convex function on f, which you can easily minimize in Matlab or python.  If you can actually find an f with H(f) = 0, you’ve found a ranking which agrees with every judgment in your bag.  Usually you can’t, but that’s OK!  You’ve very quickly found a function H which does a decent job aggregating the committee scores. and which you can use as your starting point.

Now here’s a paper by Nihal Shah and Martin Wainwright commenter Dustin Mixon linked in my last ranking post.  It suggests doing something much simpler:  using a linear function as a proxy for mistake_j.  What this amounts to is:  score each applicant by the number of times they were placed above another applicant.  Should I be doing this instead?  My first instinct is no.  It looks like Shah and Wainwright assume that each pair of applicants is equally likely to be compared; I think I don’t want to assume that, and I think (but correct me if I’m wrong!) the optimality they get may not be robust to that?

Anyway, all thoughts on this question — or suggestions as to something totally different I could be doing — welcome, of course.