## A little more about random configurations

In my previous post about the configuration space of hard discs in a box, I neglected to say anything about the main point of Persi’s article!  It’s the following — even though  you don’t know anything about the topology of the space of configurations, you can still do an excellent job of drawing a configuration at random from the natural distribution using Monte Carlo techniques.  And if you’re a physicist trying to model the behavior of a gas or a fluid, you might be more interested in what a random configuration looks like than whether the space of configurations is connected — it might not be so relevant that you can get from something that looks like tight packing 1 to something that looks like tight packing 2, if the probability of doing so is vanishingly small.  Or to put it another way — if the space looks like a bunch of big blobs connected by extremely narrow paths, then from the point of view of physics it might as well be disconnected.

Still, as a topologist, you might ask:  if I can do a good random sample of points from a mystery manifold M, can I compute topological invariants of M with high confidence?  Can you guess whether M is connected?  More generally, can you guess the homology groups of M?

You might think of this as a massive geometric generalization of the age-old problem of “cluster analysis.”  Let’s say you have a bunch of people, and for each person you measure N variables — let’s say, height, weight, and shoe size.  So you have a bunch of points in R^N — in this case R^3.

Maybe you hope that these points are well-modeled as a sample from some multivariate normal distribution.  But under some circumstances, this is a really bad hope!  For instance, if your sample isn’t segregated by gender, you’re going to see two big clusters — one cluster of women where the mean height is around 5’6″, one cluster of men where the mean height is around 5’9″.  You’re not really sampling from a normal distribution — you might be sampling from a superimposition of two different normal distributions with different centers.  Or, alternately, you could think of yourself as sampling points from a manifold in R^3 consisting of just two points — “the ideal woman and the ideal man” — where your measurements are subject to some error that’s distributed normally.

My impression is that statisticians are pretty good at distinguishing between a normal distribution and a superimposition of some small finite set of normal distributions.  But I think it’s much harder to look at a giant cloud of points in R^100 and say “aha — this is actually a random sample from a normal distribution centered on the union of a surface of genus 2 sitting over here, and these ten disjoint circles sitting over there.”

If you were wondering about this while reading the Diaconis article in the Bulletin, you’d be in luck, because flipping forward a few pages you’d get to Gunnar Carlsson’s long survey article on precisely this genre of problem!  More on “topological statistics” once I’ve read Carlsson’s article, but let me point out now that if you’re a young mathematician interested in these matters you might consider going to the CBMS summer school this August, centered on a lecture series by Ghrist.

## Hard discs, configuration spaces, kissing spheres, the 15 puzzle

Persi Diaconis has a lovely article in this month’s Bulletin of the AMS about the “Markov chain Monte Carlo revolution.”  I especially liked the discussion of the “hard disk model” for fluids, which was new to me.  Suppose X is a region in the plane and r a small positive number; then we can denote by Conf(X,n,r) the subspace of X^n consisting of n-tuples (x_1, … x_n) such that the discs of radius r centered at the x_i don’t intersect.  A natural statistic on Conf(X,n,r) is the “packing density” $\rho = n \pi r^2 / \mbox{area}(X)$

the proportion of the area of X which is taken up by the discs.  You can imagine Conf(X,n,r) as a model for the space of states of a substance whose molecules are represented by the discs — the molecules are forbidden to overlap, but are otherwise allowed to move freely.  When $\rho$ is small, the space looks a lot like Conf(X,n,0), the configuration space of n distinct points in R.  But when $\rho$ is large (more than about .71) the system undergoes a “Kirkwood phase transition” and the balls are constrained to look something like a lattice packing.  You might think of the former case as modeling a liquid, and the latter as modeling a solid.  See Figure 5 of Persi’s paper for a nice illustration of this phenomenon.

Apparently, we don’t know very much about the topology of Conf(X,n,r).  Of course, for very large r the configuration space is empty.  When r = 0 it is the usual configuration space of n distinct points on X.  You might imagine carrying out a kind of Morse-theoretic decomposition of Conf(X,n,0) as follows:  for each n-tuple (x_1, … x_n) of distinct points on X, write $f(x_1, \ldots, x_n) = \max_{ij} d(x_i,x_j)^{-1}$

where d(x,x’) is the distance between points x and x’ of X.  Then f seems a bit like a Morse function, and you can imagine building up Conf(X,n,0) by starting with a $0$-cell at each local minimum of f, then attaching 1-cells for each critical point with one negative eigenvalue, and so on.  Of course, f is badly non-smooth!  But you can imagine, I guess, replacing the maximum by an $L_p$ for suitably large p and seeing what happens…

The most primitive topological question you can ask is whether Conf(X,n,r) is connected — that is, given two configurations of n hard r-discs in X, is there always room to move from one configuration to the other?

One special case of this question is, to my eye, particularly charming: given a configuration of discs, can you permute the discs arbitrarily?  When the discs are packed almost as tightly as possible, you might expect a negative answer; but in practice, it seems that a very small amount of wiggle room is enough to re-order the discs any way you want.

Example 1:  Suppose X is a unit sphere and r is pi/6 radians.  Then a configuration of n disjoint r-discs on X is equivalent to a configuration of n disjoint unit spheres tangent to X.  The “kissing spheres problem,” which goes back at least to Newton, asks:  what is the maximal number of disjoint unit spheres that can be tangent to X?  The answer is 12; for instance, one can arrange these spheres at the vertices of a regular icosahedron.  This leaves a little room for the spheres to roll around; and it turns out that this small leeway is enough to allow one to permute the spheres by an arbitrary element of S_12!  John Baez has a nice discussion of this in a very early episode of This Week’s Finds.

Example 2: Let X be a unit square, and take r = 1/8 and n = 15; then you can cut X into 16 squares of sidelength 1/2, and place a disc at the center of all the squares but the one in the lower left corner.  (In the interest of writing this post quickly, I’ll leave you to draw your own picture of this.)  Now this packing of discs is pretty tight — at the outset, only two of the discs have room to move.  But it turns out that here, too, one can permute the discs in an almost arbitrary way; you can get any permutation in the alternating group A_15.  In fact, this problem is nothing but a continuous version of the Sam Loyd “15” puzzle.

If you wanted to get a general sense of what was going on here, you might discretize the problem in the following way:  suppose X is a graph with N vertices, n of which are covered by stones.  You are allowed to move any stone to an adjacent empty vertex.  Can you permute the stones in any way you like?  If N = n, the answer is clearly no; there are no allowable rooms.  But if you allow any wiggle room at all — that is, if you let n = N-1 — the answer, somewhat suprisingly to me, is essentially yes.  This is a theorem of Wilson (“Graph Puzzles, Homotopy, the Alternating Group”) from 1974:  if a graph X cannot be made disconnected by the removal of a singular vertex, and is not an N-cycle, then the N-1 stones can be permuted by any permutation in the alternating group, with one exception when N = 7.

What is this strange exceptional 7-vertex graph?  See if you can figure it out!  Or look here.