## Should you be surprised by the diameter of the nxnxn Rubik’s group?

A press release from MIT reports on a new theorem of Erik Demaine, Martin Demaine, Sarah Eisenstat, Anna Lubiw, and Andrew Winslow:  the group of transformations of the n x n x n Rubik’s cube has diameter on order n^2 / log n in the standard generators.  The press release quotes (Erik) Demaine as saying

“That that’s the answer, and not N2, is a surprising thing,”

That they were able to prove this is surprising, and impressive, indeed!  Upper bounds on diameters of Cayley graphs are hard; you use a big spectral hammer if you have one at hand, and otherwise you really have to get dirty and explain algorithmically how you’re going to get all the group elements from short words.  And in general, the best you can hope for is an asymptotic result like the one obtained by Demaine et al; the precise diameter of the 3x3x3 Rubik’s group was only just determined to be 20 with the help of 35 CPU-years donated by Google.

But the truth of the theorem is much less surprising, in my opinion.  Let’s start with the lower bound, which is easy, as Demaine recounts.

“In the first hour, we saw that it had to be at least N2/log N,” Demaine says.

Why was this so fast?  Because the number of generators is just 6 N; for each of the 3 coordinate axes, there are N clockwise quarter-turns perpendicular to the axis and N counterclockwise quarter-turns.  So the total number of words of length k in those generators is on order N^k; for this to cover the whole group, whose size is known to be exponential in N^2, you need k to be at least N^2 / log N.

If all those words were distinct, then the diameter of the group would be N^2/log N.  But they’re not distinct!  The Rubik’s group has relations; it is not free. Far from it.

But how far from it, is the question.

To measure this, we write W(k) for the number of distinct words of length k, or “the volume of the word ball.”  If the group were free on the given generators, we’d have W(k) = (6N)^k, exponential in k.  If, on the other hand, the group were free abelian on the given generators, W(k) would be on order of k^(6N), polynomial in k.  Commutativity crushes words together a lot.  And notice that a lot of the Rubik generators commute with each other:  whenever two twists are perpendicular to the same axis, they commute.  Does that make W(k) grow a lot more slowly?

The Cartier-Foata theorem is a beautiful identity that addresses this question.  It provides an exact formula for the number of length-k words in a “partially commutative monoid,” where some pairs of generators commute, and others don’t.  Namely:  if we construct a generating function

$P(t) = \sum_{k=0}^\infty W(k) t^k,$

then

$P(t) = (\sum_C (-t)^{|C|})^{-1}$

where C ranges over all commutative cliques, i.e. mutually commutative subsets of the generators.  When there are no commutation relations, the commutative cliques are just the subsets of size 0 and 1, so you get P(t) = 1/(1-mt) where m is the number of generators, and W(k) grows like m^k.  When all generators commute, every subset is a commutative clique, and you get P(t) = (1-t)^(-m), whose coefficients have polynomial growth.  Julianna Tymoczko and I used the Cartier-Foata formula to give a lower bound for the diameter of unipotent subgroups of Lie groups of finite type.  In our case, the number of generators grew with the rank n of the group, but the generators commute to such a great extent that the diameter was on order log |G| instead of log |G| / n.

The Rubik’s group is not quite that commutative.  The commutative cliques are just the subsets of generators which are perpendicular to the same axis, so there are

$3 {2N \choose a}$

commutative cliques of size a for all positive a.  Cartier-Foata tells us that the monoid whose only relations are these commutations has

$P(t) = (3(1-t)^{2N} - 2)^{-1}$

whose smallest pole, if I did this right, is at around (log 3/2)(1/2N).  So the t^k coefficient, which is an upper bound for the W(k) of the Rubik’s group, is exponential on order of (2/log(3/2)N)^k.  In particular, this poses no obstacle to filling up the whole group with words of length O(N^2 / log N).

Now that doesn’t mean you can actually do it!  W(k) might be much smaller.  But in my (limited) experience, commutativity relations really do a lot of the work.  (For instance, in my paper with Tymoczko, we give an algorithmic way of writing group elements as short words that shows that the Cartier-Foata bound is correct up to a constant, just as in the Rubik’s case.  Also as in the Rubik’s case, that upper bound is the hard part.)  Of course it’s easy to construct examples where the Cartier-Foata estimate is way off  (e.g. a free k-nilpotent group on N generators.)  But if you had made me guess what the asymptotic behavior of the Rubik’s group is, I like to think I would have gotten it right.

## In which I publish my first paper

The first one I wrote, that is.  It happened like this:  my undergraduate thesis advisor was Persi Diaconis, and in 1993, my senior year, Diaconis was really peeved about the proof via Selberg that every element of SL_2(F_p) could be expressed as a word of length at most C log p in the standard unipotent generators.  (See Emmanuel’s comment on Terry’s blog for useful references.)  Diaconis felt it was a combinatorial problem and it should be solvable by purely combinatorial means, and that a hard-working undergraduate who was good at Putnam problems, like me, ought to be able to do it.

That turned out not to be the case.

So Persi gave me another thesis problem; he asked if I could get good bounds on the diameter of the unipotent subgroup U of SL_n(F_p), with its standard generating set id + e_{i,i+1}.  When n = 2, this is easy; the unipotent group is just Z/pZ and its diameter is about p.  The question is:  what happens asymptotically when n and p are allowed to grow?

It’s not possible any longer for the diameter of U to be on order log |U|, as is the case for SL_2(F_p); the abelianization of U looks like (Z/pZ)^{n-1}, which already has diameter on order of np.  Unless n is much larger than p, this swamps log |U|.

But it turns out this is in some sense the only obstacle:  one can prove that diam(U) is bounded above and below by constant multiples of

np + log |U|.

(My memory is that I conceived the key step of the argument during a boring Advocate meeting.)

Much later, when I was a new postdoc at Princeton, I talked about this problem with Julianna Tymoczko, then a graduate student working with Bob McPherson.  Julianna very quickly saw how to make the argument much more conceptual and general, and in particular how to extend it to all the classical groups.  So we decided to write it up as a joint paper.  That was probably 2002.  Then we got around to writing the paper and submitting it.  That was 2005.  It was accepted in 2007.  And now here it is!  That’s the abstract; if you’re at a computer that doesn’t subscribe to Forum Math, here’s the arXiv version.  17 years from first version of the theorem to publication!

Update: Harald Helfgott politely comment-hints at something I should have put in the original post, which is that nowadays, thanks to him, there is a combinatorial proof that the diameter of SL_2(F_p) is on order log p!  The subject of uniform bounds for word growth and spectral gaps in finite groups of Lie type is currently moving very quickly.  I won’t try to summarize the state of the art, but you can expect in the medium-term to hear something about an interesting application of Harald’s work to arithmetic geometry.

## The moduli space of chords: Dmitri Tymoczko on “Geometry and Music”, Friday 7 Mar, 2:30pm

The next installment of the “Math And…” seminar is coming next week:

7 Mar 2008, 2:30 PM, Birge 145: Dmitri Tymoczko (Princeton University, music)

Title: “Geometry and Music.”

In my talk, I will explain how to translate the basic concepts of music theory into the language of contemporary geometry. I will show that musicians commonly abstract away from five types of musical transformations, the “OPTIC transformations,” to form equivalence classes of musical objects. Examples include “chord,” “chord type,” “chord progression,” “voice leading,” and “pitch class.” These equivalence classes can be represented as points in a family of singular quotient spaces, or orbifolds: for example, two-note chords live on a Mobius strip whose boundary acts like a mirror, while four-note chord-types live on a cone over the real projective plane. Understanding the structure of these spaces can help us to understand general constraints on musical style, as well as specific pieces. The talk will be accessible to non-musicians, and will exploit interactive 3D computer models that allow us to see and hear music simultaneously.

The interactive 3D computer models are made by Dmitri’s program Chord Geometries, which you can download from his website. Or you can just watch the sample movies, and see what the opening of “Smoke on the Water” looks like on a Mobius strip. Dmitri comes by his mathematical know-how familially; his younger sister is the algebraic geometer Julianna Tymoczko (with whom I wrote this paper about diameters of finite groups) and his father was the philosopher of mathematics Thomas Tymoczko.

I’ve booked a big room, and the talk is open to the public, so if the abstract sounds interesting to you, please come!

On the other hand, perhaps the abstract puts you off by means of words like “real projective plane.” Well, it’s written for mathematicians; maybe it’s worth explaining in more down-to-earth language about what a guy like Dmitri means by “the space of chords,” or, as an algebraic geometer like me might put it, the moduli space of chords. Explanation below the fold: it’s essentially a brief version of an article called “The Idea of a Moduli Space” that I wrote for Imagine magazine when I was a grad student.