## Khot,Minzer,Safra on approximate sections of sheaves

Subhash Khot is giving a talk at Current Developments in Math this year and his talk has the intriguing phrase “Grassmann graph” in it so I thought I’d look up what it is and what he and his collaborators prove about it, and indeed it’s interesting! I’m just going to jot down something I learned from “Pseudorandom Sets in Grassmann Graph have Near-Perfect Expansion,” by Khot, Dor Minzer, and Muli Safra, in a way that makes it sound like something algebraic geometers might be interested in, which, indeed, I think they might be!

Suppose you have a sheaf F on a space, and the space has a covering U_1, .. U_N. The sheaf axiom says that if we have a family of sections s_i of F(U_i) such that s_i and s_j agree on $U_i \cap U_j$ for all i,j, then there is actually a global section s in F(X) which restricts to each s_i.

What if we only have an approximate section? That is: what if we have a family of s_i such that: if I select i and j uniformly at random, the probability that s_i and s_j agree on $U_i \cap U_j$ is bounded below by some p > 0. Call such a family a “p-section.” (You should take the view that this is really a family of problems with X changing and N growing, so that when I say p > 0 the content is that p is bounded away from some 0 uniformly in X,N.)

The question is then: Is an approximate section approximately a section?

(This is meant to recall the principle from additive number theory that an approximate subgroup is approximately a subgroup, as in e.g. Freiman-Rusza.)

That is: if s_1, .. s_N from a p-section, is there some actual section s in F(X) such that, for i chosen uniformly at random,

$\mathbf{Pr}(s | U_i) = s_i > p' > 0$

for some p’ depending only on p?

The case which turns out to be relevant to complexity theory is the Grassmann graph, which we can describe as follows: X is a k-dimensional vector space over F_2 and the U_i are the l-dimensional vector spaces for some integer l. But we do something slightly weird (which is what makes it the Grassmann graph, not the Grassmann simplicial complex) and declare that the only nonempty intersections are those where $U_i \cap U_j$ has dimension l-1. The sheaf is the one whose sections on U_i are the linear functions from U_i to F_2.

Speculation 1.7 in the linked paper is that an approximate section is approximately a section. This turns out not to be true! Because there are large sets of U_i whose intersection with the rest of X is smaller than you might expect. This makes sense: if X is a space which is connected but which is “almost a disjoint union of X_1 and X_2,” i.e. $X_1 \cup X_2 = X$ and $\latex X_1 \cap X_2$ involves very few of the U_i, then by choosing a section of F(X_1) and a section of F(X_2) independently you can get an approximate section which is unlikely to be approximated by any actual global section.

But the good news is that, in the case at hand, that ends up being the only problem. Khot-Minzer-Safra classify the “approximately disconnected” chunks of X (they are collections of l-dimensional subspaces containing a fixed subspace of small dimension and contained in a fixed subspace of small codimension) and show that any approximate section of F is approximated by a section on some such chunk; this is all that is needed to prove the “2-to-2 games conjecture” in complexity theory, which is their ultimate goal.

So I found all this quite striking! Do questions about approximate global sections being approximated by global sections appear elsewhere? (The question as phrased here is already a bit weird from an algebraic geometry point of view, since it seems to require that you have or impose a probability measure on your set of open patches, but maybe that’s natural in some cases?)

Tagged , , ,

## Compressed sensing, compressed MDS, compressed clustering, and my talk tomorrow

I’m talking tomorrow at the spanking-new SILO seminar (“Systems, Learning, Information, and Optimization”) in the spanking-new Wisconsin Institute for Discovery.  As you may know, I don’t do any research in any of the four subjects appearing in the acronym!  But I’ve been following some work in the area ever since last year’s very successful Topology and Data seminar.  Here’s one semi-baked idea that will appear in the talk.

World’s most hurried introduction to compressed sensing (see Terry’s blog for a more thorough account.)

Think of an image on an N x N grid of pixels as a vector in R^(N^2).  We know that images can be compressed — that is, we can specify an image with reasonable accuracy using much less information.  That’s because most real-world images lie on or near a manifold in R^(N^2) of dimension much lower than N^2.

It’s already great that we can get away with fewer than N^2 coordinates to specify our image.  But it gets better still.  Maybe it’s expensive to record all N^2 pixels.  If we’re want to end up with (say) only N log N numbers, maybe we should try making only N log N measurements of our image in the first place?  If you just record a subset of N log N pixels, you’re probably in trouble; there may be localized features in the image that happen to be far from any of the pixels in your sparse set.  But if you measure N log N random linear functions on R^(N^2), it turns out that you’re very likely to have pinned down a single point on the manifold of interesting images!

Finite metric spaces

Suppose you’ve got a set S of objects, and a function d: S x S -> R, which you think of as a distance or dissimilarity function.  Could be texts, could be genomes, whatever.  An arbitrary finite metric space is a hard thing to get one’s head around, so it’s desirable to construct an embedding f: S -> R^k such that the Euclidean distance between f(x) and f(y) agrees as closely as possible with d(x,y).  (Think of k as fixed and small.)  The standard algorithm for this is called multidimensional scaling. It works well.  End of story.

Except — note that once again, we’ve compressed our data set.  It takes N^2 coordinates to specify a distance function, but only kN to specify an embedding.  So you can think of MDS as a compression algorithm for metric spaces!  And why does it work?  Because the set of N-point metrics that embed in R^k is a very low-dimensional submanifold of the N^2-dimensional space of general metrics.

So the example of compressed sensing leads us to ask — if you are willing to grant that your metric is approximately embeddable in R^k, can you reconstruct it from a sparse set of distances?  The answer is yes — you can do this with “Landmark MDS,” developed by Vin de Silva and Josh Tenenbaum, and, just as one might want, one can reconstruct the metric from a number of distances that’s linear in N.  I want to call this “Compressed MDS” — is that appropriate?  Will a randomly chosen set of cN distances specify the embedding with high probability?  Seems like it to me.

Multidimensional scaling is just one example of a metric embedding problem — you have a finite set S and a fixed metric space M (or, sometimes, a class of metrics from which M is to be drawn) and you want to find an embedding f: S -> M which respects distance as well as possible.  (This survey by Indyk and Matousek is a good resource for the general problem; and here’s a Newton Institute conference on metric embeddings from two weeks ago, with all the talks online for streaming.)  Of special interest is the case where M is required to be an ultrametric; to embed S in an ultrametric space is to identify S with the leaves of a weighted tree, which is more or less to impose a heirarchical clustering on S.  The space of ultrametrics on S is again of dimension much smaller than N^2, so, the heirarchical construction can be thought of as a compressed representation of the distance function d.  And again, one asks — can we reconstruct the ultrametric from a sparse set of distances?

Yes — a forthcoming paper of Eriksson, Dasarathy, Singh, and Nowak explains how.   N log N distances is enough.  (Update, Feb 2011:  now on the arXiv.)

Can one call this “compressed clustering”?