I tend to think of the Croot-Lev-Pach method (as used, for instance, in the cap set problem) as having to do with n-tensors, where n is bigger than 2. But you can actually also use it in the case of 2-tensors, i.e. matrices, to say things that (as far as I can see) are not totally trivial.

Write m_d for the number of squarefree monomials in x_1, .. x_n of degree at most d; that is,

$\displaystyle{ m_d = \sum_{i=0}^{d} \binom{n}{i} }.$

**Claim: **Let P be a polynomial of degree d in F_2[x_1, .. x_n] such that P(0) = 1. Write S for the set of nonzero vectors x such that P(x) = 1. Let A be a subset of F_2^n such that no two elements of A have difference lying in S. Then |A| ≤ 2m_{d/2}.

**Proof:** Write M for the A x A matrix whose (a,b) entry is P(a-b). By the Croot-Lev-Pach lemma, this matrix has rank at most 2m_{d/2}. By hypothesis on A, M is the identity matrix, so its rank is |A|.
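Here is a quick numerical sanity check of the rank bound, a sketch I added with my own helper names, not part of the original argument: for a random squarefree P of degree at most d with P(0) = 1, the matrix (P(a+b)) indexed by all of F_2^n should have F_2-rank at most 2m_{⌊d/2⌋}.

```python
# Sanity check (mine, not from the post): the Croot-Lev-Pach lemma says the
# matrix M[a][b] = P(a+b), indexed by all of F_2^n, has F_2-rank at most
# 2 * m_{floor(d/2)} whenever deg P <= d.  Here a+b is coordinatewise XOR.
import itertools
import random

def f2_rank(rows, nbits):
    """Rank over F_2 of row vectors encoded as bitmask integers."""
    basis = [0] * nbits  # basis[i] = stored row whose highest set bit is i
    rank = 0
    for r in rows:
        for bit in reversed(range(nbits)):
            if not (r >> bit) & 1:
                continue
            if basis[bit]:
                r ^= basis[bit]
            else:
                basis[bit] = r
                rank += 1
                break
    return rank

def clp_bound_holds(n=5, d=3, trials=20, seed=0):
    rng = random.Random(seed)
    # all squarefree monomials of degree <= d, as tuples of variable indices
    monos = [S for k in range(d + 1) for S in itertools.combinations(range(n), k)]
    m_half = sum(len(S) <= d // 2 for S in monos)  # m_{floor(d/2)}
    for _ in range(trials):
        # random multilinear P of degree <= d, with constant term 1 so P(0) = 1
        coeffs = [S for S in monos if S == () or rng.random() < 0.5]
        def P(x):  # evaluate P at the bitmask x, mod 2
            return sum(all((x >> i) & 1 for i in S) for S in coeffs) % 2
        rows = [sum(P(a ^ b) << b for b in range(2 ** n)) for a in range(2 ** n)]
        if f2_rank(rows, 2 ** n) > 2 * m_half:
            return False
    return True
```

When A avoids S-differences the submatrix indexed by A is the identity, so its rank |A| is caught by the same bound; the check above tests the rank statement itself on the full 2^n x 2^n matrix.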

*Remark:* I could have said “sum” instead of “difference” since we’re in F_2 but for larger finite fields you really want difference.

The most standard context in which you look for large subsets of F_2^n with restricted difference sets is that of *error correcting codes*, where you ask that no two distinct elements of A have difference with Hamming weight (that is, number of 1 entries) at most k.

It would be cool if the Croot-Lev-Pach lemma gave great new bounds on error-correcting codes, but I don’t think it’s to be. You would need to find a polynomial P which vanishes on all nonzero vectors of weight larger than k, but which doesn’t vanish at 0. Moreover, you already know that the balls of radius k/2 around the points of A are disjoint, which gives you the “volume bound”

|A| < 2^n / m_{k/2}.

I think that’ll be hard to beat.

If you just take a random polynomial P, the support of P will take up about half of F_2^n; so it’s not very surprising that a set whose difference misses that support has to be small!

Here’s something fun you can do, though. Let s_i be the i-th elementary symmetric function on x_1, … x_n. Then

$\displaystyle{ s_i(x) \equiv \binom{\mathrm{wt}(x)}{i} \pmod 2 }$

where wt(x) denotes Hamming weight. Recall also that the binomial coefficient

$\displaystyle{ \binom{k}{2^a} }$

is odd precisely when the a’th binary digit of k is 1.

Thus,

$\displaystyle{ P(x) = \prod_{a=0}^{b-1} \left(1 + s_{2^a}(x)\right) }$

is a polynomial of degree 2^b-1 which vanishes on x unless the last b binary digits of wt(x) are 0; that is, it vanishes unless wt(x) is a multiple of 2^b. Thus we get:

**Fact: **Let A be a subset of F_2^n such that the difference of two distinct elements of A never has weight a multiple of 2^b. Then

$\displaystyle{ |A| \leq 2 m_{2^{b-1}-1} = 2\sum_{i=0}^{2^{b-1}-1} \binom{n}{i} }.$

Note that this is pretty close to sharp! Because if we take A to be the set of vectors of weight at most 2^{b-1} – 1, then A clearly has the desired property, and already that’s half as big as the upper bound above. (What’s more, you can throw in all the vectors of weight 2^{b-1} whose first coordinate is 1; no two of these sum to something of weight 2^b. The Erdős–Ko–Rado theorem says you can do no better with those weight 2^{b-1} vectors.)
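Both parity facts used here are easy to check by machine: since the variables take values 0 and 1, s_i(x) is just the number of i-element subsets of the support of x, i.e. the integer binom(wt(x), i). A quick check I added (function names mine):

```python
# Check (mine): for 0/1 vectors, s_i(x) = C(wt(x), i) as integers, so the two
# parity statements can be tested weight by weight.
from math import comb

def lucas_digit_ok(max_w=256, max_a=7):
    """C(w, 2^a) is odd exactly when binary digit a of w is 1."""
    return all((comb(w, 2 ** a) % 2) == ((w >> a) & 1)
               for w in range(max_w) for a in range(max_a))

def product_vanishing_ok(b=3, max_w=256):
    """prod_{a<b} (1 + s_{2^a}(x)) is 1 mod 2 iff 2^b divides wt(x)."""
    for w in range(max_w):
        val = 1
        for a in range(b):
            val = val * (1 + comb(w, 2 ** a)) % 2
        if (val == 1) != (w % 2 ** b == 0):
            return False
    return True
```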

Is there an easier way to prove this?

When b=1, this just says that a set with no differences of even Hamming weight has size at most 2; that’s clear, because two vectors whose Hamming weights have the same parity differ by a vector of even weight. Even for b=2 this isn’t totally obvious to me. The result says that a subset of F_2^n with no differences of weight divisible by 4 has size at most 2+2n. On the other hand, you can get 2n by taking 0, all n weight-1 vectors, and all n-1 weight-2 vectors with first coordinate 1. So what’s the real answer: is it 2n, or 2+2n, or something in between?
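The very smallest case can be settled by computer. For n = 4 the only nonzero weight divisible by 4 is 4 itself, so this is a degenerate instance, but an exhaustive search (my code, not from the post) gives a maximum of exactly 8 there, below the 2+2n = 10 bound:

```python
# Exhaustive search (mine) for the largest A in F_2^n with no difference of
# weight a nonzero multiple of 4.  Only sensible for very small n.
def max_avoiding_multiples_of_4(n=4):
    pts = list(range(2 ** n))
    def forbidden(a, b):
        w = bin(a ^ b).count("1")
        return w > 0 and w % 4 == 0
    best = 0
    def grow(chosen, rest):
        nonlocal best
        best = max(best, len(chosen))
        for i, v in enumerate(rest):
            if all(not forbidden(v, c) for c in chosen):
                grow(chosen + [v], rest[i + 1:])
    grow([], pts)
    return best
```

For n = 4 the forbidden pairs are exactly the complementary pairs, so the answer 8 = 2^{n-1} follows by hand too; this says nothing about large n, of course.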

Write H(n,k) for the size of the largest subset of F_2^n having no two vectors differing by a vector of Hamming weight exactly k. Then if 2^b is the largest power of 2 less than n, we have shown above that

$\displaystyle{ H(n,2^b) \leq 2 m_{2^{b-1}-1} = 2\sum_{i=0}^{2^{b-1}-1} \binom{n}{i} }.$

On the other hand, if k is odd, then H(n,k) = 2^{n-1}; we can just take A to be the set of all even-weight vectors! So perhaps H(n,k) actually depends on k in some modestly interesting 2-adic way.

The sharpness argument above can be used to show that H(4m,2m) is at least

$\displaystyle{ 2\left(\sum_{i=0}^{m-1} \binom{4m}{i} + \binom{4m-1}{m-1}\right) } \qquad (*)$

I was talking to Nigel Boston about this — he did some computations which make it look like H(4m,2m) is *exactly* equal to (*) for m=1,2,3. Could that be true for general m?
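Tiny values of H(n,k) can be computed exhaustively. The naive search below (my code, not from the post) gives H(4,2) = 4, and also H(3,1) = 4 = 2^{3-1}, consistent with the odd-k remark above:

```python
# Naive exhaustive computation (mine) of H(n, k): the largest subset of F_2^n
# with no two elements differing in exactly k coordinates.  Exponential time,
# so only for very small n.
def H(n, k):
    pts = list(range(2 ** n))
    best = 0
    def grow(chosen, rest):
        nonlocal best
        best = max(best, len(chosen))
        for i, v in enumerate(rest):
            if all(bin(v ^ c).count("1") != k for c in chosen):
                grow(chosen + [v], rest[i + 1:])
    grow([], pts)
    return best
```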

(You could also ask about sets with no difference of weight a *multiple* of k; not sure which is the more interesting question…)

**Update**: Gil Kalai points out to me that much of this is very close to and indeed in some parts a special case of the Frankl-Wilson theorem… I will investigate further and report back!

For the record, here is a report of a failure. I spent the evening thinking about the Hamming code problem. I can show that you do NOT get a nontrivial bound using any polynomial $P$ where $P(x)$ depends only on the Hamming weight of $x$.

Notation: Set $H(c) = -c \log_2 c - (1-c)\log_2(1-c)$, the binary entropy function. So $m_{cn} = 2^{(H(c)+o(1))n}$ for $0 \leq c \leq 1/2$. We want a polynomial $P$ in $n$ variables which is nonzero at the origin and is supported on the Hamming ball of radius $k$. In order to get a nontrivial bound, we need $2 m_{d/2} < 2^n/m_{k/2}$, where $d = \deg P$. Asymptotically, this means we need $H(d/2n) + H(k/2n) < 1$.

We may as well assume that $P$ uses only square-free monomials. The coefficient of the monomial $\prod_{i \in S} x_i$ in $P$ is $\sum_y P(y)$, where we sum over $y$ with $y_i = 0$ for $i \not\in S$.

Now, it seems to me that the easiest way to get $P$ supported on the Hamming ball is that $P(y)$ is a function of the Hamming weight of $y$; say $P(y) = f(\mathrm{wt}(y))$. So the coefficient of $\prod_{i \in S} x_i$ is $\sum_j \binom{|S|}{j} f(j)$. Call this $g(|S|)$. We see that $g$ is an integer-valued polynomial of degree $\leq k$ with $g(e) \equiv 0 \bmod 2$ for $d < e \leq n$. Also, since $P(0) = 1$, the constant term of $g$ is nonzero and odd.

But induction on $N$ shows that, if $g$ is an integer-valued polynomial which is odd at some integer but takes even values at $N$ consecutive integers, then $\deg g \geq N$. (The base case is clear; then replace $g(e)$ by $g(e+1) - g(e)$ for the induction.) So $\deg g \geq n - d$, and since $\deg g \leq k$ we get $k + d \geq n$. Set $\kappa = k/n$ and $\delta = d/n$. So $\kappa + \delta \geq 1$, and we want $H(\kappa/2) + H(\delta/2) < 1$. Also, obviously, $\kappa \leq 1$ and $\delta \leq 1$.

But $H$ is convex down, so the minimum of $H(\kappa/2) + H(\delta/2)$ must occur at one of the corners of the triangle $\{\kappa + \delta \geq 1,\ \kappa \leq 1,\ \delta \leq 1\}$, and the values at the corners $(\kappa,\delta) = (1,0)$, $(0,1)$, $(1,1)$ are $1$, $1$ and $2$. We lose.
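A one-line numeric check of those corner values (my addition; `bin_entropy` is my name for the binary entropy function):

```python
# Quick numeric check (mine) of the corner values of H(kappa/2) + H(delta/2)
# on the triangle {kappa + delta >= 1, kappa <= 1, delta <= 1}.
from math import log2

def bin_entropy(c):
    """Binary entropy, with the convention H(0) = H(1) = 0."""
    return 0.0 if c in (0.0, 1.0) else -c * log2(c) - (1 - c) * log2(1 - c)

corner_vals = [bin_entropy(k / 2) + bin_entropy(d / 2)
               for k, d in [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]]
# The objective at the three corners, hence its minimum over the triangle,
# never drops below 1.
```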

The formula which didn’t parse is that we sum over $y$ with $y_i = 0$ for all $i \not\in S$.
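That coefficient formula is easy to test numerically: the coefficient of a squarefree monomial indexed by $S$ is the sum of $P(y)$ over $y$ supported inside $S$, and when $P(y) = f(\mathrm{wt}(y))$ it should collapse to $\sum_j \binom{|S|}{j} f(j) \bmod 2$. A quick check I added (function names mine):

```python
# Check (mine): for P(y) = f(wt(y)) on F_2^n, the coefficient of the monomial
# indexed by S, computed directly as sum_{T subset of S} f(|T|) mod 2, equals
# g(|S|) = sum_j C(|S|, j) f(j) mod 2.
from math import comb
import random

def g_formula_ok(n=6, trials=10, seed=1):
    rng = random.Random(seed)
    for _ in range(trials):
        f = [rng.randint(0, 1) for _ in range(n + 1)]  # f(j), j = 0..n
        for S in range(2 ** n):  # S as a bitmask
            tot, T = 0, S
            while True:  # enumerate every submask T of S
                tot ^= f[bin(T).count("1")]
                if T == 0:
                    break
                T = (T - 1) & S
            e = bin(S).count("1")
            if tot != sum(comb(e, j) * f[j] for j in range(e + 1)) % 2:
                return False
    return True
```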

Conceptually, the obstacle here seems to be a sort of “uncertainty principle”. Let $P$ be a function on $\mathbb{F}_2^n$ and let $\widehat{P}$ be the coefficients of the unique polynomial interpolating $P$ with square-free monomials. You want $P$ supported on the Hamming ball of radius $k$ and $\widehat{P}$ supported on the Hamming ball of radius $d$. It seems that it is impossible to have both $k$ and $d$ small. Would it be too rash to conjecture $k + d \geq n$?

This is an awesome phrasing!

In retrospect, I shouldn’t have expected to get a new bound, because everything I was doing was heading towards the leading term, and the volume bound on the leading term is already tight by Shannon’s argument about random codes. Of course, I might have gotten lucky and matched the leading terms, at which point I could see if I proved anything new about the remaining ones, but there was no reason to expect it.

But I’m glad I thought about this, because the uncertainty formulation is a really nice question! I’m going to think about this more.

The uncertainty principle holds. Indeed, much more is true.

Let $p$ be any prime, let $C = \{0, 1, \ldots, p-1\}^n$ and let $P$ be a function $C \to \mathbb{F}_p$.

Define

$\displaystyle{\widehat{P}(y_1, y_2, \ldots, y_n) = \sum_{(x_1, \ldots, x_n) \in C} P(x_1, \ldots, x_n) y_1^{x_1} y_2^{x_2} \cdots y_n^{x_n} }.$

Theorem: If $P$ is any nonzero function, then there are $x$ and $y$ in $C$ with $x_i + y_i \geq p-1$ for all $i$, $P(x) \neq 0$ and $\widehat{P}(y) \neq 0$.

Proof: We induct on $n$; the base case $n = 0$ is vacuous.

We’ll write $x'$ for $(x_1, x_2, \ldots, x_{n-1})$.

Let $e$ be maximal such that $P$ restricted to $\{x_n = e\}$ is nonzero.

So $\widehat{P}$ is a polynomial of degree $e$ in the last variable $y_n$.

For $0 \leq c \leq p-1$, let $P_c(x') = P(x', c)$ and let $\widehat{P_c}$ be the “Fourier” transform of $P_c$ in the first $n-1$ variables.

So $\widehat{P_e}(y')$ is the leading coefficient of $\widehat{P}(y', y_n)$ as a polynomial in $y_n$.

By induction, there are $x'$ and $y'$ in $\{0, \ldots, p-1\}^{n-1}$ with $x'_i + y'_i \geq p-1$ for all $i$, such that $P_e(x') \neq 0$ and $\widehat{P_e}(y') \neq 0$. Then $\widehat{P}(y', y_n)$ is a degree $e$ polynomial in $y_n$, so it has at most $e$ roots. We deduce that $\widehat{P}(y', y_n)$ is nonzero for some $y_n$ among the $e+1$ values $p-1-e, \ldots, p-1$. Taking $x = (x', e)$ and $y = (y', y_n)$ proves the result.
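For tiny $p$ and $n$ the theorem can also be confirmed by brute force over every nonzero $P$ (my sketch; function names mine):

```python
# Brute-force verification (mine) of the uncertainty theorem for tiny p, n:
# every nonzero P : {0,...,p-1}^n -> F_p admits x, y with P(x) != 0,
# P_hat(y) != 0 and x_i + y_i >= p - 1 for all i.
from itertools import product

def monomial(y, x, p):
    """y^x = prod_i y_i^{x_i} mod p, with the convention 0^0 = 1."""
    out = 1
    for yi, xi in zip(y, x):
        out = out * pow(yi, xi, p) % p
    return out

def uncertainty_ok(p, n):
    C = list(product(range(p), repeat=n))
    # precompute the monomial values y^x for all pairs, row per y
    mono = [[monomial(y, x, p) for x in C] for y in C]
    # index pairs (i, j) with x_i + y_i >= p - 1 in every coordinate
    good = [(i, j) for i, x in enumerate(C) for j, y in enumerate(C)
            if all(xi + yi >= p - 1 for xi, yi in zip(x, y))]
    for values in product(range(p), repeat=len(C)):
        if not any(values):
            continue  # skip the zero function
        Phat = [sum(v * m for v, m in zip(values, row)) % p for row in mono]
        if not any(values[i] and Phat[j] for i, j in good):
            return False
    return True
```

For $p = 2$ the condition $x_i + y_i \geq 1$ says the supports of $x$ and $y$ cover everything, which is exactly the $k + d \geq n$ statement above.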