Difference sets missing a Hamming sphere

I tend to think of the Croot-Lev-Pach method (as used, for instance, in the cap set problem) as having to do with n-tensors, where n is bigger than 2.  But you can actually also use it in the case of 2-tensors, i.e. matrices, to say things that (as far as I can see) are not totally trivial.

Write m_d for the number of squarefree monomials in x_1, .. x_n of degree at most d; that is,

m_d = 1 + {n \choose 1} + {n \choose 2} + \ldots + {n \choose d}.

Claim:  Let P be a polynomial of degree d in F_2[x_1, .. x_n] such that P(0) = 1.  Write S for the set of nonzero vectors x such that P(x) = 1.  Let A be a subset of F_2^n such that no two elements of A have difference lying in S.  Then |A| < 2m_{d/2}.

Proof:  Write M for the A x A matrix whose (a,b) entry is P(a-b).  By the Croot-Lev-Pach lemma, this matrix has rank at most 2m_{d/2}.  By hypothesis on A, M is the identity matrix, so its rank is |A|.

Remark: I could have said “sum” instead of “difference” since we’re in F_2 but for larger finite fields you really want difference.

The most standard context in which you look for large subsets of F_2^n with restricted difference sets is that of error correcting codes, where you ask that no two distinct elements of A have difference with Hamming weight (that is, number of 1 entries) at most k.

It would be cool if the Croot-Lev-Pach lemma gave great new bounds on error-correcting codes, but I don’t think it’s to be.  You would need to find a polynomial P which vanishes on all nonzero vectors of weight larger than k, but which doesn’t vanish at 0. Moreover, you already know that the balls of size k/2 around the points of A are disjoint, which gives you the “volume bound”

|A| < 2^n / m_{k/2}.

I think that’ll be hard to beat.

If you just take a random polynomial P, the support of P will take up about half of F_2^n; so it’s not very surprising that a set whose difference misses that support has to be small!

Here’s something fun you can do, though.  Let s_i be the i-th symmetric function on x_1, … x_n.  Then

s_i(x) = {wt(x) \choose i}

where wt(x) denotes Hamming weight.  Recall also that the binomial coefficient

{k \choose 2^a}

is odd precisely when the a’th binary digit of k is 1.



is a polynomial of degree 2^b-1 which vanishes on x unless the last b digits of wt(x) are 0; that is, it vanishes unless wt(x) is a multiple of 2^b.  Thus we get:

Fact:  Let A be a subset of F_2^n such that the difference of two nonzero elements in A never has weight a multiple of 2^b.  Then

|A| \leq 2m_{2^{b-1} - 1}.

Note that this is pretty close to sharp!  Because if we take A to be the set of vectors of  weight at most 2^{b-1} – 1, then A clearly has the desired property, and already that’s half as big as the upper bound above.  (What’s more, you can throw in all the vectors of weight 2^{b-1} whose first coordinate is 1; no two of these sum to something of weight 2^b.  The Erdös-Ko-Rado theorem says you can do no better with those weight 2^{b-1} vectors.)

Is there an easier way to prove this?

When b=1, this just says that a set with no differences of even Hamming weight has size at most 2; that’s clear, because two vectors whose Hamming weight has the same parity differ by a vector of even weight.  Even for b=2 this isn’t totally obvious to me.  The result says that a subset of F_2^n with no differences of weight divisible by 4 has size at most 2+2n.  On the other hand, you can get 1+2n by taking 0, all weight-1 vectors, and all weight-2 vectors with first coordinate 1.  So what’s the real answer, is it 1+2n or 2+2n?

Write H(n,k) for the size of the largest subset of F_2^n having no two vectors differing by a vector of Hamming weight exactly k.  Then if 2^b is the largest power of 2 less than n, we have shown above that

m_{2^{b-1} - 1 } \leq H(n,2^b) \leq 2m_{2^{b-1} - 1}.

On the other hand, if k is odd, then H(n,k) = 2^{n-1}; we can just take A to be the set of all even-weight vectors!  So perhaps H(n,k) actually depends on k in some modestly interesting 2-adic way.

The sharpness argument above can be used to show that H(4m,2m) is as least

2(1 + 4m + {4m \choose 2} + \ldots + {4m \choose m-1} + {4m-1 \choose m-1}). (*)

I was talking to Nigel Boston about this — he did some computations which make it looks like H(4m,2m) is exactly equal to (*) for m=1,2,3.  Could that be true for general m?

(You could also ask about sets with no difference of weight a multiple of k; not sure which is the more interesting question…)

Update:  Gil Kalai points out to me that much of this is very close to and indeed in some parts a special case of the Frankl-Wilson theorem…  I will investigate further and report back!

Tagged , , ,

6 thoughts on “Difference sets missing a Hamming sphere

  1. David Speyer says:

    For the record, here is a report of a failure. I spent the evening thinking about the Hamming code problem. I can show that you do NOT get a nontrivial bound using any polynomial P where P(\vec{x}) depends only on the Hamming weight of \vec{x}.

    Notation: Set \eta(p) = - p \log p - (1-p) \log (1-p). So m_r \approx \exp( n \eta(r/n)). We want a polynomial P in n variables which is 1 at the origin and is supported on the Hamming ball of radius k. In order to get a nontrivial bound, we need 2 m_{d/2} < 2^n/m_{k/2}. Asymptotically, this means we need \eta(d/(2n)) + \eta(k/(2n)) < \log 2.

    We may as well assume that P uses only square-free monomials. The coefficient of the monomial t_{i_1} t_{i_2} \cdots t_{i_r} in P is \sum P(\vec{x}) where we sum over \vec{x} with x_j=0 for j \not \in \{ i_1, \ldots, i_r \}}.

    Now, it seems to me that the easiest way to get P supported on the Hamming ball is that P(\vec{x}) is a function of the Hamming weight of \vec{x}; say P(\vec{x}) = h(wt(x)). So the coefficient of t_{i_1} t_{i_2} \cdots t_{i_r} is \sum_{s=0}^k h(s) \binom{r}{s}. Call this F(r). We see that F(r) is an integer-valued polynomial of degree \leq k with F(r)=0 for d+1 \leq r \leq n. Also, since P(0)=1, the constant term of P is nonzero and F(0)=1.

    But induction on m shows that, if G is a nonzero integer valued polynomial with m consecutive zeroes, then \deg G \geq m. (Base case m=0 is clear, then replace G by G(x+1)-G(x) for the induction.) So n-d \leq \deg F \leq k and we get k+d \geq n. Set x=k/(2n) and y=d/(2n). So x+y \geq 1/2 and we want \eta(x)+\eta(y) < \log 2. Also, obviously, x and y<1/2.

    But \eta(x)+\eta(y) is convex down, so the minimum of \eta(x)+\eta(y) must occur at one of the corners of the triangle \{ x+y \geq 1/2, x,y \leq 1/2 \}, and the values at the corners are \log 2, \log 2 and 2 \log 2. We lose.

  2. David Speyer says:

    The formula which didn’t parse is that we sum over \vec{x} with x_j=0 for all j \not \in \{ i_1, \ldots, i_r \}.

  3. David Speyer says:

    Conceptually, the obstacle here seems to be a sort of “uncertainty principle”. Let P be a function on \mathbb{F}_2^n and let \hat{P} be the coefficients of the unique polynomial interpolating P with square-free monomials. You want P supported on the Hamming ball of radius k and \hat{P} on the Hamming ball of radius d$. It seems that it is impossible to have both d and k small. Would it be too rash to conjecture d+k \geq n?

  4. JSE says:

    This is an awesome phrasing!

  5. David Speyer says:

    In retrospect, I shouldn’t have expected to get a new bound, because everything I was doing was heading towards the leading term, and the volume bound on the leading term is already tight by Shannon’s argument about random codes. Of course, I might have gotten lucky and matched the leading terms, at which point I could see if I proved anything new about the remaining ones, but there was no reason to expect it.

    But I’m glad I thought about this, because the uncertainty formulation is a really nice question! I’m going to think about this more.

  6. David Speyer says:

    The uncertainty principle holds. Indeed, much more is true.

    Let p be any prime, let C = \{ 0,1,\ldots, p-1 \}^n and let P be a function C \to \mathbb{F}_p.

    $\displaystyle{\widehat{P}(y_1, y_2, \ldots, y_n) = \sum_{(x_1, \ldots, x_n) \in C} P(x_1, \ldots, x_n) y_1^{x_1} y_2^{x_2} \cdots y_n^{x_n} }.$

    Theorem: If P : C \to \mathbb{F}_p is any nonzero function, then there are \vec{x} and \vec{y} \in C with P(\vec{x}) \neq 0, \widehat{P}(\vec{y}) \neq 0 and \vec{x}+\vec{y} \geq (p-1,p-1, \ldots, p-1).

    Proof: We induct on n; the base case n=0 is vacuous.

    We’ll write C_{-} for \{ 0,1,\ldots, p-1 \}^{n-1}.
    Let d be maximal such that P restricted to C_{-} \times \{ d \} is nonzero.
    So \widehat{P} is a polynomial of degree d in the last variable.

    For \vec{x}_{-} \in C_{-}, let P_{-}(\vec{x}_{-}) = P(\vec{x}_{-}, d) and let \widehat{P}_{-} be the “Fourier” transform of P_{-}.
    So \widehat{P}_{-}(\vec{y}_{-}) is the leading coefficient of \widehat{P}(\vec{y}_{-}, z) as a polynomial in z.

    By induction, there are \vec{x}_{-} and \vec{y}_- in C_0 with \vec{x}_{-} + \vec{y}_{-} \geq (p-1,p-1,\ldots, p-1), such that P(\vec{x}_-, d) \neq 0 and \widehat{P}_0(\vec{y}_-) \neq 0. Then \widehat{P}(\vec{y}_-, z) is a degree d polynomial in z, so it has at most d roots. We deduce that \widehat{P}(\vec{y}_-, e) is nonzero for some e \in \{ p-d-1,p-d,p-d+1,\cdots, p-1 \}. Taking \vec{x} = (\vec{x}_-, d) and \vec{y} = (\vec{y}_-,e) proves the result. \square

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: