Here are three functions. Let N be an integer, and consider:
- G_1(N), the size of the largest subset S of 1..N containing no 3-term arithmetic progression;
- G_2(N), the largest M such that there exist subsets S,T of 1..N with |S| = |T| = M such that the equation s_i + t_i = s_j + t_k has no solutions with (j,k) not equal to (i,i). (This is what’s called a tri-colored sum-free set.)
- G_3(N), the largest M such that the following is true: given subsets S,T of 1..N, there always exist subsets S’ of S and T’ of T with |S’| + |T’| = M and
You can see that G_1(N) <= G_2(N) <= G_3(N). Why? Because if S has no 3-term arithmetic progression, we can take S = T and s_i = t_i, and get a tri-colored sum-free set. Now suppose you have a tri-colored sum-free set (S,T) of size M; if S’ and T’ are subsets of S and T respectively, and , then for every pair (s_i,t_i), you must have either s_i in S’ or t_i in T’; thus |S’| + |T’| is at least M.
When the interval 1..N is replaced by the group F_q^n, the Croot-Lev-Pach-Ellenberg-Gijswijt argument shows that G_1(F_q^n) is bounded above by the number of monomials of degree at most (q-1)n/3; call this quantity M(F_q^n). In fact, G_3(F_q^n) is bounded above by M(F_q^n), too (see the note linked from this post) and the argument is only a modest extension of the proof for G_1. For all we know, G_1(F_q^n) might be much smaller, but Kleinberg has recently shown that G_2(F_2^n) (whence also G_3(F_2^n)) is equal to M(F_2^n) up to subexponential factors, and work in progress by Kleinberg and Speyer has shown this for several more q and seems likely to show that the bound is tight in general. On the other hand, I have no idea whether to think G_1(F_q^n) is actually equal to M(F_q^n); i.e. is the bound proven by me and Dion sharp?
The behavior of G_1(N) is, of course, very much studied; we know by Behrend (recently sharpened by Elkin) that G_1(N) is at least N/exp(c sqrt(log N)). Roth proved that G_1(N) = o(N), and the best bounds, due to Tom Sanders, show that G_1(N) is O(N(log log N)^5 / log N). (Update: Oops, no! Thomas Bloom has an upper bound even a little better than Sanders, change that 5 to a 4.)
What about G_2(N) and G_3(N)? I’m not sure how much people have thought about these problems. But if, for instance, you could show (for example, by explicit constructions) that G_3(N) was closer to Sanders than to Behrend/Elkin, it would close off certain strategies for pushing the bound on G_1(N) downward. (Update: Jacob Fox tells me that you can get an upper bound for G_2(N) of order N/2^{clog* N} from his graph removal paper, applied to the multicolored case.)
Do we think that G_2(N) and G_3(N) are basically equal, as is now known to be the case for F_q^n?