## Bounds for cap sets

Briefly:  it seems to me that the idea of the Croot-Lev-Pach paper I posted about yesterday can indeed be used to give a new bound on the size of subsets of F_3^n with no three-term arithmetic progression! Such a set has size at most (2.756)^n. (There’s actually a closed form for the constant, I think, but I haven’t written it down yet.)

Here’s the preprint. It’s very short. I’ll post this to the arXiv in a day or two, assuming I (or you) don’t find anything wrong with it, so comment if you have comments!

Update:  Busy few days of administrative stuff and travel, sorry for not having updated the preprint yet, will try to finish it today.  One note, already observed below in the comments:  you get a similar bound for subsets of (F_q)^n free of solutions to (ax+by+cz=0) for any (a,b,c) with a+b+c=0; the cap set case is q=3, (a,b,c) = (1,1,1).

Update 2:  Dion Gijswijt and I will be submitting this result as a joint paper, which will amalgamate the presentations of our essentially identical arguments.  Dion carried out his work independently of mine at around the same time, and the idea should be credited to both of us.

## Croot-Lev-Pach on AP-free sets in (Z/4Z)^n

As you know I love the affine cap problem:  how big can a subset of (Z/3Z)^n be that contains no three elements summing to 0 — or, in other words, that contains no 3-term arithmetic progression?  The best upper bounds, due to Bateman and Katz, are on order 3^n / n^(1+epsilon).  And I think it’s fair to say that all progress on this problem, since Meshulam’s initial results, have come from Fourier-analytic arguments.

So I’m charmed by this paper of Ernie Croot, Vsevolod Lev, and Peter Pach which proves a much stronger result for A = (Z/4Z)^n:  a subset with no 3-term arithmetic progression has size at most c^n for c strictly less than 4.  Better still (for an algebraic geometer) the argument has no harmonic analysis at all, but proceeds via the polynomial method!

This is surprising for two reasons.  First, it’s hard to make the polynomial method work well for rings, like Z/4Z, that aren’t fields; extending our knowledge about additive combinatorics to such settings is a long-standing interest of mine.  Second, the polynomial method over finite fields usually works in the “fixed dimension large field” regime; problems like affine cap, where the base ring is fixed and the dimension are growing, have so far been mostly untouched.

As for the first issue, here’s the deal.  This looks like a problem over Z/4Z but is really a problem over F_2, because the condition for being a 3-term AP

a – 2b + c = 0

has a 2 in it.  In other words:  the two outer terms have to lie in the same coset of 2A, and the middle term is only determined up to 2A.

So CLP recast the problem as follows.  Let S be a large subset of A with no 3-term AP.   Let V be 2A, which is an n-dimensional vector space over F_2.  For each v in V, there’s a coset of V consisting of the solutions to 2a = v, and we can let S_v be the intersection of S with this coset.

We want to make this a problem about V, not about A.  So write T_v for a translate of S_v by some element of the coset, so T_v now sits in V.  Which element?  Doesn’t matter!

We can now write the “no 3-term AP” condition strictly in terms of these subsets of V.  Write (T_v – T_v)^* for the set of differences between distinct elements of T_v.  Write U for the set of v in V such that T_v is nonempty.  Then the union over all v in U of

(T_v – T_v)^* + v

is disjoint from U.

I leave it as an exercise to check the equivalence.

Now we have a combinatorial question about vector spaces over F_2; we want to show that, under the condition above, the sum of |T_v| over all v in U can’t be too large.

This is where the polynomial method comes in!  CLP show that (over any field, not just F_2), a polynomial of low degree vanishing on (T_v – T_v)^* has to vanish at 0 as well; this is Lemma 1 in their paper.  So write down a polynomial P vanishing on V – U; by dimension considerations we can choose one which doesn’t vanish on all of V.  (This uses the fact that the squarefree monomials of degree up to d are linearly independent functions on F_2^n.)  If U is big, we can choose P to have lowish degree.

Since P vanishes on V-U, P has to vanish on (T_v – T_v)^* + v for all v.  Since P has low degree, it has to vanish on v too, for all v.  But then P vanishes everywhere, contrary to our assumption.

The magic of the paper is in Lemma 1, in my view, which is where you really see the polynomial method applied in this unusual fixed-field-large-dimension regime.  Let me say a vague word about how it works.  (The actual proof is less than a page, by the way, so I’m not hiding much!)  Let P be your polynomial and d its degee.  You send your vector space into a subvariety of a much larger vector space W via degree-d Veronese embedding F_d. In fact you do this twice, writing

V x V -> W x W.

Now if P is your polynomial of degree-d, you can think of P(v_1 – v_2) as a bilinear form <,> on W x W.  Suppose S is a subset of V such that P(s_1 – s_2) vanishes for all distinct s_1, s_2 in S.   That means

<F_d(s_1), F_d(s_2)> = 0

for all distinct s_1, s_2 in S.  On the other hand,

<F_d(s_1), F_d(s_1)>

doesn’t depend on s_1; it just takes the value P(0).  So if P(0) is not equal to 0, you have |S| vectors of nonzero norm which are mutually orthogonal under this bilinear form, and so there can be at most dim W of these, and that’s the bound on |S| you need.

This is very slick and I hope the idea is more generally applicable!

## An incidence conjecture of Bourgain over fields of positive characteristic (with Hablicsek)

Marci Hablicsek (a finishing Ph.D. student at UW) and I recently posted a new preprint, “An incidence conjecture of Bourgain over fields of finite characteristic.”

The theme of the paper is a beautiful theorem of Larry Guth and Nets Katz, one of the early successes of Dvir’s “polynomial method.”  They proved a conjecture of Bourgain:

Given a set S of points in R^3, and a set of N^2 lines such that

• No more than N lines are contained in any plane;
• Each line contains at least N points of S;

then S has at least cN^3 points.

In other words, the only way for a big family of lines to have lots of multiple intersections is for all those lines to be contained in a plane.  (In the worst case where all the lines are in a plane, the incidences between points and lines are governed by the Szemeredi-Trotter theorem.)

I saw Nets speak about this in Wisconsin, and I was puzzled by the fact that the theorem only applied to fields of characteristic 0, when the proof was entirely algebraic.  But you know the proof must fail somehow in characteristic p, because the statement isn’t true in characteristic p.  For example, over the field k with p^2 elements, one can check that the Heisenberg surface

$X: x - x^p + yz^p - zy^p = 0$

has a set of p^4 lines, no more than p lying on any plane, and such that each line contains at least p^2 elements of X(k).  If the Guth-Katz theorem were true over k, we could take N = p^2 and conclude that |X(k)| is at least p^6.  But in fact, it’s around p^5.

It turns out that there is one little nugget in the proof of Guth-Katz which is not purely algebraic.  Namely:  they show that a lot of the lines are contained in some surface S with the following property;  at every smooth point s of S, the tangent plane to S at s intersects S with multiplicity greater than 2.  They express this in the form of an assertion that a certain curvature form vanishes everywhere.  In characteristic 0, this implies that S is a plane.  But not so in characteristic p!  (As always, the fundamental issue is that a function in characteristic p can have zero derivative without being constant — viz., x^p.)  All of us who did the problems in Hartshorne know about the smooth plane curve over F_3 with every point an inflection point.  Well, there are surfaces like that too (the Heisenberg surface is one such) and the point of the new paper is to deal with them.  In fact, we show that the Guth-Katz theorem is true word for word as long as you prevent lines not only from piling up in planes but also from piling up in these “flexy” surfaces.

It turns out that any such surface must have degree at least p, and this enables us to show that the Guth-Katz theorem is actually true, word for word, over the prime field F_p.

If you like, you can think of this as a strengthening of Dvir’s theorem for the case of F_p^3.  Dvir proves that a set of p^2 lines with no two lines in the same direction fills up a positive-density subset of the whole space.  What we prove is that the p^2 lines don’t have to point in distinct directions; it is enough to impose the weaker condition that no more than p of them lie in any plane; this already implies that the union of the lines has positive density.  Again, this strengthening doesn’t hold for larger finite fields, thanks to the Heisenberg surface and its variants.

This is rather satisfying, in that there are other situations in this area (e.g. sum-product problems) where there are qualitatively different bounds depending on whether the field k in question has nontrivial subfields or not.  But it is hard to see how a purely algebraic argument can “see the difference” between F_p and F_{p^2}.  The argument in this paper shows there’s at least one way this can happen.

Satisfying, also, because it represents an unexpected application for some funky characteristic-p algebraic geometry!  I have certainly never needed to remember that particular Hartshorne problem in my life up to now.

## This Week’s Finds In Number Theory

Twenty years ago yesterday, John Baez posted the first installment of This Week’s Finds in Mathematical Physics.  In so doing, he invented the math blog, and, quite possibly, the blog itself.  A lot of mathematicians of my generation found in John’s blog an accessible, informal, but never dumbed-down window beyond what we were learning in classes, into the messy and contentious ground of current research.  And everybody who blogs now owes him a gigantic debt.

In his honor I thought it would be a good idea to post a “This Week’s Finds” style post of my own, with capsule summaries of a few papers I’ve recently noted with pleasure and interest.  I won’t be able to weave these into a story the way John often did, though!  Nor will there be awesome ASCII graphics.  Nor will any of the papers actually be from this week, because I’m a little behind on my math.NT abstract scanning.

If you run a math blog, please consider doing the same in your own field!  I’ll link to it.

Update:  It begins!  Valeria de Palva offers This Week’s Finds In Categorical Logic.  And Matt Ward, a grad student at UW-Seattle, has This Week’s Finds in Arithmetic Geometry.

1)  “On sets defining few ordinary lines,” by Ben Green and Terry Tao.

The idea that has launched a thousand papers in additive combinatorics:  if you are a set approximately closed under some kind of relation, then you are approximately a set which is actually closed under that kind of relation.  Subset of a group mostly closed under multiplication?  You must be close to an honest subgroup.  Subset of Z with too many pair-sums agreeing?  You have an unusually large intersection with an authentic arithmetic progression.  And so on.

This new paper considers the case of sets in R^2 with few ordinary lines; that is, sets S such that most lines that intersect S at all intersect S in three or more points.  How can you cook up a set of points with this property?  There are various boring ways, like making all the points collinear.  But there’s only one interesting way I can think of:  have the points form an “arithmetic progression” …,-3P,-2P, -P, P,2P,3P, …. in an elliptic curve!  (A finite subgroup also works.)  Then the usual description of the group law on the curve tells us that the line joining two points of S quite often passes through a third.  Green and Tao prove a remarkable quasi-converse to this fact:  if a set has few ordinary lines, it must be concentrated on a cubic algebraic curve!  This might be my favorite “approximately structured implies approximates a structure” theorem yet.

2) “Asymptotic behavior of rational curves,” by David Bourqui.  Oh, I was about to start writing this but when I searched I realized I already blogged about this paper when it came out!  I leave this here because the paper is just as interesting now as it was then…

3) “The fluctuations in the number of points of smooth plane curves over finite fields,” by Alina Bucur, Chantal David, Brooke Feigon, and Matilde Lalin;

“The probability that a complete intersection is smooth,” by Alina Bucur and Kiran Kedlaya;

“The distribution of the number of points on trigonal curves over F_q,” by Melanie Matchett Wood;

“Semiample Bertini theorems over finite fields,” by Daniel Erman and Melanie Matchett Wood.

How many rational points does a curve over F_q have?  We discussed this question here a few years ago, coming to no clear conclusion.  I still maintain that if the curve is understood to vary over M_g(F_q), with q fixed and g growing, the problem is ridiculously hard.

But in more manageable families of curves, we now know a lot more than we did in 2008.

You might guess, of course, that the average number of points should be q+1; if you have to reason to think of Frobenius as biased towards having positive or negative trace, why not guess that the trace, on average, is 0?  Bucur-David-Feigon-Lalin prove that this is exactly the case for a random smooth plane curve.  It’s not hard to check that this holds for a random hyperelliptic curve as well.  But for a random trigonal curve, Wood proves that the answer is different — the average is slightly less than q+2!

Where did the extra point come from?

Here’s one way I like to think of it.  This is very vague, and proves nothing, of course.  The trigonal curve X has a degree-3 map to P^1, which is ramified at some divisor D in P^1.  If D is a random divisor, it has one F_q-point on average.  How many F_q-points on X lie over each rational point P of D?  Well, generically, the ramification is going to be simple, and this means that there are two rational points over D; the branch point, and the unique unramified point.  Over every other F_q-point of D, the Frobenius action on the preimage in X should be a random element of S_3, with an average of one fixed point.  To sum up, in expectation we should see q rational points of X over q non-branch rational points of P^1, and 2 rational points of X over a single rational branch point in P^1, for a total of q+2.

(Erman and Wood, in a paper released just a few months ago, prove much more general results of a similar flavor about smooth members of linear systems on P^1 x P^1 (or other Hirzebruch surfaces, or other varieties entirely) which are semiample; for instance, they may have a map to P^1 which stays constant in degree, while their intersection with another divisor gets larger and larger.)

Most mysterious of all is the theorem of Bucur and Kedlaya, which shows (among other things) that if X is a random smooth intersection of two hypersurfaces of large degree in P^3, then the size of |X(F_q)| is slightly less than q+1 on average.  For this phenomenon I have no heuristic explanation at all.  What’s keeping the points away?

## A sum-product theorem in function fields (Bloom, Jones)

About a year ago I wrote about the handsome theorem of Dummit and Hablicsek that there exist Besicovich sets over F_q[[t]]; that is, there are sets of measure 0 which contain a line in every direction.  As I explained in that post, power series rings over finite fields are promising intermediate contexts for problems in additive combinatorics; like the real numbers, they have infinitely many scales, but unlike the real numbers, those scales naturally form a discrete set, and the whole ring decomposes in some iterative sense into a bunch of copies of a finite field, where things are much simpler (though by no means simple!)

But Dummit and Habliscek remained the only paper on additive combinatorics over non-archimedean local rings — until now!  A new preprint by Bloom and Jones (Ph.D. students at Bristol) shows that if A is a finite subset of the ring of finite-headed Laurent series F_q((1/t)), then

$max(|A+A|, |A \times A|) > |A|^{6/5 - \epsilon}$.

The implicit constant depends on q, as it must, since one can get |A + A| = |A x A| = |A| by taking A to be the field of constants F_q itself.

This kind of sum-product problem has been the subject of sustained interest for a long time; the exponent is supposed to be 2 (i.e, either the sums are more or less distinct from each other or the products are) but nothing close to this has been obtained in any case.  The exponent here is better than the best known for finite fields of prime order, but worse than the best known for the real numbers.

I haven’t read the paper yet, so I can’t say anything incisive about the methods, but it’s a striking result!

## “Kakeya sets over non-archimedean local rings,” by Dummit and Hablicsek

A new paper posted this week on the arXiv this week by UW grad students Evan Dummit and Márton Hablicsek answers a question left open in a paper of mine with Richard Oberlin and Terry Tao.  Let me explain why I was interested in this question and why I like Evan and Marci’s answer so much!

Recall:  a Kakeya set in an n-dimensional vector space over a field k is a set containing a line (or, in the case k = R, a unit line segment) in every direction.  The “Kakeya problem,” phrased loosely, is to prove that Kakeya sets cannot be too small.

But what does “small” mean?  You might want it to mean “measure 0” but for the small but important fact that in this interpretation the problem has a negative answer:  as Besicovitch discovered in 1919, there are Kakeya sets in R^2 with measure 0!  So Kakeya’s conjecture concerns a stronger notion of “small”  — he conjectures that a Kakeya set in R^n cannot have Hausdorff or Minkowski dimension strictly smaller than n.

(At this point, if you haven’t thought about the Kakeya conjecture before, you might want to read Terry’s long expository post about the Kakeya conjecture and Dvir’s theorem; I cannot do it any better here.)

The big recent news in this area, of course, is Dvir’s theorem that that the Kakeya conjecture is true when k is a finite field.

Of course one hopes that Dvir’s argument will give some ideas for an attack on the original problem in R^n.  And that hasn’t happened yet; though the “polynomial method,” as the main idea of Dvir’s theorem is now called, has found lots of applications to other problems in real combinatorial geometry (e.g. Guth and Katz’s proof of the joints conjecture.)

Why not Kakeya?  Well, here’s one clue.  Dvir actually proves more than the Kakeya conjecture!  He proves that a Kakeya set in F_q^n has positive measure.

(Note:  F_q^n is a finite set, so of course any nonempty subset has positive measure; so “positive measure” here is shorthand for “there’s a lower bound for the measure which is bounded away from 0 as q grows with n fixed.”)

What this tells you is that R really is different from F_q with respect to this problem; if Dvir’s proof “worked” over R, it would prove that a Kakeya set in R^n had positive measure, which is false.

So what’s the difference between R and F_q?  In my view, it’s that R has multiple scales, while F_q only has one.  Two elements in F_q are either the same or distinct, but there is nothing else going on metrically, while distinct real lines can be very close together or very far apart.  The interaction between distances at different scales is your constant companion when working on these problems in the real setting; so maybe it’s not so shocking that a one-scale field like F_q is not a perfect model for the phenomena we’re trying to study.

Which leads us to the ring F_q[[t]] — the “non-archimedean local ring” which Dummit and Hablicsek write about.  This ring is somehow “in between” finite fields and real numbers.  On the one hand, it is “profinite,” which is to say it is approximated by a sequence of larger and larger finite rings F_q[[t]]/t^k.  On the other hand, it has infinitely many scales, like R.  From the point of view of Kakeya sets, is it more like a finite field, or more like the real numbers?  In particular, does it have Kakeya sets of measure 0, making it potentially a good model for the real Kakeya problem?

This is the question Richard, Terry, and I asked, and Evan and Marci show that the answer is yes; they construct explicitly a Kakeya set in F_q[[t]]^2 with measure 0.

Now when we asked this question in our paper, I thought maybe you could do this by imitating Besicovitch’s argument in a straightforward way.  I did not succeed in doing this.  Evan and Marci tried too, and they told me that this just plain doesn’t work.  The construction they came up with is (at least as far as I can see) completely different from anything that makes sense over R.  And the way they prove measure 0 is extremely charming; they define a Markov process such for which the complement of their Kakeya set is the set of points that eventually hit 0, and then show by standard methods that their Markov process goes to 0 with probability 1!

Of course you ask:  does their Kakeya set have Minkowski dimension 2?  Yep — and indeed, they prove that any Kakeya set in F_q[[t]]^2 has Minkowski dimension 2, thus proving the Kakeya conjecture in this setting, up to the distinction between Hausdorff and Minkowski dimension.  (Experts should feel free to weigh in an tell me how much we should worry about this distinction.)  Note that dimension 2 is special:  the Kakeya conjecture in R^2 is known as well.  For every n > 2 we’re in the dark, over F_q[[t]] as well as over R.

To sum up:  what Dummit and Hablicsek prove makes me feel like the Kakeya problem over  F_q[[t]] is, at least potentially, a pretty good model for the Kakeya problem over R!  Not that we know how to solve the Kakeya problem over F_q[[t]]…..

## Gowers & co. on the cap set problem

On the off chance you read my blog and not Gowers’ — Tim is talking about one of my very favorite open questions, the affine cap set problem, over at his place.

I’m a little ambivalent about reading his posts — every time I think about this problem, I get sucked in and spend a certain amount of time gnawing at it.  And the sum total of all this gnawing has so far produced not even the tiniest toothmark on the gleaming surface of the affine cap set problem.

And yet…. and yet…..