Category Archives: math


The job fell to me of giving an overview talk about Benson Farb’s entire career at his birthday conference last fall.  Hard task!  I didn’t cover nearly everything but I think I gave a decent impression of what Farbisme is all about.

Update:  For WordPress reasons, you can’t watch the video within this page, but if you click through to Vimeo you can watch it!



Wanlin Li, “Vanishing of hyperelliptic L-functions at the central point”

My Ph.D. student Wanlin Li has posted her first paper!  And it’s very cool.  Here’s the idea.  If chi is a real quadratic Dirichlet character, there’s no reason the special value L(1/2,chi) should vanish; the functional equation doesn’t enforce it, there’s no group whose rank is supposed to be the order of vanishing, etc.  And there’s an old conjecture of Chowla which says the special value never vanishes, on the very useful principle that what needn’t happen doesn’t happen.

Alexandra Florea (last seen on the blog here) gave a great seminar here last year about quadratic L-functions over function fields, which gave Wanlin the idea of thinking about Chowla’s conjecture in that setting.  And something interesting developed — it turns out that Chowla’s conjecture is totally false!  OK, well, maybe not totally false.  Let’s put it this way.  If you count quadratic extensions of F_q(t) up to conductor N, Wanlin shows that at least c N^a of the corresponding L-functions vanish at the center of the critical strip.  The exponent a is either 1/2, 1/3, or 1/5, depending on q.  But it is never 1.  Which is to say that Wanlin’s theorem leaves open the possibility that only o(N) of the first N hyperelliptic L-functions vanish at the critical point.  In other words, a density form of Chowla’s conjecture over function fields might still be true — in fact, I’d guess it probably is.

The main idea is to use some algebraic geometry.  To say an L-function vanishes at 1/2 is to say some Frobenius eigenvalue which has to have absolute value q^{1/2} is actually equal to q^{1/2}.  In turn, this is telling you that the hyperelliptic curve over F_q whose L-function you’re studying has a map to some fixed elliptic curve.  Well, that’s something you can make happen by physically writing down equations!  Of course you also need a lower bound for the number of distinct quadratic extensions of F_q(t) that arise this way; this is the most delicate part.
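To see the flavor of the “physically writing down equations” step, here’s a toy Python sketch of my own (just in the spirit of the construction, not the precise one from Wanlin’s paper): if E is the curve y^2 = f(x), then for any polynomial g, the curve C: s^2 = f(g(t)) comes with an evident map to E, namely (t,s) goes to (g(t),s).  All the numbers below are made up for illustration.

```python
# Toy illustration: C: s^2 = f(g(t)) maps to E: y^2 = f(x) via (t,s) |-> (g(t), s).
# We count affine points of E and C over a small F_q using the quadratic character.
q = 101                                # a made-up odd prime
f = lambda x: (x**3 + 2 * x + 5) % q   # made-up cubic defining E
g = lambda t: (t**2 + 3 * t + 1) % q   # made-up nonconstant map g

def chi(a):
    """Quadratic character of F_q: 1 on squares, -1 on nonsquares, 0 at 0."""
    if a % q == 0:
        return 0
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1

# the number of solutions of s^2 = c over F_q is 1 + chi(c)
points_E = sum(1 + chi(f(x)) for x in range(q))
points_C = sum(1 + chi(f(g(t))) for t in range(q))
print(points_E, points_C)
```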

I think it’s very interesting to wonder what the truth of the matter is.  I hope I’ll be back in a few months to tell you what new things Wanlin has discovered about it!



How I won the talent show

So I don’t want to brag or anything but I won the neighborhood talent show tonight.  Look, here’s my trophy:

My act was “The Great Squarerootio.”  I drank beer and computed square roots in my head.  So I thought it might be fun to explain how to do this!  Old hat for my mathematician readers, but this is elementary enough for everyone.

Here’s how it works.  Somebody in the audience said “752.”  First of all, you need to know your squares.  I think I know them all up to 36^2 = 1296.  In this case, I can see that 752 is between 27^2 = 729 and 28^2 = 784, a little closer to 729.  So my first estimate is 27.  This is not too bad:  27^2 = 729 is 23 away from 752, so my error is 23.

Now here’s the rule that makes things go:

new estimate = old estimate + error/(2*old estimate).

In this case, the error is 23 and my old estimate is 27, so I should calculate 23/(2*27) = 23/54; well, 23 is just a bit less than 27, between 10 and 20% less, so 23/54 is a bit less than 27/54 = 1/2, but should be bigger than 0.4.

So our new estimate is “27.4 something.”

Actual answer: 27.4226…

Actual value of 27 + 23/54: 27.4259…

So I probably would have gotten pretty close in the hundredths place if I’d taken more care with estimating the fraction.  Of course, another way to get even closer is to take 27.4 as your “old estimate” and repeat the process!  But then I’d have to know the square of 27.4, which I don’t by heart, and computing it mentally would give me some trouble.
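If you’d rather watch the rule run than do it in your head, here’s a quick Python sketch (it cheats and uses a built-in square root just to pick the starting integer, the part I do from memorized squares):

```python
# The Squarerootio step: start from the nearest integer square root and
# apply new = old + error/(2*old), once or repeatedly.
def squarerootio(n, steps=1):
    old = round(n ** 0.5)  # nearest integer whose square you'd know by heart
    for _ in range(steps):
        error = n - old * old
        old = old + error / (2 * old)
    return old

print(squarerootio(752))     # 27.425925..., the "27 + 23/54" estimate
print(squarerootio(752, 2))  # 27.422619..., correct to about six digits
```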

Why does this work?  The two-word answer is “Newton’s method.”  But let me give a more elementary answer.

Suppose x is our original estimate, and n is the number we’re trying to find the square root of, and e is the error; that is, n = x^2 + e.

We want to know how much we should change x in order to get a better estimate.  So let’s say we modify x by d.  Then

(x+d)^2 = x^2 + 2xd + d^2

Now let’s say we’ve wisely chosen x to be the closest integer to the square root of n, so d should be pretty small, less than 1 at any rate; then d^2 is really small.  So small I will decide to ignore it, and estimate

(x+d)^2 = x^2 + 2xd.

What we want is

(x+d)^2 = n = x^2 +e.

For this to be the case, we need e = 2xd, which tells us we should take d = e/(2x); that’s our rule above!

Here’s another way to think about it.  We’re trying to compute 752^(1/2), right?  And 752 is 729+23.  So what is (729+23)^(1/2)?

If you remember the binomial theorem from Algebra 2, you might remember that it tells you how to compute (x+y)^n for any n.  Well, any positive whole number n, right?  That’s the thing!  No!  Any n!  It works for fractions too!  Your Algebra 2 teacher may have concealed this awesome fact from you!  I will not be so tight-lipped.

The binomial theorem says

(x+y)^n = x^n + {n \choose 1} x^{n-1} y + {n \choose 2} x^{n-2} y^2 + \ldots

Plugging in x=729,y=23,n=1/2 we get

(729+23)^{1/2} = 729^{1/2} + {1/2 \choose 1} 729^{-1/2} \cdot 23 + {1/2 \choose 2} 729^{-3/2} \cdot 23^2 + \ldots

Now 729^{1/2} we know; that’s 27.  What is the binomial coefficient 1/2 choose 1?  Is it the number of ways to choose 1 item out of 1/2 choices?  No, because that makes no sense.  Rather:  by “n choose 1” we just mean n, by “n choose 2” we just mean (1/2)n(n-1), etc.

So we get

(729+23)^{1/2} = 27 + (1/2) \cdot 23 / 27 + (-1/8) \cdot 23^2 / 27^3 + \ldots

And look, that second term is 23/54 again!  So the Great Squarerootio algorithm is really just “use the first two terms in the binomial expansion.”
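Here’s a quick check of those partial sums in Python (Fraction is just there to compute the generalized binomial coefficients exactly):

```python
# Partial sums of the binomial series for (729 + 23)^(1/2): term k is
# C(1/2, k) * 729^(1/2 - k) * 23^k = C(1/2, k) * 27^(1 - 2k) * 23^k.
from fractions import Fraction

def gen_binom(a, k):
    # generalized binomial coefficient a(a-1)...(a-k+1)/k!
    out = Fraction(1)
    for i in range(k):
        out = out * (a - i) / (i + 1)
    return out

a = Fraction(1, 2)
partial = 0.0
for k in range(4):
    partial += float(gen_binom(a, k)) * 27 ** (1 - 2 * k) * 23 ** k
    print(k, partial)
# 0 27.0
# 1 27.425925...   (the two-term Squarerootio estimate)
# 2 27.422566...
# 3 27.422619...   (sqrt(752) = 27.422618...)
```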

To sum up:  if you know your squares up to 1000, and you can estimate fractions reasonably fast, you can get a pretty good mental approximation to the square root of any three-digit number.  Even while drinking beer!  You might even be able to beat out cute kids in your neighborhood and win a big honking cup!



Trace test

Jose Rodriguez gave a great seminar here yesterday about his work on the trace test, a numerical way of identifying irreducible components of varieties.  In Jose’s world, you do a lot of work with homotopy; if a variety X intersects a linear subspace V in points p1, p2, …, pk, you can move V a little bit and numerically follow those k points around.  If you move V more than a little bit — say in a nice long path in the Grassmannian that loops around and returns to its starting point — you’ll come back to p1, p2, …, pk, but maybe in a different order.  In this way you can compute the monodromy of those points; if it’s transitive, and if you’re careful about avoiding some kind of discriminant locus, you’ve proven that p1, p2, …, pk are all on the same component of X.

But the trace test is another thing; it’s about short paths, not long paths.  For somebody like me, who never thinks about numerical methods, this means “oh we should work in the local ring.”  And then it says something interesting!  It comes down to this.  Suppose F(x,y) is a polynomial (not necessarily homogeneous) of degree at most d over a field k.  Hitting it with a linear transformation if need be, we can assume the x^d term is nonzero.  Now think of F as an element of k((y))[x]:  namely

F = x^d + a_1(y) x^{d-1} + \ldots + a_d(y).

Letting K be the algebraic closure of k((y)), we can then factor F as (x-r_1) … (x-r_d).  Each of these roots can be computed explicitly to any desired precision by Hensel’s lemma.  While the r_i may be power series in k[[y]] (or lie in some algebraic extension), the sum of the r_i is -a_1(y), which is a linear function A+by.

Suppose you are wondering whether F factors in k[x,y], and whether, for instance, r_1 and r_2 are the roots of an irreducible factor of F.  For that to be true, r_1 + r_2 must be a linear function of y!  (In Jose’s world, you grab a set of points, you homotopy them around, and observe that if they lie on an irreducible component, their centroid moves linearly as you translate the plane V.)

Anyway, you can falsify this easily; it’s enough for e.g. the quadratic term of r_1 + r_2 to be nonzero.  If you want to prove F is irreducible, you just check that every proper nonempty subset of the r_i sums to something nonlinear.
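Here’s a toy numerical version of that check in Python, for a made-up F whose factorization we know in advance; the Hensel/Newton lifting happens in series truncated mod y^3, which is all you need to see the quadratic terms:

```python
# Toy trace-test check for the made-up example
# F(x,y) = (x^2 - y - 1)(x + y + 2) = x^3 + (2+y)x^2 - (1+y)x - (2+3y+y^2).
# We lift the roots of F(x,0) to series in y mod y^3 by Newton/Hensel
# iteration, then look for proper nonempty subsets of roots whose sum has
# vanishing y^2 coefficient -- necessary for coming from a factor in k[x,y].
import itertools
import numpy as np

PREC = 3  # work modulo y^3; a series is a tuple (c0, c1, c2)

def s_add(a, b): return tuple(u + v for u, v in zip(a, b))
def s_sub(a, b): return tuple(u - v for u, v in zip(a, b))
def s_mul(a, b):
    return tuple(sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(PREC))
def s_inv(a):  # inverse of a series with nonzero constant term, mod y^3
    c0, c1, c2 = a
    return (1 / c0, -c1 / c0**2, (c1**2 / c0 - c2) / c0**2)

def s_eval(poly, r):  # Horner evaluation of sum_i poly[i] * x^i at the series x = r
    out = (0.0, 0.0, 0.0)
    for c in reversed(poly):
        out = s_add(s_mul(out, r), c)
    return out

F  = [(-2.0, -3.0, -1.0), (-1.0, -1.0, 0.0), (2.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
Fx = [(-1.0, -1.0, 0.0), (4.0, 2.0, 0.0), (3.0, 0.0, 0.0)]  # dF/dx

start = np.sort(np.roots([c[0] for c in reversed(F)]).real)  # roots of F(x,0): -2, -1, 1
roots = []
for r0 in start:
    r = (r0, 0.0, 0.0)
    for _ in range(2):  # each Newton step doubles the y-adic precision
        r = s_sub(r, s_mul(s_eval(F, r), s_inv(s_eval(Fx, r))))
    roots.append(r)

for k in range(1, len(roots)):
    for S in itertools.combinations(range(len(roots)), k):
        quad = sum(roots[i][2] for i in S)
        print(S, "quadratic term ~", round(quad, 8))
# Only S = (0,) (the root of x + y + 2) and S = (1, 2) (the two roots of
# x^2 - y - 1) come out with vanishing quadratic term, matching the factorization.
```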

  1.  Is this something I already know in another guise?
  2.  Is there a nice way to express the condition (which implies irreducibility) that no proper nonempty subset of the r_i sums to something with zero quadratic term?



Good math days

I have good math days and bad math days; we all do.  An outsider might think the good math days are the days when you have good ideas.  That’s not how it works, at least for me.  You have good ideas on the bad math days, too; but one at a time.  You have an idea, you try it, you make some progress, it doesn’t work, your mind says “too bad.”

On the good math days, you have an idea, you try it, it doesn’t work, you click over to the next idea, you get over the obstacle that was blocking you, then you’re stuck again, you ask your mind “What’s the next thing to do?” you get the next idea, you take another step, and you just keep going.

You don’t feel smarter on the good math days.  It’s not even momentum, exactly, because it’s not a feeling of speed.  More like:  the feeling of being a big, heavy, not very fast vehicle, with very large tires, that’s just going to keep on traveling, over a bump, across a ditch, through a river, continually and inexorably moving in a roughly fixed direction.



When random people give money to random other people

A post on Decision Science about a problem of Uri Wilensky‘s has been making the rounds:

Imagine a room full of 100 people with 100 dollars each. With every tick of the clock, every person with money gives a dollar to one randomly chosen other person. After some time progresses, how will the money be distributed?

People often expect the distribution to be close to uniform.  But this isn’t right; the simulations in the post show clearly that inequality of wealth rapidly appears and then persists (though each individual person bobs up and down from rich to poor.)  What’s going on?  Why would this utterly fair and random process generate winners and losers?
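You can watch the inequality develop in a few lines of Python.  This sketch runs the one-dollar-per-tick version of the process (the random walk analyzed below) rather than Wilensky’s everyone-gives-at-once version, but the qualitative behavior is the same sort of thing:

```python
# One dollar moves per tick, from a random person with money to a random
# other person.  Inequality shows up quickly and persists.
import random

N, TICKS = 100, 10**6
money = [100] * N

for _ in range(TICKS):
    i = random.randrange(N)
    while money[i] == 0:       # only people with money can give
        i = random.randrange(N)
    j = random.randrange(N)
    while j == i:              # ...to a randomly chosen *other* person
        j = random.randrange(N)
    money[i] -= 1
    money[j] += 1

money.sort()
print("poorest:", money[0], "median:", money[N // 2], "richest:", money[-1])
# typical run: poorest at or near $0, richest a few hundred dollars
```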

Here’s one way to think about it.  The possible states of the system are the tuples of nonnegative integers (m_1, …, m_100) summing to 10,000; if you like, the lattice points inside a simplex.  (From now on, let’s write N for 100 because who cares if it’s 100?)

The process is a random walk on a graph G, whose vertices are these states and where two vertices are connected if you can get from one to the other by taking a dollar from one person and giving it to another.  We are asking:  when you run the random walk for a long time, where are you on this graph?  Well, we know what the stationary distribution for random walk on an undirected graph is; it gives each vertex a probability proportional to its degree.  On a regular graph, you get uniform distribution.

Our state graph G isn’t regular, but it’s close to regular: the nodes of maximal degree N(N-1), namely the states in which nobody’s out of money (any of the N people can then give a dollar to any of the other N-1), make up a constant fraction, about 1/e, of all the nodes.  To see this, note that the number of states is

{N^2 + N - 1 \choose N-1}

and, of these, the ones with maximal degree are exactly those in which nobody’s out of money; giving each person a dollar first, the number of ways to distribute the remaining N^2 – N dollars is

{N^2 - 1 \choose N-1}

and so the proportion of states in which nobody’s out of money is about

\frac{(N^2 - 1)^N}{(N^2 + N - 1)^N} \sim (1-1/N)^N \sim 1/e.

So, apart from those states where somebody’s broke, in the long run every possible state is equally likely;  we are just as likely to see $9,901 in one person’s hands and everybody else with $1 as we are to see exact equidistribution again.

What is a random lattice point in this simplex like?  Good question!  An argument just like the one above shows that the probability nobody goes below $c is on the order of e^{-c}, at least when c is small relative to N; in other words, it’s highly likely that somebody’s very nearly out of money.

If X is the maximal amount of money held by any player, what’s the distribution of X?  I didn’t immediately see how to figure this out.  You might consider the continuous version, where you pick a point at random from the real simplex

\{ (x_1, \ldots, x_N) \in \mathbf{R}_{\geq 0}^N : \sum x_i = N^2 \}.

Equivalently: break a stick at N-1 randomly chosen points; what is the length of the longest piece?  This is a well-studied problem; the mean size of the longest piece is about N log N.  So I guess I think maybe that’s the expected value of the net worth of the richest player?
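Here’s a quick simulation of the stick-breaking version, to check the N log N order of magnitude (the precise mean is N times the harmonic number H_N, a bit bigger):

```python
# Break a stick of length N^2 at N-1 uniform random points, TRIALS times,
# and compare the average longest piece to N log N.
import math
import random

N, TRIALS = 100, 500
total = 0.0
for _ in range(TRIALS):
    cuts = sorted(random.uniform(0, N * N) for _ in range(N - 1))
    pieces = [b - a for a, b in zip([0.0] + cuts, cuts + [float(N * N)])]
    total += max(pieces)

print(total / TRIALS, "vs N log N =", N * math.log(N))
# empirical mean comes out around 510-520, same ballpark as N log N ~ 460
# (the exact mean is N * H_N, where H_N is the N-th harmonic number)
```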

But it’s not obvious to me whether you can safely approximate the finite problem by its continuous limit (which corresponds to keeping the number of players at N but shrinking the step size, so that each player gives another player a cent, or a picocent, or whatever.)

What happens if you give each of the N players just one dollar?  Now the uniformity really breaks down, because it’s incredibly unlikely that nobody’s broke.  The probability distribution on the set of (m_1, .. m_N) summing to N assigns each vector a probability proportional to the size of its support (i.e. the number of m_i that are nonzero.)  That must be a well-known distribution, right?  What does the corresponding distribution on partitions of N look like?
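This is easy to sanity-check in a tiny case.  With N = 3 players and 3 dollars there are only 10 states, and the long-run frequencies from a simulation match the support-size prediction:

```python
# With N = 3 and $3 total, the claim is that the long-run probability of a
# state (m_1, m_2, m_3) is proportional to its support size (# of m_i > 0).
import itertools
import random
from collections import Counter

N, TOTAL, TICKS = 3, 3, 10**6
states = [s for s in itertools.product(range(TOTAL + 1), repeat=N) if sum(s) == TOTAL]
Z = sum(sum(1 for m in s if m > 0) for s in states)  # normalizing constant

state = [1] * N
counts = Counter()
for _ in range(TICKS):
    i = random.choice([k for k in range(N) if state[k] > 0])  # a giver with money
    j = random.choice([k for k in range(N) if k != i])        # any other player
    state[i] -= 1
    state[j] += 1
    counts[tuple(state)] += 1

for s in states:
    support = sum(1 for m in s if m > 0)
    print(s, "observed", round(counts[s] / TICKS, 4), "predicted", round(support / Z, 4))
```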

Update:  Kenny Easwaran points out that this is basically the same computation physicists do when they compute the Boltzmann distribution, which was new to me.




Rational points on solvable curves over Q via non-abelian Chabauty (with Daniel Hast)

New paper up!  With my Ph.D. student Daniel Hast (last seen on the blog here.)

We prove that hyperelliptic curves over Q of genus at least 2 have only finitely many rational points.  Actually, we prove this for a more general class of high-genus curves over Q, including all solvable covers of P^1.

But wait, don’t we already know that, by Faltings?  Of course we do.  So the point of the paper is to show that you can get this finiteness in a different way, via the non-abelian Chabauty method pioneered by Kim.  And it seems possible in principle to get Faltings for all curves over Q this way, though I don’t know how to do it.

Continue reading


Multiple height zeta functions?

Idle speculation ensues.

Let X be a projective variety over a global field K, which is Fano — that is, its anticanonical bundle is ample.  Then we expect, and in lots of cases know, that X has lots of rational points over K.  We can put these points together into a height zeta function

\zeta_X(s) = \sum_{x \in X(K)} H(x)^{-s}

where H(x) is the height of x with respect to the given projective embedding.  The height zeta function organizes information about the distribution of the rational points of X, and in favorable circumstances (e.g. if X is a homogeneous space) it has the handsome analytic properties we have come to expect from something called a zeta function.  (Nice survey by Chambert-Loir.)
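For concreteness, the simplest example: X = P^1 over Q, with H(a:b) = max(|a|,|b|) for coprime integers a, b.  Schanuel’s theorem (in its simplest case) says the number of points of height at most B is asymptotic to (12/pi^2) B^2, which is what gives \zeta_X(s) its pole at s = 2.  A quick numerical check in Python:

```python
# Counting points of P^1(Q) by height: x = (a : b) with gcd(a,b) = 1,
# normalized so b > 0, plus the point (1 : 0); H(x) = max(|a|, |b|).
# Schanuel predicts #{x : H(x) <= B} ~ (12 / pi^2) B^2.
import math

def count_points(B):
    total = 0
    for a in range(-B, B + 1):
        for b in range(B + 1):
            if b == 0:
                total += (a == 1)        # the single point at infinity (1 : 0)
            elif math.gcd(a, b) == 1:
                total += 1
    return total

for B in [10, 100, 300]:
    print(B, count_points(B), round(12 / math.pi**2 * B**2))
```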

What if X is a variety with two (or more) natural ample line bundles, e.g. a variety that sits inside P^m x P^n?  Then there are two natural height functions H_1 and H_2 on X(K), and we can form a “multiple height zeta function”

\zeta_X(s,t) = \sum_{x \in X(K)} H_1(x)^{-s} H_2(x)^{-t}

There is a whole story of “multiple Dirichlet series” which studies functions like

\sum_{m,n} (\frac{m}{n}) m^{-s} n^{-t}

where (\frac{m}{n}) denotes the Legendre symbol.  These often have interesting analytic properties that you wouldn’t see if you fixed one variable and let the other move; for instance, they sometimes have finite groups of functional equations that commingle the s and the t!

So I just wonder:  are there situations where the multiple height zeta function is an “analytically interesting” multiple Dirichlet series?

Here’s a case to consider:  what if X is the subvariety of P^2 x P^2 cut out by the equation

x_0 y_0 + x_1 y_1 + x_2 y_2 = 0?

This has something to do with Eisenstein series on GL_3 but I am a bit confused about what exactly to say.


What is the median length of homeownership?

Well, it’s longer than it used to be, per Conor Dougherty in the New York Times:

The median length of time people have owned their homes rose to 8.7 years in 2016, more than double what it had been 10 years earlier.

The accompanying chart shows that “median length of homeownership” used to hover at  just under 4 years.  That startled me!  Doesn’t 4 years seem like a pretty short length of time to own a house?

When I thought about this a little more, I realized I had no idea what this meant.  What is the “median length of homeownership” in 2017?  Does it mean you go around asking each owner-occupant how long they’ve lived in their house, and take the median of those numbers?  Probably not:  when people were asked that in 2008, the median answer was 10 years, and whatever the Times was measuring was about 3.7 years in 2008.

Does it mean you look at all house sales in 2017, subtract the time since last sale, and take the median of those numbers?

Suppose half of all houses changed hands every year, and the other half changed hands every thirty years.  Are the lengths of ownership we’re medianning half “one year” and half “thirty years”, or 30/31 “one year” and 1/31 “thirty years”?
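Here’s that toy model in code, with 1000 houses of each type (numbers invented for illustration); the per-house median and the per-sale median come out wildly different:

```python
# 1000 houses that turn over every year and 1000 that turn over every 30
# years: median per house vs. median per sale.
from statistics import median

per_house = [1] * 1000 + [30] * 1000   # one ownership length per house
# in a given year the fast houses generate ~1000 sales, the slow ones ~33,
# so per-sale lengths are weighted about 30 : 1
per_sale = [1] * 1000 + [30] * 33

print(median(per_house))  # 15.5 -- halfway between the two types
print(median(per_sale))   # 1    -- dominated by the fast-turnover houses
```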

There are about 75 million owner-occupied housing units in the US and 4-6 million homes sold per year, so the mean number of sales per unit per year is certainly way less than 1/4; of course, there’s no reason this mean should be close to the median of, well, whatever we’re taking the median of.

Basically I have no idea what’s being measured.  The Times doesn’t link to the Moody’s Analytics study it’s citing, and Dougherty says that study’s not public.  I did some Googling for “median length of homeownership” and as far as I can tell this isn’t a standard term of art with a consensus definition.

As papers run more data-heavy pieces I’d love to see a norm develop that there should be some way for the interested reader to figure out exactly what the numbers in the piece refer to.  Doesn’t even have to be in the main text.  Could be a linked sidebar.  I know not everybody cares about this stuff.  But I do!





Fox-Neuwirth-Fuks cells, quantum shuffle algebras, and Malle’s conjecture for function fields

I’ve gotten behind on blogging about preprints!  Let me tell you about a new one I’m really happy with, joint with TriThang Tran and Craig Westerland, which we posted a few months ago.

Malle’s conjecture concerns the number of number fields with fixed Galois group and bounded discriminant, a question I’ve been interested in for many years now.  We recall how it goes.

Let K be a global field — that is, a number field or the function field of a curve over a finite field.  Any degree-n extension L/K (here L could be a field or just an etale algebra over K — hold that thought) gives you a homomorphism from Gal(K) to S_n, whose image we call, in a slight abuse of notation, the Galois group of L/K.

Let G be a transitive subgroup of S_n, and let N(G,K,X) be the number of degree-n extensions of K whose Galois group is G and whose discriminant has norm at most X.  Every permutation g in G has an index, which is just n minus the number of orbits of g.  So the permutations of index 1 are the transpositions, those of index 2 are the three-cycles and the double-flips, etc.  We denote by a(G) the reciprocal of the minimal index of any non-identity element of G.  In particular, a(G) is at most 1, and is equal to 1 if and only if G contains a transposition.

(Wait, doesn’t a transitive subgroup of S_n with a transposition have to be the whole group?  No, that’s only for primitive permutation groups.  D_4 is a thing!)
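Here’s a quick computational sanity check of these definitions, a sketch using sympy’s permutation groups (with each group in its natural degree-n action):

```python
# The index of g is n minus the number of orbits (= cycles, counting fixed
# points) of g; a(G) is 1 / (minimal index of a non-identity element of G).
from fractions import Fraction
from sympy.combinatorics.named_groups import DihedralGroup, SymmetricGroup

def index(g):
    return g.size - g.cycles  # n minus the number of orbits of g

def a(G):
    return Fraction(1, min(index(g) for g in G.elements if not g.is_Identity))

D4 = DihedralGroup(4)  # degree-4 action on the vertices of a square
print(D4.is_transitive(), D4.order(), a(D4))
# True 8 1 -- D_4 is transitive, much smaller than S_4, and the reflections
# across the diagonals are transpositions, so a(D_4) = 1
print(a(SymmetricGroup(4)))  # 1
```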

Malle’s conjecture says that, for every \epsilon > 0, there are constants c,c_\epsilon such that

c X^{a(G)} < N(G,K,X) < c_\epsilon X^{a(G)+\epsilon}

We don’t know much about this.  It’s easy for G = S_2.  A theorem of Davenport-Heilbronn (K=Q) and Datskovsky-Wright (general case) proves it for G = S_3.  Results of Bhargava handle S_4 and S_5, and Wright proved it for abelian G.  I kind of think this new theorem of Alex Smith implies it for K=Q and every dihedral G of 2-power order?  Anyway:  we don’t know much.  S_6?  No idea.  The best upper bounds for general n are still the ones I proved with Venkatesh a long time ago, and are very much weaker than what Malle predicts.

Malle’s conjecture fans will point out that this is only the weak form of Malle’s conjecture; the strong form doesn’t settle for an unspecified X^\epsilon, but specifies an asymptotic X^a (log X)^b.   This conjecture has the slight defect that it’s wrong sometimes; my student Seyfi Turkelli wrote a nice paper which I think resolves this problem, but the revised version of the conjecture is a bit messy to state.

Anyway, here’s the new theorem:

Theorem (E-Tran-Westerland):  Let G be a transitive subgroup of S_n.  Then for all q sufficiently large relative to G, there is a constant c_\epsilon such that

N(G,\mathbf{F}_q(t),X) < c_\epsilon X^{a(G)+\epsilon}

for all X>0.

In other words:

The upper bound in the weak Malle conjecture is true for rational function fields.

A few comments.

  1.  We are still trying to fix the mistake in our 2012 paper about stable cohomology of Hurwitz spaces.  Craig and I discussed what seemed like a promising strategy for this in the summer of 2015.  It didn’t work.  That result is still unproved.  But the strategy developed into this paper, which proves a different and in some respects stronger theorem!  So … keep trying to fix your mistakes, I guess?  There might be payoffs you don’t expect.
  2. We can actually do a bit better: the X^\epsilon can be replaced by a power of log X, though not the power predicted by Malle.
  3. Is there any chance of getting the strong Malle conjecture?  No, and I’ll explain why.  Consider the case G=S_4.  Then a(G) = 1, and in this case the strong form of Malle’s conjecture predicts that N(S_4,K,X) is on order X, not just X^{1+eps}.   But our method doesn’t really distinguish between quartic fields and other kinds of quartic etale algebras.  So it’s going to count all algebras L_1 x L_2, where L_1 and L_2 are quadratic fields with discriminants X_1 and X_2 respectively, with X_1 X_2 < X.  We already know there’s approximately one quadratic field per discriminant, on average, so the number of such algebras is about the number of pairs (X_1, X_2) with X_1 X_2 < X, which is about X log X (see the quick numerical check after this list).  So there’s no way around it:  our method is only going to touch weak Malle.  Note, by the way, that for quartic extensions, the strong Malle conjecture was proved by Bhargava, and he observes the same phenomenon:

    …inherent in the zeta function is a sum over all etale extensions of Q, including the “reducible” extensions that correspond to direct sums of quadratic extensions. These reducible quartic extensions far outnumber the irreducible ones; indeed, the number of reducible quartic extensions of absolute discriminant at most X is asymptotic to X log X, while we show that the number of quartic field extensions of absolute discriminant at most X is only O(X).

  4.  I think there is, on the other hand, a chance of getting rid of the “q sufficiently large relative to G” condition and proving something for a fixed F_q(t) and all finite groups G.
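As promised in item 3 above, here’s a quick numerical look at that count of pairs (X_1, X_2) of positive integers with X_1 X_2 at most X:

```python
# #{(X1, X2) positive integers : X1 * X2 <= X} = sum over a <= X of floor(X / a),
# which grows like X log X (the start of the Dirichlet divisor problem).
import math

for X in [10**3, 10**4, 10**5, 10**6]:
    pairs = sum(X // a for a in range(1, X + 1))
    print(X, pairs, round(X * math.log(X)))
# the ratio tends to 1 slowly; there's a lower-order term of size about 0.15 * X
```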


OK, so how did we prove this?

Continue reading
