Category Archives: math

The Coin Game, II

Good answers to the last question! I think I perhaps put my thumb on the scale too much by naming a variable p.

Let me try another version in the form of a dialogue.

ME: Hey in that other room somebody flipped a fair coin. What would you say is the probability that it fell heads?

YOU: I would say it is 1/2.

ME: Now I’m going to give you some more information about the coin. A confederate of mine made a prediction about whether the coin would fall heads or tails and he was correct. Now what would you say is the probability that it fell heads?

YOU: Now I have no idea, because I have no information about the propensity of your confederate to predict heads.

(Update: What if what you knew about the coin in advance was that it fell heads 99.99% of the time? Would you still be at ease saying you end up with no knowledge at all about the probability that the coin fell heads?) This is in fact what Joyce thinks you should say. White disagrees. But I think they both agree that it feels weird to say this, whether or not it’s correct.

Why would it not feel weird? I think Qiaochu’s comment in the previous thread gives a clue. He writes:

Re: the update, no, I don’t think that’s strange. You gave me some weird information and I conditioned on it. Conditioning on things changes my subjective probabilities, and conditioning on weird things changes my subjective probabilities in weird ways.

In other words, he takes it for granted that what you are supposed to do is condition on new information. Which is obviously what you should do in any context where you’re dealing with mathematical probability satisfying the usual axioms. Are we in such a context here? I certainly don’t mean “you have no information about Coin 2” to mean “Coin 2 falls heads with probability p where p is drawn from the uniform distribution (or Jeffreys, or any other specified distribution, thanks Ben W.) on [0,1]” — if I meant that, there could be no controversy!
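If we do agree to work inside ordinary probability, the conditioning in the dialogue can be made explicit. Here is a minimal sketch under an assumption the dialogue deliberately withholds: the confederate predicts heads with some fixed propensity q, independently of the flip. The names `posterior_heads` and `q` are hypothetical, introduced only for illustration.

```python
from fractions import Fraction

def posterior_heads(p, q):
    """P(coin fell heads | the confederate's prediction was correct).

    p: prior probability that the coin falls heads.
    q: assumed propensity of the confederate to predict heads,
       independent of the flip (a modeling assumption, not from the post).
    """
    correct_and_heads = p * q              # coin heads, prediction heads
    correct_and_tails = (1 - p) * (1 - q)  # coin tails, prediction tails
    return correct_and_heads / (correct_and_heads + correct_and_tails)

fair = Fraction(1, 2)
biased = Fraction(9999, 10000)
for q in (Fraction(1, 10000), Fraction(1, 2), Fraction(9, 10)):
    # Compare the fair coin with the 99.99% coin for several propensities.
    print(q, posterior_heads(fair, q), posterior_heads(biased, q))
```

With a fair coin the posterior is exactly q, so not knowing q leaves you with no number at all. With the 99.99% coin, a confederate who almost never predicts heads (q = 1/10000) pulls the posterior all the way down to 1/2. None of this settles whether an unknown q should be modeled as a number in the first place, which is the point at issue.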

I think as mathematicians we are very used to thinking that probability as we know it is what we mean when we talk about uncertainty. Or, to the extent we think we’re talking about something other than probability, we are wrong to think so. Lots of philosophers take this view. I’m not sure it’s wrong. But I’m also not sure it’s right. And whether it’s wrong or right, I think it’s kind of weird.


The coin game

Here is a puzzling example due to Roger White.

There are two coins.  Coin 1 you know is fair.  Coin 2 you know nothing about; it falls heads with some probability p, but you have no information about what p is.

Both coins are flipped by an experimenter in another room, who tells you that the two coins agreed (i.e. both were heads or both tails.)

What do you now know about Pr(Coin 1 landed heads) and Pr(Coin 2 landed heads)?

(Note:  as is usual in analytic philosophy, whether or not this is puzzling is itself somewhat controversial, but I think it’s puzzling!)

Update: Lots of people seem to not find this at all puzzling, so let me add this. If your answer is “I know nothing about the probability that coin 1 landed heads, it’s some unknown quantity p that agrees with the unknown parameter governing coin 2,” you should ask yourself: is it strange that someone flipped a fair coin in another room and you don’t know what the probability is that it landed heads?
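For readers who want to see where the “unknown quantity p” answer comes from, here is a small simulation sketch. It assumes p is a fixed number that we simply plug in, which is exactly the assumption the puzzle puts under pressure. The function name and its parameters are hypothetical.

```python
import random

def heads_rate_given_agreement(p, trials=200_000, seed=0):
    """Estimate P(coin 1 heads | the two coins agree) by simulation.

    Coin 1 is fair; coin 2 falls heads with the supplied probability p.
    """
    rng = random.Random(seed)
    agree = 0
    agree_and_heads = 0
    for _ in range(trials):
        coin1 = rng.random() < 0.5
        coin2 = rng.random() < p
        if coin1 == coin2:
            agree += 1
            agree_and_heads += coin1  # True counts as 1
    return agree_and_heads / agree

for p in (0.1, 0.5, 0.9):
    print(p, round(heads_rate_given_agreement(p), 3))
```

The estimates track p, matching the exact computation (p/2) / (p/2 + (1-p)/2) = p. So once you learn the coins agreed, your probability for the fair coin inherits whatever you believe about coin 2.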

Relevant readings: section 3.1 of the Stanford Encyclopedia of Philosophy article on imprecise probabilities and Joyce’s paper on imprecise credences, pp.13-14.


Pila on a “modular Fermat equation”

I like this paper by Pila that just went up on the arXiv, which shows the way that you can get Diophantine consequences from the rapid progress being made in theorems of Andre-Oort type.  (I also want to blog about Tsimerman + Zhang + Yuan on “average Colmez” and Andre-Oort, maybe later!)

Pila shows that if N and M are sufficiently large primes, you can’t have elliptic curves E_1/Q and E_2/Q such that E_1 has an N-isogenous curve E_1 -> E’_1, E_2 has an M-isogenous curve E_2 -> E’_2, and j(E’_1) + j(E’_2) = 1.  (It seems to me the proof uses little about this particular algebraic relation and would work just as well for any f(j(E’_1),j(E’_2)) whose vanishing didn’t cut out a modular curve in X(1) x X(1).)  (This is “Fermat-like” in that it asserts finiteness of rational points on a natural countable family of high-genus curves; a more precise analogy is explained in the paper.)

How this works, loosely:  suppose you have such an (E_1, E_2).  A theorem of Kühne guarantees that E_1 and E_2 are not both CM (I didn’t know this!) It follows (WLOG assume N > M) that the N-isogenies of E_1 are defined over a field of degree at least N^a for some small a (Pila uses more precise bounds coming from a recent paper of Najman.)  So the Galois conjugates of (E’_1, E’_2) give you a whole bunch of algebraic points (E”_1, E”_2) with j(E”_1) + j(E”_2) = 1.

So what?  Rational curves have lots of low-height algebraic points.  But here’s the thing.  These isogenous choices of (E’_1, E’_2) aren’t just any algebraic points on X(1) x X(1); they represent pairs of elliptic curves drawn from a fixed pair of isogeny classes.  Let H be the hyperbolic plane as usual, and write (z,w) for a point on H x H corresponding to (E’_1, E’_2).  Then the other choices (E”_1, E”_2) correspond to points (gz,hw) with g,h in GL_2(Q).  GL_2(Q), not GL_2(R)!  That’s what we get from working in a fixed isogeny class.  And these points satisfy

j(gz) + j(hw) = 1.

To sum up:  you have a whole bunch of rational points (g,h) on GL_2 x GL_2.  These points are pretty low height (for this Pila gestures at a paper of his with Habegger.)  And they lie on the surface j(gz) + j(hw) = 1.  But this surface is a totally non-algebraic thing, because remember, j is a transcendental function on H!  So (Pila’s version of) the Ax-Lindemann theorem (correction from comments:  the Pila-Wilkie theorem) generates a contradiction; a transcendental curve can’t have too many low-height rational points.


Configuration spaces of manifolds with flows (with John Wiltshire-Gordon)

New preprint up on the arXiv:  “Algebraic structures on cohomology of configuration spaces of manifolds with flows,” a short paper joint with John Wiltshire-Gordon.

John is a student at Michigan, finishing his Ph.D. this year under David Speyer, and he’s been thinking about stuff related to FI-modules ever since his undergrad days at Chicago hanging out with Benson Farb.

But this paper isn’t actually about FI-modules!  Let me explain.  Here’s the motivating question.  When M is a manifold, and S a finite set, we denote by PConf^S M the pure configuration space of M, i.e. the space of injections from S to M.  If S is the set {1,…,n} we write PConf^n M for short.

Question:  Let M be a manifold.  What natural algebraic structure is carried by the cohomology groups H^i(PConf^n M,Z)?

Here’s one structure.  If f: S \rightarrow T is an injection, composition yields a map from PConf^T M to PConf^S M, which in turn yields a map from H^i(PConf^S M, Z) to  H^i(PConf^T M, Z).  In other words,

H^i(\mbox{PConf}^\bullet M, \mathbf{Z})

is a functor from the category of finite sets with injections to the category of abelian groups (or of k-vector spaces, if we take coefficients in a field k instead).  Such a functor is called an FI-module over Z (respectively, over k).  A big chunk of my paper with Benson Farb and Tom Church is devoted to figuring out what consequences this structure has for the Betti numbers, and it was by these means that Tom first proved that the unordered configuration spaces have stable cohomology with rational coefficients.  (This is actually false with integral coefficients, or when the coefficient field has characteristic p, but see the beautiful theorem of Rohit Nagpal for the story about what happens in the latter case.  How have I not blogged about that already?)

So it turns out that H_i(PConf M) is a finitely generated FI-module (the definition is what you expect) and this implies that the Betti number h^i(PConf^n M) agrees with some polynomial P_i(n) for all sufficiently large n.  For example, H_1(PConf^n S^2) has dimension


for n >= 3, but not for n=0,1,2.

If you know a little more about the manifold, you can do better.  For instance, if M has a boundary component, the Betti number agrees with P_i(n) for all n.  Why?  Because there’s more algebraic structure.  You can map from PConf^T to PConf^S, above, by “forgetting” points, but you can also add points in some predetermined contractible neighborhood of the boundary.  The operation of sticking on a point * gives you a map from PConf^S to PConf^{S union *}.  (Careful, though — if you want these maps to compose nicely, you have to say all this a little more carefully, and you really only want to think of these maps as defined up to homotopy; perfectly safe as long as we’re only keeping track of the induced maps on H^i.)
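A concrete case of “polynomial for all n” can be checked by hand or by machine. The sketch below assumes a classical fact that is not stated in this post: by Arnol’d’s computation, the Poincaré polynomial of PConf^n of the plane (which has the same homotopy type as PConf^n of the closed disk, a manifold with boundary) is the product of (1 + j t) for j = 1, …, n-1. The function names are hypothetical.

```python
def poincare_coeffs(n):
    """Coefficients [b_0, b_1, ...] of prod_{j=1}^{n-1} (1 + j*t).

    Assumed (Arnol'd) to be the Betti numbers of PConf^n of the plane.
    """
    coeffs = [1]
    for j in range(1, n):
        nxt = coeffs + [0]
        for i, c in enumerate(coeffs):
            nxt[i + 1] += j * c  # multiply the running product by (1 + j*t)
        coeffs = nxt
    return coeffs

def betti(i, n):
    """i-th Betti number of PConf^n of the plane under the assumption above."""
    coeffs = poincare_coeffs(n)
    return coeffs[i] if i < len(coeffs) else 0

for n in range(0, 8):
    # h^1 is n(n-1)/2 and h^2 is n(n-1)(n-2)(3n-1)/24, with no exceptions at small n.
    assert betti(1, n) == n * (n - 1) // 2
    assert 24 * betti(2, n) == n * (n - 1) * (n - 2) * (3 * n - 1)
    print(n, poincare_coeffs(n))
```

Both polynomial formulas hold already at n = 0, 1, 2, which is the “on the nose” behavior described above for manifolds with boundary.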

We thought we had a pretty nice story:  closed manifolds have configuration spaces with eventually polynomial Betti numbers, manifolds with boundary have configuration spaces with polynomial Betti numbers on the nose.  But in practice, it seems that configuration spaces sometimes have more stability than our results guaranteed!  For instance, H_1(PConf^n S^3) has dimension


for all n>0.  And in fact EVERY Betti number of the pure configuration space of S^3 agrees with a polynomial P_i(n) for all n > 0; the results of CEF guarantee only that h^i agrees with a polynomial once n > i.

What’s going on?

In the new paper, John and I write about a different way to get “point-adding maps” on configuration space.  If your M has the good taste to have an everywhere non-vanishing vector field, you can take any one of your marked points x in M and “split it” into two points y and y’, each very near x along the flowline of the vector field, one on either side of x.  Now once again we can both add and subtract points, as in the case of open manifolds, and again this supplies the configuration spaces with a richer structure.  In fact (exercise!) H_i(PConf^n M) now carries an action of the category of noncommutative finite sets:  objects are finite sets, morphisms are set maps endowed with an ordering of each fiber.

And fortunately, John already knew a lot about the representation theory of this category and categories like it!  In particular, it follows almost immediately that, when M is a closed manifold with a vector field (like S^3) the Betti number h^i(PConf^n M) agrees with some polynomial P_i(n) for all n > 0.  (For fans of character polynomials, the character polynomial version of this holds too, for cohomology with rational coefficients.)

That’s the main idea, but there’s more stuff in the paper, including a very beautiful picture that John made which explains how to answer the question “what structure is carried by the cohomology of pure configuration space of M when M has k nonvanishing vector fields?”  The answer is FI for k=0, the category of noncommutative finite sets for k=1, and the usual category of finite sets for k > 1.
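To get a feel for how different the three categories are, one can count morphisms between small sets by brute force. This is only a sketch: the function names are hypothetical, and the closed form for the middle case (a rising factorial) is a standard count that is not taken from the paper.

```python
from itertools import product
from math import factorial, prod

def count_injections(n, m):
    """Morphisms from an n-set to an m-set in FI: injective maps."""
    return sum(1 for f in product(range(m), repeat=n) if len(set(f)) == n)

def count_ordered_fiber_maps(n, m):
    """Morphisms in noncommutative finite sets:
    a set map together with a total ordering of each fiber."""
    total = 0
    for f in product(range(m), repeat=n):
        fiber_sizes = [f.count(t) for t in range(m)]
        total += prod(factorial(s) for s in fiber_sizes)  # orderings of each fiber
    return total

def count_all_maps(n, m):
    """Morphisms in the usual category of finite sets."""
    return m ** n

for n, m in [(2, 2), (2, 3), (3, 2)]:
    rising = prod(m + i for i in range(n))  # m (m+1) ... (m+n-1)
    assert count_ordered_fiber_maps(n, m) == rising
    print((n, m), count_injections(n, m), count_ordered_fiber_maps(n, m), count_all_maps(n, m))
```

Only the k=0 and k=1 counts depend on extra data (injectivity, or orderings of fibers); for k > 1 that decoration disappears and you are left with plain set maps.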


The adventures of Terry Tao in the 21st century

Great New York Times profile of Terry Tao by Gareth Cook, an old friend of mine from Boston Phoenix days.

I’ve got a quote in there:

“Terry is what a great 21st-century mathematician looks like,” Jordan Ellenberg, a mathematician at the University of Wisconsin, Madison, who has collaborated with Tao, told me. He is “part of a network, always communicating, always connecting what he is doing with what other people are doing.”

I thought it would be good to say something about the context in which I told Gareth this.  I was explaining how happy I was he was profiling Terry, because Terry is at the same time extraordinary and quite typical as a mathematician.  Outlier stories, like those of Nash, and Perelman, and more recently Mochizuki, get a lot of space in the general press.  And they’re important stories.  But they’re stories because they’re so unrepresentative of the main stream of mathematical work.  Lone bearded men working in secret, pitched battles over correctness and priority, madness, etc.  Not a big part of our actual lives.

Terry’s story, on the other hand, is what new, deep, amazing math actually usually looks like.  Many minds cooperating, enabled by new technology.  Blogging, traveling, talking, sharing.  That’s the math world I know.  I’m happy as hell to see it in the New York Times.



Alexandra Florea on the average central value of hyperelliptic L-functions

Alexandra Florea, a student of Soundararajan, has a nice new paper up, which I heard about in a talk by Michael Rubinstein.  She computes the average of

L(1/2, \chi_f)

as f ranges over squarefree polynomials of large degree.  If this were the value at 1 instead of the value at 1/2, this would be asking for the average number of points on the Jacobian of a hyperelliptic curve, and I could at least have some idea of where to start (probably with this paper of Erman and Wood.)  And I guess you could probably get a good grasp on moments by imitating Granville-Soundararajan?

But I came here to talk about Florea’s result.  What’s cool about it is that it has a main term that matches existing conjectures in the number field case, but there is a second main term, whose size is about the cube root of the main term, before you get to fluctuations!

The only similar case I know is Roberts’ conjecture, now a theorem of Bhargava-Shankar-Tsimerman and Thorne-Taniguchi, which finds a similar secondary main term in the asymptotic for counting cubic fields.  And when I say similar I really mean similar — e.g. in both cases the coefficient of the secondary term is some messy thing involving zeta functions evaluated at third-integers.

My student Yongqiang Zhao found a lovely geometric interpretation for the secondary term in the Roberts conjecture.  Is there some way to see what Florea’s secondary term “means” geometrically?  Of course I’m stymied here by the fact that I don’t really know how to think about her counting problem geometrically in the first place.



Cold Topics Workshop

I was in Berkeley the other day, chatting with David Eisenbud about an upcoming Hot Topics workshop at MSRI, and it made me wonder:  why don’t we have Cold Topics workshops?  In the sense of “cold cases.”  There are problems that the community has kind of drifted away from, because we don’t really know how to do them, but which are as authentically interesting as they ever were.  Maybe it would be good to programmatically focus our attention on those cold topics from time to time, just to see whether the passage of time has given us any new ideas, or cast these cold old problems in a new and useful light.

If this idea catches on, we could even consider having an NSF center devoted to these problems.  The Institute for Unpopular Mathematics!

What cold topics workshops would you propose to me, the founding director of the IUM?


Idle question: are Kakeya sets winning?

Jayadev Athreya was here last week and reminded me about this notion of “winning sets,” which I learned about from Howie Masur — originally, one of the many contributions of Wolfgang Schmidt.

Here’s a paper by Curt McMullen introducing a somewhat stronger notion, “absolute winning.”

Anyway:  a winning set (or an absolute winning set) in R^n is “big” in some sense.  In particular, it has to have full Hausdorff dimension, but it doesn’t have to have positive measure.

Kakeya sets (subsets of R^n containing a unit line segment in every direction) can have measure zero, by the Besicovitch construction, and are conjectured (when n=2, known) to have Hausdorff dimension n.  So should we expect these sets to be winning?  Are Besicovitch sets winning?

I have no reason to need to know.  I just think these refined classifications of sets which are measure 0 yet still “large” are very interesting.  And for all I know, maybe there are sets where the easiest way to prove they have full Hausdorff dimension is to prove they’re winning!




Shin-Strenner: Pseudo-Anosov mapping classes not arising from Penner’s construction

Balazs Strenner, a Ph.D. student of Richard Kent graduating this year, gave a beautiful talk yesterday in our geometry/topology seminar about his recent paper with Hyunshik Shin.  (He’s at the Institute next year but if you’re looking for a postdoc after that…!)

A long time ago, Robert Penner showed how to produce a whole semigroup M in the mapping class group with the property that all but a specified finite list of elements of M were pseudo-Anosov.  So that’s a good cheap way to generate lots of certified pseudo-Anosovs in the mapping class group.  But of course one asks:  do you get all pA’s as part of some Penner semigroup?  This can’t quite be true, because it turns out that the Penner elements can’t permute singularities of the invariant foliation, while arbitrary pA’s can.  But there are only finitely many singularities, so some power of a given pA clearly fixes the singularities.

So does every pA have a power that arises from Penner’s construction?  This is what’s known as Penner’s conjecture.  Or was, because Balazs and Hyunshik have shown that it is falsitty false false false.

When I heard the statement I assumed this was going to be some kind of nonconstructive counting argument — but no, they actually give a way of proving explicitly that a given pA is not in a Penner semigroup.  Here’s how.  Penner’s semigroup M is generated by Dehn twists Q_1, … Q_m, which all happen to preserve a common traintrack, so that there’s actually a representation

\rho: M \rightarrow GL_n(\mathbf{R})

such that the dilatation of g is the Perron-Frobenius eigenvalue \lambda of \rho(g).

Now here’s the key observation:  there is a quadratic form F on R^n such that F(Q_i x) >= F(x) for all x, with equality only when x is a fixed point of Q_i.  In particular, this shows that if g is an element of M not of the form Q_i^a, and x is an arbitrary vector, then the sequence

x, g x, g^2 x, \ldots

can’t have a subsequence converging to x, since

F(x), F(gx), F(g^2 x), \ldots

is monotone increasing and thus can’t have a subsequence converging to F(x).

This implies in particular:

g cannot have any eigenvalues on the unit circle.

But now we win!  Because \rho(g) is an integral matrix, so all the Galois conjugates of \lambda must be among its eigenvalues.  In other words, \lambda is an algebraic number none of whose Galois conjugates lie on the unit circle.  But there are lots of pseudo-Anosovs whose dilatations \lambda do have Galois conjugates on the unit circle.  In fact, experiments by Dunfield and Tiozzo seem to show that in a random walk on the braid group, the vast majority of pAs have this property!  And these pAs, which Shin and Strenner call coronal, cannot appear in any Penner semigroup.
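The monotonicity argument is easy to watch in a toy case. The sketch below is not the quadratic form from the Shin-Strenner paper; it is a hypothetical 2x2 model, with two unipotent matrices standing in for the generators Q_i and F(x) = x_1 x_2 standing in for the quadratic form, chosen because everything can be verified by hand.

```python
import cmath
import random

# Toy generators (an assumption for illustration, not Penner's actual matrices).
Q1 = ((1, 1), (0, 1))
Q2 = ((1, 0), (1, 1))

def apply(m, x):
    """Apply a 2x2 matrix to a vector."""
    return (m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1])

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def F(x):
    """Toy quadratic form: F(Q1 x) = F(x) + x_2^2 and F(Q2 x) = F(x) + x_1^2."""
    return x[0] * x[1]

def eigenvalues(m):
    """Eigenvalues of a 2x2 matrix via the quadratic formula."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + disc) / 2, (tr - disc) / 2)

rng = random.Random(0)
for _ in range(1000):
    x = (rng.randint(-50, 50), rng.randint(-50, 50))
    for q in (Q1, Q2):
        assert F(apply(q, x)) >= F(x)                      # monotone
        assert (F(apply(q, x)) == F(x)) == (apply(q, x) == x)  # equality only at fixed points

g = matmul(Q1, Q2)
print(g, [abs(ev) for ev in eigenvalues(g)])
```

Each generator on its own has the eigenvalue 1, which is why powers Q_i^a have to be excluded, but the product g = Q1 Q2 has eigenvalues (3 ± √5)/2, both well away from the unit circle, as the argument predicts.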


Anyway, I found the underlying real linear algebra question very appealing.  Two idle questions:

  • If M is a submonoid of GL_n(R) we may say a continuous real-valued function F on R^n is M-monotone if F(mx) >= F(x) for all m in M, x in R^n.  The existence of a monotone function for the Penner monoid is the key to Strenner and Shin’s theorem.  But I have little feeling for how it works in general.  Given a finite set of matrices, what are explicit conditions that guarantee the existence of an M-monotone function?  Nonexistence?  (I have a feeling it is roughly equivalent to M containing no element with an eigenvalue on the unit circle, but I’m not sure, and anyway, this is not a checkable condition on the generating matrices…)
  • What can we say about the eigenvalues of matrices appearing in the Penner semigroup?  Balazs says he’ll show in a later paper that they can actually get arbitrarily close to the unit circle, which is actually not what I had expected.  He asks:  are those eigenvalues actually dense in the complex plane?

What I learned at the Joint Math Meetings

Another Joint Meetings in the books!  My first time in San Antonio, until last weekend the largest US city I’d never been to.  (Next up:  Jacksonville.)  A few highlights:

  • Ngoc Tran, a postdoc at Austin, talked about zeroes of random tropical polynomials.  She’s proved that a random univariate tropical polynomial of degree n has about c log n roots; this is the tropical version of an old theorem of Kac, which says that a random real polynomial of degree n has about c log n real roots.  She raised interesting further questions, like:  what does the zero locus of a random tropical polynomial in more variables look like?  I wonder:  does it look anything like the zero set of a random band-limited function on the sphere, as discussed by Sarnak and Wigman?  If you take a random tropical polynomial in two variables, its zero set partitions the plane into polygons, which gives you a graph by adjacency:  what kind of random graph is this?
  • Speaking of random graphs, have you heard the good news about L^p graphons?  I missed the “limits of discrete structures” special session which had tons of talks about this, but I ran into the always awesome Henry Cohn, who gave me the 15-minute version.  Here’s the basic idea.  Large dense graphs can be modeled by graphons; you take a symmetric function W from [0,1]^2 to [0,1], and then your procedure for generating a random graph goes like this. Sample n points x_1,…,x_n uniformly from [0,1]; these are your vertices.  Now put an edge between x_i and x_j with probability W(x_i,x_j) = W(x_j,x_i).  So if W is constant with value p, you get your usual Erdős-Rényi graphs, but if W varies some, you can get variants of E-R, like the much-beloved stochastic blockmodel graphs, that have some variation of edge density.  But not too much!  These graphon graphs are always going to have almost all vertices with degree linear in n.  That’s not at all like the networks you encounter in real life, which are typically sparse (vertex degrees growing sublinearly in n, or even being constant on average) and typically highly variable in degree (e.g. degrees following a power law, not living in a band of constant multiplicative width.)  The new theory of L^p graphons is vastly more general.  I’ve only looked at this paper for a half hour but I feel like it’s the answer to a question that’s always bugged me:  what are the right descriptors for the kinds of random graphs that actually occur in nature?  Very excited about this, will read it more, and will give a SILO seminar about it on February 4, for those around Madison.
  • Wait, I’ve got still one more thing about random graphs!  Russ Lyons gave a plenary about his work with Angel and Kechris about unique ergodicity of the action of the automorphism group of the random graph.  Wait, the random graph? I thought there were lots of random graphs!  Nope:  when you try to define the Erdős-Rényi graph on countably many vertices, there’s a certain graph (called “the Rado graph”) to which your random graph is isomorphic with probability 1!  What’s more, this is true (and it’s the same graph) no matter what p is, as long as it’s not 0 or 1!  That’s very weird, but proving it’s true is actually pretty easy.  I leave it as an exercise.
  • Rick Kenyon gave a beautiful talk about his work with Aaron Abrams about “rectangulations”:  decompositions of a rectangle into area-1 subrectangles.  Suppose you have a weighted directed graph, representing a circuit diagram, where the weights on the edges are the conductances of the corresponding wires.  It turns out that if you fix the energy along each edge (say, to 1) and an acyclic orientation of the edges, there’s a unique choice of edge conductances such that there exists a Dirichlet solution (i.e. an energy-minimizing assignment of a voltage to each node) with the given energies.  These are the fibers of a rational map defined over Q, so this actually gives you an object over a (totally real) algebraic number field for each acyclic orientation.  As Rick pointed out, this smells a little bit like dessins d’enfants!  (Though I don’t see any direct relation.)  Back to rectangulations:  it turns out there’s a gadget called the “Smith Diagram” which takes a solution to the Dirichlet problem on the graph and turns it into a rectangulation, where each edge corresponds to a rectangle, the area of the rectangle is the energy contributed by the current along that edge, the aspect ratio of the rectangle is the conductance, the bottom and top faces of the rectangle correspond to the source and target nodes, the height of a face is the voltage at that node, and so on.  Very cool!  Even cooler when you see the pictures, like the one for a 40×40 grid.
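Going back to the graphon item in the list above: the sampling procedure described there is short enough to write down. This is a minimal sketch of the classical dense [0,1]-valued case only, not of the L^p theory; the function names and the two example graphons are hypothetical choices for illustration.

```python
import random

def sample_w_random_graph(W, n, seed=None):
    """Sample a graph on n vertices from a graphon W: [0,1]^2 -> [0,1].

    Each vertex i gets a uniform label x_i in [0,1]; the edge {i, j} is
    included independently with probability W(x_i, x_j).
    """
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < W(xs[i], xs[j]):
                edges.add((i, j))
    return xs, edges

def constant_graphon(p):
    """W identically p: recovers the Erdős-Rényi model."""
    return lambda x, y: p

def two_block_graphon(p_in, p_out):
    """A symmetric two-block stochastic blockmodel, split at 1/2."""
    return lambda x, y: p_in if (x < 0.5) == (y < 0.5) else p_out

n = 200
_, er_edges = sample_w_random_graph(constant_graphon(0.3), n, seed=1)
_, sbm_edges = sample_w_random_graph(two_block_graphon(0.6, 0.05), n, seed=1)
print(len(er_edges), len(sbm_edges))
```

In both examples the number of edges grows like n^2, which is the “degree linear in n” limitation mentioned above: nothing in this procedure can produce a sparse or power-law graph.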


