Tag Archives: probability

Bobrowski-Kahle-Skraba on the null hypothesis in persistent homology

I really like persistent homology; it’s a very beautiful idea, a way to look for structure in data when you really don’t have any principled way to embed it in Euclidean space (or, even when it does come embedded in Euclidean space, to find the kind of structure that doesn’t depend too much on the embedding.)

But because I like it, I want to see it done well, so I have some minor complaints!

Complaint one:  Persistent homology, applied to H_0 only, is clustering, and we know a lot about clustering already.  (Update:  As commenters point out, this is really only so for persistent homology computed on the Vietoris-Rips complex of a point cloud, the “classical case…”!)  Not to say that the ideas of persistence can’t be useful here at all (I have some ideas about directed graphs I want to eventually work out) but my sense is that people are not craving new clustering algorithms.  I really like the work that tries to grapple with the topology of the data in its fullness; I was really charmed, for instance, by Ezra Miller’s piece about the persistent homology of fruit fly wings.  (There’s a lot of nice stuff about geometric probability theory, too — e.g., how do you take the “average” of a bunch of graded modules for k[x,y], which you may think of as noisy measurements of some true module you want to estimate?)

My second complaint is the lack of understanding of the null hypothesis.  You have some point cloud, you make a barcode, you see some bars that look long, you say they’re features — but why are you so sure?  How long would bars be under the null hypothesis that the data has no topological structure at all?  You kind of have to know this in order to do good inference.  Laura Balzano and I did a little numerical investigation of this years ago but now Omer Bobrowski, Matthew Kahle, and Primoz Skraba have proved a theorem!  (Kahle’s cool work in probabilistic topology has appeared several times before on Quomodocumque…)

They show that if you sample points from a uniform Poisson process on the unit cube of intensity n (i.e. you expect n points) the longest bar in the H_k barcode has

(death radius / birth radius) ~ [(log n)/(log log n)]^(1/k).

That is really short!  And it makes me feel like there actually is something going on, when you see a long bar in practice.
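To get a feel for how slowly that quantity grows, here is a small sketch that just evaluates the expression above for a few values of n and k. The function name is made up for illustration, and the asymptotic statement presumably hides constant factors, so treat the numbers as orders of magnitude rather than predictions.

```python
import math

def max_persistence_scale(n: int, k: int) -> float:
    """Evaluate [(log n) / (log log n)]^(1/k), ignoring any constant factors."""
    return (math.log(n) / math.log(math.log(n))) ** (1.0 / k)

for n in (10**3, 10**6, 10**9):
    for k in (1, 2):
        print(f"n = {n:>10}, k = {k}: {max_persistence_scale(n, k):.2f}")
```

Even at a billion points the ratio stays in the single digits, which is the sense in which the longest "noise" bar is short.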


The Coin Game, II

Good answers to the last question! I think I perhaps put my thumb on the scale too much by naming a variable p.

Let me try another version in the form of a dialogue.

ME: Hey in that other room somebody flipped a fair coin. What would you say is the probability that it fell heads?

YOU: I would say it is 1/2.

ME: Now I’m going to give you some more information about the coin. A confederate of mine made a prediction about whether the coin would fall heads or tails and he was correct. Now what would you say is the probability that it fell heads?

YOU: Now I have no idea, because I have no information about the propensity of your confederate to predict heads.

(Update: What if what you knew about the coin in advance was that it fell heads 99.99% of the time? Would you still be at ease saying you end up with no knowledge at all about the probability that the coin fell heads?) This is in fact what Joyce thinks you should say. White disagrees. But I think they both agree that it feels weird to say this, whether or not it’s correct.

Why would it not feel weird? I think Qiaochu’s comment in the previous thread gives a clue. He writes:

Re: the update, no, I don’t think that’s strange. You gave me some weird information and I conditioned on it. Conditioning on things changes my subjective probabilities, and conditioning on weird things changes my subjective probabilities in weird ways.

In other words, he takes it for granted that what you are supposed to do is condition on new information. Which is obviously what you should do in any context where you’re dealing with mathematical probability satisfying the usual axioms. Are we in such a context here? I certainly don’t mean “you have no information about Coin 2” to mean “Coin 2 falls heads with probability p where p is drawn from the uniform distribution (or Jeffreys, or any other specified distribution, thanks Ben W.) on [0,1]” — if I meant that, there could be no controversy!
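For concreteness, here is what conditioning looks like if you do pretend the confederate has a definite propensity q to predict heads, independent of the flip. That is exactly the assumption the puzzle refuses to grant, so this is a sketch of the uncontroversial case only; the function name and q are introduced here for illustration.

```python
from fractions import Fraction

def posterior_heads(prior_heads: Fraction, q: Fraction) -> Fraction:
    """Pr(coin fell heads | the confederate's prediction was correct).

    prior_heads: probability the coin falls heads before any new information.
    q: ASSUMED probability that the confederate predicts heads.
    """
    correct_and_heads = prior_heads * q              # predicted heads, got heads
    correct_and_tails = (1 - prior_heads) * (1 - q)  # predicted tails, got tails
    return correct_and_heads / (correct_and_heads + correct_and_tails)

# Fair coin: the posterior is just q, whatever q is.
print(posterior_heads(Fraction(1, 2), Fraction(3, 10)))

# The 99.99% coin from the update, with a confederate who rarely predicts heads.
print(float(posterior_heads(Fraction(9999, 10000), Fraction(1, 10))))
```

With a fair coin the answer is entirely determined by q, which is why knowing nothing about q feels like knowing nothing about the coin. With the 99.99% coin the answer still depends on q, but much less violently.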

I think as mathematicians we are very used to thinking that probability as we know it is what we mean when we talk about uncertainty. Or, to the extent we think we’re talking about something other than probability, we are wrong to think so. Lots of philosophers take this view. I’m not sure it’s wrong. But I’m also not sure it’s right. And whether it’s wrong or right, I think it’s kind of weird.


The coin game

Here is a puzzling example due to Roger White.

There are two coins.  Coin 1 you know is fair.  Coin 2 you know nothing about; it falls heads with some probability p, but you have no information about what p is.

Both coins are flipped by an experimenter in another room, who tells you that the two coins agreed (i.e. both were heads or both tails.)

What do you now know about Pr(Coin 1 landed heads) and Pr(Coin 2 landed heads)?

(Note:  as is usual in analytic philosophy, whether or not this is puzzling is itself somewhat controversial, but I think it’s puzzling!)

Update: Lots of people seem to not find this at all puzzling, so let me add this. If your answer is “I know nothing about the probability that coin 1 landed heads, it’s some unknown quantity p that agrees with the unknown parameter governing coin 2,” you should ask yourself: is it strange that someone flipped a fair coin in another room and you don’t know what the probability is that it landed heads?
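If you treat p as a fixed number and assume the two flips are independent, the conditioning is a four-outcome enumeration. The sketch below does it with exact fractions; the function name is hypothetical, and the whole calculation presupposes that "some probability p" is an ordinary probability you can plug in.

```python
from fractions import Fraction
from itertools import product

def posteriors_given_agreement(p: Fraction) -> tuple[Fraction, Fraction]:
    """Return (Pr(coin 1 heads | agree), Pr(coin 2 heads | agree)).

    Coin 1 is fair, coin 2 falls heads with probability p, flips independent.
    """
    half = Fraction(1, 2)
    weight = {}
    for c1, c2 in product("HT", repeat=2):
        pr_c2 = p if c2 == "H" else 1 - p
        weight[(c1, c2)] = half * pr_c2
    agree = weight[("H", "H")] + weight[("T", "T")]
    # Given agreement, coin 1 is heads exactly when coin 2 is heads.
    both_heads = weight[("H", "H")]
    return both_heads / agree, both_heads / agree

print(posteriors_given_agreement(Fraction(1, 3)))
```

For every p, both posteriors come out equal to p: the fair coin inherits the unknown parameter of the mystery coin.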

Relevant readings: section 3.1 of the Stanford Encyclopedia of Philosophy article on imprecise probabilities and Joyce’s paper on imprecise credences, pp.13-14.


Rolling the dice on Iran

David Sanger in today’s NYT on the Iran deal:

Mr. Obama will be long out of office before any reasonable assessment can be made as to whether that roll of the dice paid off.

Which is true!  But something else that’s true: not having a deal would also be a roll of the dice.  We’re naturally biased to think of the status quo as the safest course.  But why?  There’s no course of political action that leads to a certain outcome.  We’re rolling the dice no matter what; all we get to do is choose which dice.


My other daughter is a girl

I like Cathy’s take on this famous probability puzzle.  Why does this problem give one’s intuition such a vicious noogie?

It is relevant that the two questions below have two different answers.

  • I have two children.  One of my children is a girl who was born on Friday.  What’s the probability I have two girls?
  • I have two children.  One of my children is a girl.  Before you came in, I selected a daughter at random from the set of all my daughters, and this daughter was born on Friday.  What’s the probability I have two girls?
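The two answers can be checked by brute-force enumeration. The sketch below assumes each child is independently a girl or boy with equal probability and is born on a uniformly random weekday; those modeling assumptions, and all the names in the code, are not part of the puzzle statement.

```python
from fractions import Fraction
from itertools import product

FRIDAY = 4  # arbitrary label for one of the seven weekdays
children = list(product("GB", range(7)))      # (sex, weekday) for one child
families = list(product(children, repeat=2))  # 196 equally likely families

def is_friday_girl(child):
    return child == ("G", FRIDAY)

# Question 1: condition on "at least one child is a girl born on Friday".
q1_event = [f for f in families if any(is_friday_girl(c) for c in f)]
q1_two_girls = [f for f in q1_event if all(c[0] == "G" for c in f)]
q1 = Fraction(len(q1_two_girls), len(q1_event))

# Question 2: a daughter is picked uniformly at random from the daughters,
# and that particular daughter turns out to be Friday-born.
num = den = Fraction(0)
for f in families:
    daughters = [c for c in f if c[0] == "G"]
    if not daughters:
        continue  # "one of my children is a girl" rules these out
    w = Fraction(sum(is_friday_girl(d) for d in daughters), len(daughters))
    den += w
    if len(daughters) == 2:
        num += w
q2 = num / den

print(q1, q2)  # 13/27 1/3
```

In the second version the birthday carries no information about the family, so the answer falls back to the plain "at least one girl" value of 1/3.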

 


The stochastic sandpile

At last week’s AIM conference on chip-firing I learned about a cool stochastic process which seems very simple, but whose behavior is apparently very hard to prove anything about!

It goes like this.  You start with a graph (say a finite connected graph, for simplicity) and your states are functions from the vertices of the graph to the natural numbers, which you think of as specifying the height of a pile of chips (or sand grains, if you want to keep up that metaphor) sitting on top of that vertex.  At least one of the vertices is a sink, which means that any chip landing there disappears forever.

Now the nature of the process is that you never want to have more than one chip at a vertex.  So every time there are at least two chips at a vertex v, you can pop two of them off; each chip flips a coin and decides whether to topple left or right.  And you keep on doing this until there’s at most one chip at each spot.  (You might start with more chips than vertices, but the sink helps you slough off the surplus.)  This process was apparently first studied by Manna and is sometimes called the Manna sandpile.
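Here is a minimal simulation sketch of the toppling rule, specialized to the simplest case where "left or right" makes sense: a path of sites with a sink just beyond each end. The choice of graph, the order in which unstable sites are toppled, and the function name are all assumptions made for illustration.

```python
import random

def stabilize(heights, rng):
    """Topple until every site holds at most one chip.

    heights: chip counts on a path; positions -1 and len(heights) act as sinks.
    rng: a random.Random instance supplying the coin flips.
    """
    h = list(heights)
    n = len(h)
    unstable = [i for i, c in enumerate(h) if c >= 2]
    while unstable:
        v = unstable.pop()
        if h[v] < 2:
            continue  # stale entry: this site was already dealt with
        h[v] -= 2
        for _ in range(2):
            w = v + rng.choice((-1, 1))  # each chip flips its own coin
            if 0 <= w < n:               # otherwise the chip fell into a sink
                h[w] += 1
                if h[w] >= 2:
                    unstable.append(w)
        if h[v] >= 2:
            unstable.append(v)
    return h

rng = random.Random(0)
start = [5, 0, 3, 7]
final = stabilize(start, rng)
print(start, "->", final)
```

The final configuration is random, but it always has at most one chip per site and no more chips than you started with.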

More math below the fold!



Math on Trial, by Leila Schneps and Coralie Colmez

The arithmetic geometer Leila Schneps, who taught me most of what I know about Galois actions on fundamental groups of varieties, has a new book out, Math on Trial:  How Numbers Get Used and Abused in the Courtroom, written with her daughter Coralie Colmez.  Each chapter covers a famous case whose resolution, for better or worse, involved a mathematical argument.  Interspersed among the homicide and vice are short chapters that speak directly to some of the mathematical and statistical issues that arise in legal matters.  One of the cases is the much-publicized prosecution of college student Amanda Knox for a murder in Italy; today in the New York Times, Schneps and Colmez write about some of the mathematical pratfalls in their trial.

I am happy to have played some small part in building their book — I was the one who told Leila about the murder of Diana Sylvester, which turned into a whole chapter of Math on Trial; very satisfying to see the case treated with much more rigor, depth, and care than I gave it on the blog!  I hope it is not a spoiler to say that Schneps and Colmez come down on the side of assigning a probability close to 1 that the right man was convicted (though not nearly so close to 1 as the prosecution claimed, and perhaps not close enough for a jury to have rightfully convicted, depending on how you roll re “reasonable doubt.”)

Anyway — great book!  Buy, read, publicize!

 

 


Obama 5, Romney 3 in the 7th

Lots of people are following Nate Silver’s election tracking over at 538, especially his top-line estimate of the probability that Barack Obama will be re-elected in November.  Silver has that number at 79.7% today.  Sounds like good news for Obama.  But it’s hard to get a gut feeling for what that number means.  Yeah, it means Obama has a 4 in 5 chance of winning — but since the election isn’t going to happen 5 times, that proportion doesn’t quite engage the intuition.

Here’s one trick I thought of, which ought to work for baseball fans.  The Win Probability Inquirer over at Hardball Times will estimate the probability of a baseball team winning a game under any specified set of conditions.  Visiting team down by 4 in the 2nd, but has runners on 2nd and 3rd with nobody out?  They’ve got a 26% chance of winning.  Next batter strikes out?  Their chances go down to 22%.

So when do you have a 79.7% chance of winning?  If we consider the Obama-Romney race to have started in April or May, when Romney wrapped up the nomination, we’re about 2/3 of the way through — so let’s say the 7th inning.  If the visiting team is ahead by 2 runs going into the 7th, they’ve got an 82% chance of winning.  That’s pretty close.  If you feel the need to tweak the knobs, say the first two batters of the inning fail to reach; with two outs in the top of the 7th, bases empty and a 2-run lead, the visitors win 79.26% of the time, just a half-percent off from Silver’s estimate.

So:  Obama 5, Romney 3, top of the 7th.  How certain do you feel that Obama wins?

Update:  (request from the comments)  Silver currently has Obama with an 85.% chance of winning.  That’s like:  home team up 5-3, visitors batting in the top of the 8th, runner on first with one out.

 


Why are the Orioles performing over their Pythagorean record?

This graph from Camden Depot showing the Orioles’ distribution of runs scored and runs against got me thinking:

 

 

Weird, right?  The Orioles allow very few runs frequently, and a lot of runs frequently, but don’t allow runs in that 3-4-5 range very often.  Their distribution of runs scored shows a much more ordinary shape.

Could this be explaining the Orioles’ consistent ability to win games despite allowing more runs than they score?  It’s not out of the question.  Imagine a team that allowed 0 runs 40% of the time and 5 runs 60% of the time; that team would allow 3 runs a game on average.  Suppose they scored exactly 3 runs every single game.  Then they’d score exactly as much as they allowed, so their Pythagorean WP would be .500.  But in fact they’d be a .400 team.
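The toy example is easy to check mechanically. The sketch below uses the classic exponent-2 Pythagorean formula, draws runs scored and runs allowed independently, and splits ties evenly; the exponent, the tie rule, and the function names are assumptions, since the post does not spell them out.

```python
from fractions import Fraction

def pythagorean_wp(rs, ra, exponent=2):
    """Pythagorean winning percentage from runs scored and runs allowed."""
    return rs**exponent / (rs**exponent + ra**exponent)

def actual_wp(rs_dist, ra_dist):
    """Win probability when RS and RA are drawn independently per game.

    Each dist maps a run total to its probability. Ties are split evenly,
    which is an ASSUMPTION (the toy example below has no ties).
    """
    wp = Fraction(0)
    for rs, p_rs in rs_dist.items():
        for ra, p_ra in ra_dist.items():
            if rs > ra:
                wp += p_rs * p_ra
            elif rs == ra:
                wp += p_rs * p_ra / 2
    return wp

rs_dist = {3: Fraction(1)}                        # always score 3
ra_dist = {0: Fraction(2, 5), 5: Fraction(3, 5)}  # allow 0 or 5

mean_rs = sum(r * p for r, p in rs_dist.items())
mean_ra = sum(r * p for r, p in ra_dist.items())

print(pythagorean_wp(mean_rs, mean_ra))  # 1/2
print(actual_wp(rs_dist, ra_dist))       # 2/5
```

The same `actual_wp` function would accept a team's empirical run distributions in place of the toy ones.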

So I checked this for the Orioles — if each game had an RS and RA drawn at random from the distribution above (I got the exact numbers from baseball-reference, actually) it turns out that you’d get a winning percentage of .479, which gives 58 or 59 wins out of their current 122 games played.  Their Pythagorean WP is .456, which predicts 55 or 56.

So the Orioles’ weirdly bimodal RA distribution is indeed helping them beat Pythagoras, but only by 3 games or so; why they’re 10 over Pythagoras right now remains a mystery, and is probably just some combination of great bullpen and great luck.


More on probability aggregation and De Finetti

A few months ago I posted a puzzle about aggregating probability estimates from different sources, and in particular how to aggregate opinions about the independence of two events.

I think I now understand the story slightly better.  I am essentially going to agree with what Terry T. said in the comments to the first post (this is my surprised face) but at the same time try to dissolve my initial resistance to talking about second-order probabilities (statements of the form “the probability that the probability is p is q….”)

To save you a click, the question amounts to:  if half of your advisors tell you that X and Y are independent coins with probability .9 of landing heads, and the other half of your advisors agree the coins are independent but say that the probability of heads is .1 for each, what should your degree of belief in X, Y, and X&Y be?  And should you believe that X and Y are independent events, a fact about which your advisors are unanimous?

The answer depends, at least in part, on what you mean by “probability” and “independence.”

On one account, probability is a number between 0 and 1 that represents your degree of belief in a hypothesis, and independence of X and Y means that Pr(X&Y) = Pr(X)Pr(Y).  Both are assertions about your mental state.  So there’s no reason that the unanimity of your advisors about the independence of X and Y should make you believe that X and Y are independent; why should this aspect of their mental state automatically be taken to be a guide to yours?  Relevant comparison:  what if each advisor said “I am really sure my belief about the coin is correct.”  Since all your advisors agree that the nature of the coin is very strongly certain, should you agree about that too?  No — given that half your advisors think the coin is very likely to fall heads and half that it is very likely to fall tails, you are reasonably pretty unsure about the nature of the coin.  Moreover, if X falls heads, you should rationally increase your degree of belief that Y will fall heads too, because X falling heads is evidence that the 0.9 gang is correct in their beliefs.  So (for you, even if not for your advisors) the two events are not independent.
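To put numbers on that last step, here is a sketch that gives each camp of advisors weight 1/2 and does the update exactly. The equal weighting and the variable names are assumptions made for illustration.

```python
from fractions import Fraction

# Heads-probability claimed by each camp -> weight you give that camp (ASSUMED equal).
camps = {Fraction(9, 10): Fraction(1, 2), Fraction(1, 10): Fraction(1, 2)}

pr_x = sum(w * p for p, w in camps.items())       # Pr(X heads)
pr_xy = sum(w * p * p for p, w in camps.items())  # Pr(X and Y both heads)
pr_y_given_x = pr_xy / pr_x

print(pr_x, pr_xy, pr_y_given_x)  # 1/2 41/100 41/50
```

Pr(X&Y) is 0.41 rather than 0.25, and seeing X fall heads moves your belief in Y from 0.5 to 0.82.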

There is another account, in which the probability is an intrinsic property of the coin.  On this account, it makes sense to talk about second-order probabilities:  to say, for instance, that the probability that “the probability that the coin falls heads is .9” is 1/2.  On this account, we can talk (as Terry does) about conditional independence; we say that there is an unknown parameter p which measures the propensity of the coin to fall heads, and that the condition Pr(X&Y) = P(X)P(Y) for independence only makes sense once P(X) and P(Y) are known.

In fact, I’ve come to favor the second view, at least as regards coins.  Because here’s the thing.  Let’s say I start with the first view.  I have in mind a degree of belief that the first coin will fall heads, and I call this P(X).  Given the evidence I have, probably P(X) should be 0.5.  But once I’m forming degrees of belief, I must also have a degree of belief that a sequence of k tosses of the coin will all fall heads.  And this should be  the average of (0.9)^k and (0.1)^k, not (0.5)^k!
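A quick sketch of how far apart the two candidates are, again assuming equal weight on the 0.9 and 0.1 camps (the names below are made up for illustration):

```python
from fractions import Fraction

CAMPS = (Fraction(9, 10), Fraction(1, 10))  # equal weight on each (ASSUMED)

def pr_all_heads(k: int) -> Fraction:
    """Degree of belief that k tosses of the same coin all fall heads."""
    return sum(p**k for p in CAMPS) / len(CAMPS)

for k in (1, 2, 5, 10):
    mixture = float(pr_all_heads(k))
    naive = float(Fraction(1, 2) ** k)
    print(f"k = {k:>2}: mixture {mixture:.6f}   (0.5)^k {naive:.6f}")
```

The two agree at k = 1 and then separate quickly, because the mixture is dominated by the 0.9 camp once k grows.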

Having in mind the probability distributions on “number of heads in k tosses” for all k is, by De Finetti’s theorem, more or less the same as having in mind a probability distribution on the propensity of the coin to fall heads.  That is, if a binary event is one we can imagine repeating, then our subjective degrees of belief about the event automatically have the structure of a second-order probability distribution on (Bernoulli) probability distributions.  In fact, I think this was why De Finetti proved De Finetti’s theorem.  In this context, independence is an intrinsic fact about the coins, not about our knowledge, and we should agree with our advisors that the coins are independent.

I’m less sure this story applies to uncertain events which are, by their nature, unrepeatable.  What do we mean when we talk about the probability that Ankylosaurus had feathers?  Is it meaningful in this context to say “I think there’s a 50% chance that there’s a 90% chance Ankylosaurus had feathers, and a 50% chance that there’s only a 10% chance” or is this exactly the same as saying you think there’s a 50% chance?
