There are two events X and Y whose probability you’d like to estimate. So you ask a hundred trusted, reasonable people what they think. Half of them say that the probability of X and the probability of Y are both 90%, and the probability of both X and Y occurring is 81%. The other half say that P(X) = 10%, P(Y) = 10%, and P(X and Y) = 1%.

What is your best estimate of P(X), P(Y), and P(X and Y)?

If you said “50%, 50%, 41%,” does it bother you that you deem these events not to be independent, even though every single person you polled said the opposite? If not, what did you say?
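For concreteness, here is a minimal Python sketch of the naive averaging behind the “50%, 50%, 41%” answer. It assumes the two halves of the poll are weighted equally. The names `optimists`, `pessimists`, and `arithmetic_pool` are made up for illustration.

```python
# Each tuple is one group's stated (P(X), P(Y), P(X and Y)).
optimists = (0.90, 0.90, 0.81)
pessimists = (0.10, 0.10, 0.01)


def arithmetic_pool(a, b):
    """Average two opinions component-wise, weighting both groups equally (assumption)."""
    return tuple((u + v) / 2 for u, v in zip(a, b))


p_x, p_y, p_xy = arithmetic_pool(optimists, pessimists)

# The gap is zero exactly when the pooled estimate treats X and Y as independent.
gap = p_xy - p_x * p_y

print(f"P(X)={p_x:.2f}, P(Y)={p_y:.2f}, P(X and Y)={p_xy:.2f}")
print(f"P(X)P(Y)={p_x * p_y:.2f}, gap={gap:.2f}")
```

The pooled P(X and Y) is 0.41 while P(X)P(Y) is 0.25, so the averaged opinion is not independent even though each group's opinion is.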

(The subtext of this post is: is the “Independence of Irrelevant Alternatives” axiom in Arrow’s theorem a good idea? Feel free to discuss that too.)


Taking the geometric mean instead preserves independence, and without further information about the events in question and why at least half the trusted and reasonable people are so far off, I see no reason it should be worse (or better) than the arithmetic mean.
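A quick sketch of that claim on the numbers from the post, assuming equal weights and no renormalization afterwards (the helper name `geometric_pool` is just for illustration):

```python
import math


def geometric_pool(estimates):
    """Geometric mean of a list of probabilities; every estimate must be > 0."""
    return math.exp(sum(math.log(p) for p in estimates) / len(estimates))


g_x = geometric_pool([0.90, 0.10])    # pooled P(X)
g_y = geometric_pool([0.90, 0.10])    # pooled P(Y)
g_xy = geometric_pool([0.81, 0.01])   # pooled P(X and Y)

# Independence survives because the geometric mean of products
# is the product of the geometric means.
print(round(g_x, 4), round(g_y, 4), round(g_xy, 4), round(g_x * g_y, 4))
```

Here the pooled P(X) and P(Y) both come out near 0.3 and the pooled P(X and Y) near 0.09, which is their product.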

What if a single person estimated P(X) = P(Y) = 0?

I guess I’d have to remove that person from my “trusted, reasonable” list!

More seriously though, if one were trying to average estimates of a probability and the estimates really varied very widely (maybe the question is something like “is there other intelligent life in the galaxy?”), e.g. some say 90%, some 10%, some 1%, some 0.1%, etc., then averaging the logarithms doesn’t seem like such an outrageous thing to do.

(Of course, I suppose if some of the estimates looked more like 90%, 99%, 99.9%, etc., this approach would be far less meaningful, and if one were to do something like take the average of log p/(1-p), independence would no longer be preserved.)
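To illustrate the last point on the numbers from the post, here is a rough sketch of averaging log p/(1-p) and mapping back to a probability. Equal weights are assumed, and `logit`, `sigmoid`, and `log_odds_pool` are illustrative helper names.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def sigmoid(z):
    """Inverse of logit: map log-odds back to a probability."""
    return 1 / (1 + math.exp(-z))


def log_odds_pool(estimates):
    """Average the log-odds of the estimates, then convert back."""
    return sigmoid(sum(logit(p) for p in estimates) / len(estimates))


l_x = log_odds_pool([0.90, 0.10])    # pooled P(X)
l_y = log_odds_pool([0.90, 0.10])    # pooled P(Y)
l_xy = log_odds_pool([0.81, 0.01])   # pooled P(X and Y)

print(round(l_x, 3), round(l_y, 3), round(l_xy, 3), round(l_x * l_y, 3))
```

The pooled P(X) and P(Y) are both about 0.5, but the pooled P(X and Y) is roughly 0.17 rather than 0.25, so independence is indeed lost.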

It bothers me that a hundred trusted, reasonable people could disagree so thoroughly about something! Something is probably wrong with their model of the events, and something might also be wrong with how I judge trustworthiness and reasonableness.

Why should it bother me? If you want to measure whether people think X and Y are independent, take the expected covariance, or whatever statistic statisticians use to measure that. Or are you interested in knowing if X and Y are actually independent? Then why are you asking these clueless people?
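As a small sketch of what I mean, reading “expected covariance” as the average over respondents of each person's own P(X and Y) - P(X)P(Y) (that reading and the variable names are my assumptions):

```python
# Each tuple is one respondent type's stated (P(X), P(Y), P(X and Y));
# both types are assumed to be equally common in the poll.
opinions = [(0.90, 0.90, 0.81), (0.10, 0.10, 0.01)]

# Covariance of the indicator variables of X and Y under each opinion.
covariances = [p_xy - p_x * p_y for p_x, p_y, p_xy in opinions]

mean_covariance = sum(covariances) / len(covariances)
print(abs(mean_covariance) < 1e-12)  # True: everyone polled reports independence
```

Up to floating-point rounding, every individual covariance is zero, and so is their average.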

Relevant papers:

Consensus of Subjective Probabilities: The Pari-Mutuel Method (1959)

The Consensus of Subjective Probability Distributions (1968)

From the information available, I think the most accurate model would be a *conditionally independent* one involving a hidden parameter p, which in your case would either be 0.9 or 0.1. Once one conditions on p, then X and Y occur independently with probability p. The problem is that one does not know what p is and so has to make a model in which p itself is probabilistic, which means that X and Y are no longer globally independent. In any case, the fact that one’s model lacks features that the “real” world is supposed to have is a familiar situation to be in, and not one that should be particularly unsettling.
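A minimal sketch of this hidden-parameter model, assuming p is 0.9 or 0.1 with prior weight 1/2 each (the weights are an assumption suggested by the 50/50 split in the poll, and the variable names are illustrative):

```python
# Prior over the hidden parameter p: value -> weight (equal weights assumed).
prior = {0.9: 0.5, 0.1: 0.5}

# Given p, X and Y each occur with probability p, independently of one another.
m_x = sum(w * p for p, w in prior.items())        # P(X) = E[p]
m_y = m_x                                         # P(Y) = E[p] as well
m_xy = sum(w * p * p for p, w in prior.items())   # P(X and Y) = E[p^2]

# Marginally, Cov(1_X, 1_Y) = E[p^2] - E[p]^2 = Var(p),
# which is zero only if p is known for certain.
cov = m_xy - m_x * m_y

print(round(m_x, 2), round(m_y, 2), round(m_xy, 2), round(cov, 2))
```

Under these assumptions the model gives 50%, 50%, and 41%, the same numbers as the naive average, with the dependence between X and Y coming entirely from the uncertainty in p.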

What if you said “50%, 50%, and 25%”?

But why would you say this? You’re using a different method for estimating P(X) and P(Y) than you are for estimating P(X and Y), and it’s not clear to me how you would modify this idea if the independence issue disappeared or were only perturbed slightly, say to P(X and Y) = 80% and P(X and Y) = 2% respectively.

I’d just like to add my two cents that IIA is, in fact, way overrated in Arrow’s theorem. If I tell you I prefer Romney over Trump among possible Republican presidential hopefuls, isn’t that far less information than if I tell you I prefer Romney over Huckabee over Bachmann over Palin over Trump? Those other alternatives AREN’T irrelevant because they give me a chance to tell you just how much I dislike Trump.

I agree with Matt, partially because of examples like the one in the post, and will try to write another post about this in the future.

[…] few months ago I posted a puzzle about aggregating probability estimates from different sources, and in particular how to aggregate opinions about the independence of two […]