## The Google puzzle and the perils of averaging ratios

The following brain-teaser has been going around, identified as a question from a Google interview (though there’s some controversy about whether Google actually uses questions like this.)

There’s a certain country where everybody wants to have a son. Therefore each couple keeps having children until they have a boy; then they stop. What fraction of the population is female?

Steve Landsburg posted a version of this question on his blog.  “The answer they expect,” he writes, “is simple, definitive, and wrong… Are you smarter than the folks at Google?  What’s the answer?”

Things quickly went blooey.  Google’s purported answer — fiercely argued for by lots of Landsburg’s readers — is 1/2.  Landsburg said the right answer was less.  A huge comment thread and many follow-up posts ensued.  Lubos Motl took time out from his busy schedule of yelling at mathematicians about string theory to yell at Landsburg about probability theory.  Landsburg offered to bet Motl, or anybody else, \$15,000 that a computer simulation would demonstrate the correctness of his answer.

What’s going on here?  How could a simple probability question have stirred up such a ruckus?

Here’s Landsburg’s explanation of the question:

What fraction of the population should we expect to be female? That is, in a large number of similar countries, what would be the average proportion of females?

If G is the number of girls, and B the number of boys, Landsburg is asking for the expected value E(G/(G+B)).  And let’s get one thing straight:  Landsburg is absolutely right about this expected value.  For any finite number of families, it is strictly less than 1/2.  (See the related Math Overflow thread for a good explanation.)  Landsburg has very patiently knocked down the many wrong arguments to the contrary in his comments section.  Anybody who bets against him, on his terms, is going to lose.

Nonetheless, I’m about to explain why Landsburg is wrong.

You see, Google’s version of the question doesn’t specify anything about expectation.  They might just as well have meant:  “What is the proportion of the expected number of females in the expected population?”  Which is to say, “What is E(G)/E(G) + E(B)”?  And the answer to that question is 1/2.  Just to emphasize the subtlety involved here:

On average, the number of boys and the number of girls are the same.  Furthermore, the proportion of girls is, on average, less than 1/2.

Weird, right?  E(G)/E(G) + E(B) isn’t what Landsburg was asking for — but, if Google’s answer was 1/2, it’s presumably the question they had in mind.  To accuse them of getting their own question “wrong” is a bit rich.

But let me go all in — I actually think Landsburg’s interpretation of the question is not only different from Google’s, but in some ways inferior!  Because averaging ratios with widely ranging denominators is kind of a weird thing to do.  You can certainly compute the average population density of all the U.S. states — but should you? What meaning or use would the result have?

I had a really pungent example ready to deploy, which illustrates the perils of averaging ratios and explains why Landsburg’s version of the question was a little weird.  Then I went to the Joint Meetings before getting around to writing this post.  And when I got back, I discovered that Landsburg had posted the same example on his own blogin support of his point of view!  Awesome.  Here it is:

There’s a certain country where everybody wants to have a son. Therefore each couple keeps having children until they have a boy; then they stop. In expectation, what is the ratio of boys to girls?

The answer to this question is, of course, infinity; in a finite population there might be no girls, so B/G is infinite with some positive probability, so E(B/G) is infinite as well.

But the correctness of that answer surely tells us this is a terrible question!  Averaging is a terribly cruel thing to do to a bunch of ratios.  One zero denominator and you’ve wiped out your entire dataset.

What if Landsburg had phrased his new question along the lines of Google’s original puzzle?

There’s a certain country where everybody wants to have a son. Therefore each couple keeps having children until they have a boy; then they stop. What is the ratio of boys to girls in this country?

Honest question:  does Landsburg truly think that infinity is the only “right answer” to this question?  Does he think infinity is a good answer?  Would he hire a person who gave that answer?  Would you?