## How to find a syphilitic with the Lambert W-function and Fibonacci numbers

I said a few vague words on the All Things Considered piece about the Dorfman protocol for syphilis screening, which is an early example of the exploitation of sparsity to improve signal detection.  Jeffrey Shallit follows up with a really nice explanation of the method at his blog.  Good stuff in the comments, too, especially this souped-up recursive Dorfman suggested by Gareth McCaughan, in which Fibonacci numbers arise in a mysterious way!

## The Challenger disaster was not caused by Russian roulette

It doesn’t take much imagination to see how risk homeostasis applies to NASA and the space shuttle. In one frequently quoted phrase, Richard Feynman, the Nobel Prize-winning physicist who served on the Challenger commission, said that at NASA decision-making was “a kind of Russian roulette.” When the O-rings began to have problems and nothing happened, the agency began to believe that “the risk is no longer so high for the next flights,” Feynman said, and that “we can lower our standards a little bit because we got away with it last time.” But fixing the O-rings doesn’t mean that this kind of risk-taking stops. There are six whole volumes of shuttle components that are deemed by NASA to be as risky as O-rings. It is entirely possible that better O-rings just give NASA the confidence to play Russian roulette with something else.

If this is really what Feynman said, wasn’t he wrong?  In Russian roulette, you know there’s one bullet in the gun.  The chance of a catastrophe is just one in six the first time you put the gun to your head; but if you survive the first try, you know the round is in one of the remaining five chambers and the chance of death next time you pull the trigger climbs to 20%.  The longer you play, the more likely disaster becomes.
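The worsening odds in ordinary Russian roulette are a one-liner; here it is in exact fractions (just an illustrative sketch, variable name mine):

```python
from fractions import Fraction

# One bullet, six chambers, no re-spin: given survival through the first
# t-1 pulls, the bullet is equally likely to sit in any of the 7-t chambers
# left, so the chance of death on pull t is 1/(7-t).
death_risk = [Fraction(1, 7 - t) for t in range(1, 7)]
# [1/6, 1/5, 1/4, 1/3, 1/2, 1] -- the longer you play, the worse it gets
```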

But what if you don’t know how many chambers are loaded?  Suppose you play “Bayes Roulette,” in which the number of bullets is equally likely to be anywhere from 1 to 6.  Then the chance of survival on the first try is (5/6) if there’s 1 bullet in the cylinder, (4/6) if 2 bullets, and so on, for a total of

(1/6)(5/6) + (1/6)(4/6) + … + (1/6)(0/6) = 5/12

which is about 41%.  Pretty bad.  But let’s say you pull the trigger once and live.  Now by Bayes’ theorem, the chance that there’s 1 bullet in the cylinder is

Pr(1 bullet in cylinder and I survived the first try) / Pr(I survived the first try)

or

(5/36)/(5/12) = 1/3.

Similarly, the chance that there are 5 bullets in the cylinder is

(1/36)/(5/12) = 1/15.

And the chance that there were 6 bullets in the cylinder is 0, because if there had been, well, you would be a former Bayesian.

All in all, your chance of surviving the next shot is

(5/15)(4/5) + (4/15)(3/5) + (3/15)(2/5) + (2/15)(1/5) + (1/15)(0) = 8/15.

In other words, once you survive the first try, you’re more likely, not less, to survive the next one, because surviving has increased the odds that the gun is mostly empty.
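If you want to check the Bayes Roulette arithmetic above, here’s the whole computation in exact fractions (a quick Python sketch; the function name is mine):

```python
from fractions import Fraction

def bayes_roulette():
    """Bayes Roulette: k bullets in the 6-chamber cylinder, k = 1..6,
    each equally likely a priori."""
    prior = {k: Fraction(1, 6) for k in range(1, 7)}
    # surviving the first pull with k bullets loaded has probability (6-k)/6
    p_first = sum(prior[k] * Fraction(6 - k, 6) for k in prior)
    # Bayes' theorem: posterior on k given survival of the first pull
    post = {k: prior[k] * Fraction(6 - k, 6) / p_first for k in prior}
    # second pull: 5 chambers remain, and all k bullets are among them
    p_second = sum(post[k] * Fraction(5 - k, 5) for k in range(1, 6))
    return p_first, post, p_second

p_first, post, p_second = bayes_roulette()
# p_first = 5/12, post[1] = 1/3, post[5] = 1/15, post[6] = 0, p_second = 8/15
```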

Or suppose the gun is either fully loaded or empty, but you don’t know which.  The first time you pull the trigger, you have no idea what your odds of death are.  But the second time, you know you’re completely safe.

I think the space shuttle is a lot more like Bayes Roulette than Russian Roulette.  You don’t know how likely an O-ring failure is to cause a crash, just as you don’t know how many bullets are in the gun.  And if the O-rings fail now and then, with no adverse consequences, you are in principle perfectly justified in worrying less about O-rings.  If you pull the trigger four times and no bullet fires, you ought to be getting more confident the gun is empty.

## Robert Siegel said my name

I was on All Things Considered today, talking a bit about compressed sensing. They let me tell the syphilis story that didn’t fit in the Wired article. The radio broadcast has come and gone, but you can still hear it at NPR’s website. Laura Balzano made the audio demo; for more explanation and more cool demos, see her page.

Update: I just listened to the piece.  Sorry for the inaccurate title:  Art Silverman, not Robert Siegel, said my name.


## 2010 food cart update

May is here and the carts are out!  Plenty of newcomers on Library Mall.  Athens is justifying its victory in the citywide food cart contest, serving first-rate gyros with french fries in the sandwich, and a solid spanakopita too.  And on Tuesdays and Wednesdays we now have Ingrid’s Lunch Box, once chosen by Bon Appetit as one of the nation’s top ten food carts.  Meanwhile, two recession-victim restaurants, Africana and King of Falafel, have reinvented themselves as carts; haven’t tried either, but K of F was dependably good when it was a building.

## Post-colonial agriculture

I didn’t know that lots of commercial crops are pollinated by giant trucked-in hives of honeybees.  I also didn’t know that these hives are being decimated by Colony Collapse Disorder, which is Science for “the hive’s empty except for the queen, there’s no dead bees around, we basically don’t know what’s going on.”  UW researchers are studying the Wisconsin cranberry crop, assessing the feasibility of going back to native pollinators like bumblebees, squash bees, and leaf-cutter bees.  I hope the work will be done in time to save CranFest!

Leaf-cutter bee

## Prosecutor’s fallacy — now with less fallaciousness!

I gave a talk at East HS yesterday about “not getting fooled by probability,” in which I talked about the Diana Sylvester murder case, discussed previously on the blog as an example of the prosecutor’s fallacy.  While getting ready for the talk I came across this post about the case by Penn State law professor David Kaye, which explains how, in this case, the calculation proposed by the defense was wrong in its own way.

Here’s the setup.  You’ve got DNA found at the scene of the crime, with enough intact markers that the chance of a random person matching the DNA is 1 in 1.1 million.  You’ve got a database of 300,000 convicted violent criminals whose DNA you have on file.  Out of these 300,000, you find one guy — otherwise unconnected with the crime — who matches on every marker.  This guy gets arrested and goes to trial.  What should you tell the jury about the probability that the defendant is guilty?
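Just to put rough numbers on the database-trawl part of the problem — this is a back-of-envelope sketch of mine, not Kaye’s analysis, and it naively treats the 300,000 profiles as independent coin flips:

```python
# Each of the 300,000 database profiles is treated as an independent
# 1-in-1.1-million chance of matching the crime-scene DNA by coincidence.
p_match = 1 / 1.1e6
database = 300_000

expected_matches = database * p_match          # ~0.27 innocent matches expected
p_some_match = 1 - (1 - p_match) ** database   # ~0.24 chance of at least one
```

So even if nobody in the database had anything to do with the crime, you’d see a hit in roughly a quarter of such searches — which is why “1 in 1.1 million” by itself misleads the jury.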

## Rush Hour, Jr.

OK, so a black toddler and a Chinese toddler stumble on an international drug-trafficking ring — no, actually, this is a game I just bought for CJ, a kid’s version of Nob Yoshigahara’s classic game Rush Hour.  The object here is to get the small white truck to the edge of the board (the top edge, in the image here.)  The trucks in your way can’t move sideways or turn; they just go forward and back.

You play a captivating game like this and naturally you start abstracting out the underlying math problem.  Play Set enough and you can’t avoid thinking about affine caps.  Rush Hour has more to do with the geometry of configuration spaces; it reminds me of the “disk in a box” problems that people like Persi Diaconis and Matt Kahle work on.

So here’s a question — it doesn’t capture all the features of Rush Hour, but let’s start here.  Let X be the unit square, let c be a parameter between 0 and 1, and let N be a large integer.  Let L be the set of line segments in X which are either horizontal of the form y = i/N or vertical of the form x = i/N, for integers 0 ≤ i ≤ N.  A traffic jam is a choice of a length-c interval in each of the 2N + 2 line segments in L, where we require that these intervals be pairwise disjoint.  The traffic jams naturally form a topological space, which we call T(N,c).  We say an interval (x, i/N),(x+c, i/N) in a traffic jam t is trapped if no traffic jam in the connected component of t contains the interval (0, i/N),(c, i/N).

Questions: For which values of (N,c) is T(N,c) connected?  In particular, is it connected almost always once it’s nonempty?  If not, when does T(N,c) have a “giant component”?  If there’s an interesting range of parameters where T(N,c) is not connected, what proportion of intervals do we expect to be trapped?
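A point of T(N,c) is concrete enough to play with on a computer.  Intervals on parallel lines can never meet, so the disjointness condition only has to be checked at horizontal-vertical crossings.  Here’s a sketch (function names mine, and the sampler is just a crude probe, not an answer to any of the questions above):

```python
import random

def is_traffic_jam(horiz, vert, c, N):
    """horiz[i]: left endpoint of the length-c interval on the line y = i/N;
    vert[j]: bottom endpoint of the interval on the line x = j/N (i, j = 0..N).
    Valid iff no horizontal interval crosses a vertical one."""
    for i in range(N + 1):
        for j in range(N + 1):
            # the two closed intervals intersect iff both meet at (j/N, i/N)
            if horiz[i] <= j / N <= horiz[i] + c and vert[j] <= i / N <= vert[j] + c:
                return False
    return True

def valid_fraction(N, c, trials=1000, seed=0):
    """Rejection-sample uniform endpoint choices and estimate how often they
    form a traffic jam -- a rough gauge of how 'crowded' T(N,c) is."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        horiz = [rng.uniform(0, 1 - c) for _ in range(N + 1)]
        vert = [rng.uniform(0, 1 - c) for _ in range(N + 1)]
        hits += is_traffic_jam(horiz, vert, c, N)
    return hits / trials
```

One could imagine estimating connectivity the same way, by testing whether short random perturbations can walk one sampled jam to another without ever leaving the valid set.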

## The Ask, “The Sentence is a Lonely Place,” fudge

I have an ideological commitment to a somewhat unpopular theory of prose fiction: that features like “character”, “plot,” and “setting” are epiphenomena that arise after the fact, more or less by accident; that writers write sentences, then go back and see what characters, plot, and setting the sentences add up to, then revise the sentences to avoid any obvious incongruities.

But when someone states the claim as starkly, and argues for it as forcefully, as Gary Lutz does in “The Sentence is a Lonely Place,” I start to doubt my own belief in it.  Hard to get the sense of Lutz’s approach from a short excerpt, so here’s a long one:

## If I Had a Hifi

Nada Surf just released a covers album called “If I Had a Hifi” which, strangely and disappointingly, doesn’t include a cover of the Spare Snare song of the same title.  I couldn’t find that track available freely on the web, so here’s the even better “As A Matter of Fact.”   Jan Burnett used to play a guitar with only two strings, and from this device elicited a variety and quality of sound that astounds to this day.

## Motivic puzzle: the moduli space of squarefree polynomials

As I’ve mentioned before, the number of squarefree monic polynomials of degree n in F_q[t] is exactly q^n – q^{n-1} (for n at least 2).
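If you don’t believe the count, it’s quick to verify by brute force for small q and n, using the usual criterion that a polynomial (of degree less than the characteristic, so the derivative doesn’t vanish) is squarefree iff it’s coprime to its derivative.  A stdlib-only sketch, all names mine:

```python
from itertools import product

def poly_gcd(f, g, p):
    """Euclidean algorithm in F_p[x]; polynomials as coefficient lists,
    lowest degree first."""
    def norm(h):
        h = [c % p for c in h]
        while h and h[-1] == 0:
            h.pop()
        return h
    f, g = norm(f), norm(g)
    while g:
        inv = pow(g[-1], p - 2, p)  # inverse of g's leading coefficient (p prime)
        while len(f) >= len(g):
            c, shift = f[-1] * inv % p, len(f) - len(g)
            f = norm([fc - c * (g[i - shift] if i >= shift else 0)
                      for i, fc in enumerate(f)])
            if not f:
                break
        f, g = g, f
    return f

def is_squarefree(f, p):
    """f monic with deg f < p, so f' != 0; squarefree iff gcd(f, f') is constant."""
    deriv = [(i * c) % p for i, c in enumerate(f)][1:]
    return len(poly_gcd(f, deriv, p)) == 1

def count_squarefree_monic(n, p):
    """Brute-force count of monic squarefree polynomials of degree n over F_p."""
    return sum(is_squarefree(list(rest) + [1], p)
               for rest in product(range(p), repeat=n))
```

For instance, `count_squarefree_monic(3, 5)` gives 100, which is 5^3 – 5^2.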

I explained in the earlier post how to interpret this fact in terms of the cohomology of the braid group.  But one can also ask whether this identity has a motivic interpretation.  Namely:  let U be the variety over Q parametrizing monic squarefree polynomials of degree n.  So U is a nice open subvariety of affine n-space.  Now the identity of point-counts above suggests the question:

Question: Is there an identity [U] = [A^n] – [A^{n-1}] in the ring of motives K_0(Var/Q)?

I asked Loeser, who seemed to feel the answer was likely yes, and pointed out to me that one could also ask whether the two classes were identical in the localization K_0(Var/Q)[1/L], where L is the class of A^1.  Are these questions different?  That is, is there any nontrivial kernel in the natural map K_0(Var/Q) -> K_0(Var/Q)[1/L]?  This too is apparently unknown.

Here, I’ll start you off by giving a positive answer in the easy case n=2!  Then the monic polynomials are parametrized by A^2, where (b,c) corresponds to the polynomial x^2 + bx + c.  The non-squarefree locus (i.e. the locus of vanishing of the discriminant) consists of solutions to b^2 – 4c = 0; since c = b^2/4 is determined by b, the projection to the b-coordinate is an isomorphism from this locus to A^1 over Q.  So in this case the identity is indeed correct.
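Point-counting confirms the n=2 case: over F_p (p odd), the discriminant b^2 – 4c vanishes for exactly p of the p^2 pairs (b,c), leaving p^2 – p squarefree quadratics, matching q^n – q^{n-1}.  A one-liner to check (function name mine):

```python
def squarefree_quadratics(p):
    """Count monic x^2 + bx + c over F_p with nonzero discriminant (p odd)."""
    return sum(1 for b in range(p) for c in range(p) if (b * b - 4 * c) % p)
```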

Update:  I totally forgot that Mike Zieve sent me a one-line argument a few months back for the identity |U(F_q)| = q^n – q^{n-1} which is in fact a proof of the motivic identity as well!  Here it is, in my paraphrase.

Write U_e for the subvariety of the space of all monic degree-d polynomials (taking d = n) consisting of those of the form a(x)b(x)^2, with a, b monic, a squarefree, and b of degree e.  Then the non-squarefree locus — the complement of U — is the union of the U_e as e ranges from 1 to d/2.  Note that the factorization as ab^2 is unique; i.e., U_e is naturally identified with {monic squarefree polynomials of degree d-2e} x {monic polynomials of degree e}.

Now let V be the space of monic polynomials of degree d-2, so that [V] = [A^{d-2}].  Let V_e be the subspace of polynomials which factor as c(x)d(x)^2, with c squarefree and d(x) monic of degree e-1.  By uniqueness of the factorization again, V is the union of the V_e as e ranges from 1 to d/2.

Now there is a map from U_e to V_e x A^1 which sends a(x)b(x)^2 to the pair (a(x)((b(x) – b(0))/x)^2, b(0)); since b(x) – b(0) vanishes at 0, the quotient (b(x) – b(0))/x really is a monic polynomial of degree e-1, and one checks that this map is an isomorphism.  Summing over e identifies the non-squarefree locus with V x A^1, whose class is [A^{d-1}]; so [U] = [A^d] – [A^{d-1}], done.

But actually, now that I think of it, Mike’s observation allows you to get the motivic identity even without writing down the map above:  if we write $U^d_e$ for the space of monic polynomials of degree d in stratum e, then $U^d_e = U^{d-2e}_0 \times \mathbf{A}^e$, and then one can easily compute the class of $U^d_0$ by induction.
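In counting terms, the induction is just the recursion you get from the stratification: writing S(d) for the number of monic squarefree polynomials of degree d over F_q (so S(0) = 1 and S(1) = q, since every monic linear polynomial is squarefree), the strata of all monic degree-d polynomials must sum to q^d.  A quick numerical sanity check, in my notation:

```python
def S(d, q):
    """Number of monic squarefree polynomials of degree d over F_q."""
    if d == 0:
        return 1
    if d == 1:
        return q  # every monic linear polynomial is squarefree
    return q**d - q**(d - 1)

def stratified_total(d, q):
    """Sum over strata e: a squarefree of degree d-2e times b monic of degree e."""
    return sum(S(d - 2 * e, q) * q**e for e in range(d // 2 + 1))

# the strata of the monic degree-d polynomials recover all q^d of them:
assert all(stratified_total(d, q) == q**d
           for q in (2, 3, 5, 7) for d in range(9))
```

Running the recursion the other way — peeling off the e ≥ 1 strata from q^d — is exactly the induction that pins down S(d) = q^d – q^{d-1}.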