I interviewed Nate Silver last month at the Commonwealth Club in San Francisco for an MSRI event. Video here.
My colleague Steph Tai at the law school wrote a long, amazing Facebook message to me about the question Cathy and I have been pawing at: when and in what spirit should we be listening to experts? It was too good to be limited to Facebook, so, with her permission, I’m reprinting it below.
Steph deals with these issues because her academic specialty is the legal status of scientific knowledge and scientific evidence. So yes: in a discussion on whether we should listen to experts I am asking you to listen to the opinions of an expert on expertise.
Also, Steph very modestly doesn’t link to her own paper on this stuff until the very bottom of this post. I know you guys don’t always read to the bottom, so I’ve got your link to “Comparing Approaches Toward Governing Scientific Advisory Bodies on Food Safety in the United States and the European Union” right here!
And now, Steph:
Some quick thoughts on this very interesting exchange. What might be helpful, to take everyone out of our own political contexts, perhaps, is to contrast this discussion you’re both having regarding experts and financial models with discussions about experts and climate models, where, it seems, the political dynamics are fairly opposite. There, you have people on the far right making similar claims to Cathy: that climate scientists are to be distrusted because they’re just coming up with scare models because these allegedly biased models are useful to those climate scientists–i.e., to bring money to left-wing causes, to generate grants for more research, etc.
So when you apply the claim that Cathy makes at the end of her post–“If you see someone using a model to make predictions that directly benefit them or lose them money – like a day trader, or a chess player, or someone who literally places a bet on an outcome (unless they place another hidden bet on the opposite outcome) – then you can be sure they are optimizing their model for accuracy as best they can. . . . But if you are witnessing someone creating a model which predicts outcomes that are irrelevant to their immediate bottom-line, then you might want to look into the model yourself.”–I’m not sure you can totally put climate scientists in that former category (of those that directly benefit from the accuracy of their predictions). This is due to the nature of most climate work: most researchers in the area only contribute to one tiny part of the models, rather than produce the entire model themselves (thus, the incentives to avoid inaccuracies are diffuse rather than direct); the “test time” for the models is often relatively far into the future (again, making the incentives more indirect); and the sorts of diffuse reputational gains that an individual climate scientist gets from being part of a team that might partly contribute to an accurate climate model are far less direct than the examples given of day traders and chess players or “someone who literally places a bet on an outcome.”
What that in turn seems to mean is that under Cathy’s approach, climate scientists would be viewed as in the latter category—those creating models that “predict outcomes that are irrelevant to their immediate bottom-line,” and thus deserve people looking “into the model [themselves].” But at least from what I’ve seen, there is *so* much out there in terms of inaccurate and misleading information about climate models (by folks with stakes in the *perception* of those models) that chances are, a lay person’s inquiry into climate models has a high chance of being shaped by forces similar to those with which Cathy is (in my view appropriately) concerned. Which in turn makes me concerned about applying this approach.
Disclaimer: I used to fall under this larger umbrella of climate scientists, though I didn’t work on the climate models themselves, just one small input to them—the global warming potentials of chlorofluorocarbon substitutes. So this contrast is not entirely unemotional for me. That said, now that I’m an academic who studies the *use* of science in legal decisionmaking (and no longer really an academic who studies the impact of chlorofluorocarbon substitutes on climate), I don’t want to be driven by these past personal ties, but they’re still there, so I feel like I should lay them out.
So what’s to be done? I absolutely agree with Cathy’s statement that “when independent people like myself step up to denounce a given statement or theory, it’s not clear to the public who is the expert and who isn’t.” It would seem, from what she says at the end of her essay, that her answer to this “expertise ambiguity” is to get people to look into the model when expertise is unclear.[*] But that in turn raises a whole bunch of questions:
(1) What does it take to “look into the model yourself”? That is, how much understanding does it take? Some sociologists of science suggest that translational “experts”–that is, “experts” who aren’t necessarily producing new information and research, but instead are “expert” enough to communicate stuff to those not trained in the area–can help bridge this divide without requiring everyone to become “experts” themselves. But that can also raise the question of whether these translational experts have hidden agendas in some way. Moreover, one can also raise questions of whether a partial understanding of the model might in some instances be more misleading than not looking into the model at all–examples of that could be the various challenges to evolution based on fairly minor examples that when fully contextualized seem minor but may pop out to someone who is doing a less systematic inquiry.
(2) How does a layperson avoid, in attempting to understand the underlying model, the same manipulations by those with financial stakes in the matter–the same stakes that Cathy recognizes might shape the model itself? Because that happens as well, so that even if one were to try to look into a model themselves, the educational materials it would take to look into that model can be also argued to be developed by those with stakes in the matter. (I think Cathy sort of raises this in a subsequent post about how entire subfields can be regarded as “captured” by particular interests.)
(3) (and to me this is one of the most important questions) Given the high degree of training it takes to understand any of these individual areas of expertise, and given that we encounter so many areas in which this sort of deeper understanding is needed to resolve policy questions, how can any individual actually apply that initial exhortation–to look into the model yourself–in every instance where expertise ambiguity is raised? To me that’s one of the most compelling arguments in favor of deferring to experts to some extent–that lay people (as citizens, as judges, as whatever) simply don’t have time to do the kind of thing that Cathy suggests in every situation where she argues it’s called for. Expert reliance isn’t perfect, sure–but it’s a potentially pragmatic response to an imperfect world with limited time and resources.
Do my thoughts on (3) mean that I think we should blindly defer to experts? Absolutely not. I’m just pointing it out as something that weighs in favor of listening to experts a little more. But that also doesn’t mean that the concerns Cathy raises are unwarranted. My friend Wendy Wagner writes about this in her papers on the production of FDA reports and toxic materials testing, and I find her inquiries quite compelling. P.s. I should also plug a work of hers that seems especially relevant to this conversation. It suggests that the part of Nate Silver’s book that might raise the most concerns (I dunno, because I haven’t read it) is its potential contribution to the vision of models as “truth machines,” rather than understanding that models are just one tool to aid in making decisions, and a tool which must be contextualized (for bias, for meaningfulness, for uncertainty) at that.
So how to address this balance between skepticism and lack of time to do full inquiries into everything? I totally don’t have the answers, though the kind of stuff I explore are procedural ways to address these issues, at least when legal decisions are raised–for example,
* public participation processes (with questions as to both the timing and scope of those processes, the ability and likelihood that these processes are even used, the accessibility of these processes, the susceptibility to “abuse,” the weight of those processes in ultimate decisionmaking)
* scientific ombudsman mechanisms (with questions of how ombudsmen are to be selected, the resources they can use to work with citizen groups, the training of such ombudsmen)
* the formation of independent advisory committees (with questions of the selection of committee members, conflict of interest provisions, the authority accorded to such committees)
* legal case law requiring certain decisionmaking heuristics in the face of scientific uncertainty to avoid too much susceptibility to data manipulation (with questions of the incentives those heuristics create for actual potential funders of scientific research, the ability of judges to apply such heuristics in a consistent manner)
–as well as legal requirements that exacerbate these problems. Anyway, thanks for an interesting back and forth!
[*] I’m not getting into the question of “what makes someone an expert?” here, and instead focus on “how do we make decisions given the ambiguousness of who should be considered experts?” because that’s more relevant to what I study, although I should also point out that philosophers and sociologists of science have been studying this in what’s starting to be called the “third wave” of science, technology, and society studies. There’s a lot of debate about this, and I have a teensy summary of it here (since Jordan says it’s okay for me to plug myself :) (Note: the EFSA advisory committee structure, if anyone cares, has changed since I published this article so that the article characterizations are no longer accurate.)
Cathy goes off on Nate Silver today, calling naive his account of well-meaning people saying false things because they’ve made math mistakes. In Cathy’s view, people say false things because they’re not well-meaning and are trying to screw you — or, sometimes, because they’re well-meaning but their incentives are pointed at something other than accuracy. Read the whole thing, it’s more complicated than this paraphrase suggests.
Cathy, a fan of and participant in mass movements, takes special exception to Silver saying:
This is neither the time nor the place for mass movements — this is the time for expert opinion. Once the experts (and I’m not one of them) have reached some kind of a consensus about what the best course of action is (and they haven’t yet), then figure out who is impeding that action for political or other disingenuous reasons and tackle them — do whatever you can to remove them from the playing field. But we’re not at that stage yet.
…I have less faith in the experts than Nate Silver: I don’t want to trust the very people who got us into this mess, while benefitting from it, to also be in charge of cleaning it up. And, being part of the Occupy movement, I obviously think that this is the time for mass movements.
From my experience working first in finance at the hedge fund D.E. Shaw during the credit crisis and afterwards at the risk firm Riskmetrics, and my subsequent experience working in the internet advertising space (a wild west of unregulated personal information warehousing and sales) my conclusion is simple: Distrust the experts.
I think Cathy’s distrust is warranted, but I think Silver shares it. The central concern of his chapter on weather prediction is the vast difference in accuracy between federal hurricane forecasters, whose only job is to get the hurricane track right, and TV meteorologists, whose very different incentive structure leads them to get the weather wrong on purpose. He’s just as hard on political pundits and their terrible, terrible predictions, which are designed to be interesting, not correct.
Cathy wishes Silver would put more weight on this stuff, and she may be right, but it’s not fair to paint him as a naif who doesn’t know there’s more to life than math. (For my full take on Silver’s book, see my review in the Globe.)
As for experts: I think in many or even most cases deferring to people with extensive domain knowledge is a pretty good default. Maybe this comes from seeing so many preprints by mathematicians, physicists, and economists flushed with confidence that they can do biology, sociology, and literary study (!) better than the biologists, sociologists, or scholars of literature. Domain knowledge matters. Marilyn vos Savant’s opinion about Wiles’s proof of Fermat doesn’t matter.
But what do you do with cases like finance, where the only people with deep domain knowledge are the ones whose incentive structure is socially suboptimal? (Cathy would use saltier language here.) I guess you have to count on mavericks like Cathy, who’ve developed the domain knowledge by working in the financial industry, but who are now separated from the incentives that bind the insiders.
But why do I trust what Cathy says about finance?
Because she’s an expert.
Is Cathy OK with this?
This, from Politico’s Dylan Byers, is infuriating:
Prediction is the name of Silver’s game, the basis for his celebrity. So should Mitt Romney win on Nov. 6, it’s difficult to see how people can continue to put faith in the predictions of someone who has never given that candidate anything higher than a 41 percent chance of winning (way back on June 2) and — one week from the election — gives him a one-in-four chance, even as the polls have him almost neck-and-neck with the incumbent.
Why? Why is it difficult to see that? Does Dylan Byers not know the difference between saying something is unlikely to happen and declaring that it will not happen?
Silver cautions against confusing prediction with prophecy. “If the Giants lead the Redskins 24-21 in the fourth quarter, it’s a close game that either team could win. But it’s also not a “toss-up”: The Giants are favored. It’s the same principle here: Obama is ahead in the polling averages in states like Ohio that would suffice for him to win the Electoral College. Hence, he’s the favorite,” Silver said.
For all the confidence Silver puts in his predictions, he often gives the impression of hedging. Which, given all the variables involved in a presidential election, isn’t surprising. For this reason and others — and this may shock the coffee-drinking NPR types of Seattle, San Francisco and Madison, Wis. — more than a few political pundits and reporters, including some of his own colleagues, believe Silver is highly overrated.
Hey! That’s me! I live in Madison, Wisconsin! I drink coffee! Wait, why was that relevant again?
To sum up: Byers thinks Nate Silver is overrated because he “hedges” — which is to say, he gives an accurate assessment of what’s going on instead of an inaccurate one.
This makes me want to stab my hand with a fork.
I’m happy that Ezra Klein at the Post decided to devote a big chunk of words to explaining just how wrong this viewpoint is, so I don’t have to. You know what, though, I’ll bet Ezra Klein drinks coffee.
Or so I argue in today’s Boston Globe, where I review Silver’s excellent new book. I considered trying to wedge a “The Signal and The Noise” / “The Colour and the Shape” joke in there too, but it was too labored.
Prediction is a fundamentally human activity. Just as a novel is no less an expression of human feeling for being composed on a laptop, the forecasts Silver studies — at least the good ones — are expressions of human thought and belief, no matter how many theorems and algorithms forecasters bring to their aid. The math serves as a check on our human biases, and our insight serves as a check on the computer’s bugs and blind spots. In Silver’s world, math can’t replace or supersede us. Quite the contrary: It is math that allows us to become our wiser selves.
Lots of people are following Nate Silver’s election tracking over at 538, especially his top-line estimate of the probability that Barack Obama will be re-elected in November. Silver has that number at 79.7% today. Sounds like good news for Obama. But it’s hard to get a gut feeling for what that number means. Yeah, it means Obama has a 4 in 5 chance of winning — but since the election isn’t going to happen 5 times, that proportion doesn’t quite engage the intuition.
Here’s one trick I thought of, which ought to work for baseball fans. The Win Probability Inquirer over at Hardball Times will estimate the probability of a baseball team winning a game under any specified set of conditions. Visiting team down by 4 in the 2nd, but has runners on 2nd and 3rd with nobody out? They’ve got a 26% chance of winning. Next batter strikes out? Their chances go down to 22%.
So when do you have a 79.7% chance of winning? If we consider the Obama-Romney race to have started in April or May, when Romney wrapped up the nomination, we’re about 2/3 of the way through — so let’s say the 7th inning. If the visiting team is ahead by 2 runs going into the 7th, they’ve got an 82% chance of winning. That’s pretty close. If you feel the need to tweak the knobs, say the first two batters of the inning fail to reach; with two outs in the top of the 7th, bases empty and a 2-run lead, the visitors win 79.26% of the time, just a half-percent off from Silver’s estimate.
So: Obama 5, Romney 3, top of the 7th. How certain do you feel that Obama wins?
Update: (request from the comments) Silver currently has Obama with an 85.% chance of winning. That’s like: home team up 5-3, visitors batting in the top of the 8th, runner on first with one out.
Nate Silver at 538 looks at the trailing digits of about 5000 poll results from secretive polling outfit Strategic Vision, finds a badly non-uniform distribution, and says this strongly suggests that SV is making up numbers. I’m a fan of Nate’s stuff, both sabermetric and electoral, but I’m not so sure he’s right on this.
Nate’s argument is similar to that of Beber and Scacco’s article on the fraudulence of Iran’s election returns. Humans are bad at picking “random” numbers; so the last digits of human-chosen (i.e. fake) numbers will look less uniform than truly random digits would.
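For readers who want to see what a last-digit test looks like in practice, here is a minimal sketch in Python. It is not Nate's code or Beber and Scacco's method, just a plain chi-square goodness-of-fit statistic against a uniform distribution on the digits 0 through 9. The function name and both data sets are invented for illustration; the only outside fact is the standard table value 16.919, the 5% critical value for a chi-square distribution with 9 degrees of freedom.

```python
from collections import Counter

# 5% critical value for chi-square with 9 degrees of freedom (standard table value).
CHI2_CRIT_DF9_05 = 16.919


def last_digit_chi_square(values):
    """Chi-square statistic comparing last-digit counts to a uniform spread over 0-9.

    Hypothetical helper for illustration; `values` is any iterable of integers.
    """
    digits = [abs(int(v)) % 10 for v in values]
    counts = Counter(digits)
    expected = len(digits) / 10  # under uniformity, each digit gets a tenth of the data
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


# Made-up data: 100 numbers in which every last digit appears exactly 10 times.
uniform_like = list(range(40, 60)) * 5

# Made-up data: 100 numbers whose last digits are almost all 7s and 8s.
skewed = [47, 48, 47, 53, 48, 57, 47, 48, 58, 47] * 10

for name, data in [("uniform_like", uniform_like), ("skewed", skewed)]:
    stat = last_digit_chi_square(data)
    verdict = "looks non-uniform" if stat > CHI2_CRIT_DF9_05 else "consistent with uniform"
    print(f"{name}: chi-square = {stat:.2f} ({verdict})")
```

A large statistic only says the digits are not uniform. It says nothing about why, which is the gap between "these digits look odd" and "these numbers were made up."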
There are at least three ways Nate’s case is weaker than Beber and Scacco’s.
So I wouldn’t say, as Nate does, that the numbers compiled at 538 “suggest, perhaps strongly, the possibility of fraud.”
Update (27 Sep): More from Nate on the Strategic Vision digits. Here he directly compares the digits from Strategic Vision to digits gathered by the same protocol from Quinnipiac. To my eye, they certainly look different. I think this strengthens his case. If he ran the same procedure for five other national pollsters, and the other five all looked like Quinnipiac, I think we’d be in the position of saying “There is good evidence that there’s a methodological difference between SV and other pollsters which has an effect on the distribution of terminal digits.” But it’s a long way from there to “The methodological difference is that SV makes stuff up.”
On the other hand, Nate remarks that the deviation of the Quinnipiac digits from uniformity is consistent with Benford’s Law. This is a terrible thing to remark. Benford’s law applies to the leading digit, not the last one. The fact that Nate would even bring it up in this context makes me feel a little shaky about the rest of his computations.
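To make the leading-digit point concrete, here is a small Python sketch of what Benford's law actually says: the probability that the leading digit is d is log10(1 + 1/d), for d from 1 to 9. The function name is my own, and nothing here uses the Quinnipiac or Strategic Vision data.

```python
import math


def benford_leading_digit_prob(d):
    """Benford's law probability that the leading (first) digit equals d, for d in 1..9."""
    if not 1 <= d <= 9:
        raise ValueError("Benford's law for leading digits is defined for d = 1..9")
    return math.log10(1 + 1 / d)


probs = {d: benford_leading_digit_prob(d) for d in range(1, 10)}

for d, p in probs.items():
    print(f"leading digit {d}: {p:.3f}")
```

The law is stated for digits 1 through 9 because a leading digit is never 0, and it puts about 30% of the weight on a leading 1. That is a statement about first digits, so it is not the right yardstick for the terminal digits of poll results.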
Also, there’s a good post about this on Pollster by Mark Blumenthal, whose priors about polling firms are far more reliable than mine.