Pandemic blog 23: why one published research finding is misleading

I really like John Ioannidis: his famous 2005 article “Why Most Published Research Findings are False” probably did more than any other paper to draw attention to the problems with blind use of p-value certification in medicine.

But he has a preprint up on medrxiv today that is really poorly done, so much so that it made me mad, and when I get mad, I blog.

Ioannidis has been saying for months that the COVID-19 pandemic, while bad, is not as bad as people think. Obviously this is true for some value of “people.” And I think he is right that the infection fatality rate, or IFR, is in most places not going to be as high as the 0.9% figure the March 16 Imperial College model used as an estimate. But Ioannidis has a much stronger claim; he thinks the IFR, in general, is going to be about 1 or 2 in a thousand, and in order to make that case, he has written a paper about twelve studies which show a high prevalence of antibodies in populations where not very many people have died. High prevalence of infection + few deaths = low IFR.

I think I am especially irritated with this paper because I agree that the IFR now looks lower than it looked two months ago, and I think it’s important to have good big-picture analysis to back that intuition up — and this isn’t it. There’s a lot wrong with this paper but I just want to focus on one thing that jumped out at me as especially wrong, and that is Ioannidis’s treatment of the Netherlands antibody study.

That study found that in blood donors, all ages 18-72 (Ioannidis says <70, not sure why), 2.7% showed immunity. Ioannidis reports this, then makes the following computation. About 15m of the 17m people in the Netherlands are under 70, so this suggests roughly 400,000 people in that age group had been infected, of whom only 344 had died at the time of the study, giving an IFR of a mere 0.09%. Some plague! Ioannidis puts this number in his table and counts it among those of which he writes “Seven of the 12 inferred IFRs are in the range 0.07 to 0.20 (corrected IFR of 0.06 to 0.16) which are similar to IFR values of seasonal influenza.”
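Here is that computation as a quick sketch, using only the rounded figures quoted above. The function and variable names are mine, not anything from the preprint.

```python
# Figures as quoted in the post; the population is rounded to the nearest million.
under_70_population = 15_000_000
seroprevalence = 0.027       # 2.7% of blood donors showed antibodies
deaths_under_70 = 344


def infection_fatality_rate(deaths, population, prevalence):
    """Deaths divided by the estimated number of infections."""
    infections = population * prevalence
    return deaths / infections


infections_under_70 = under_70_population * seroprevalence
ifr_under_70 = infection_fatality_rate(
    deaths_under_70, under_70_population, seroprevalence
)

print(f"estimated infections under 70: {infections_under_70:,.0f}")
print(f"implied IFR under 70: {ifr_under_70:.3%}")
```

With these inputs the infection count comes out near 400,000 and the IFR a bit under 0.09%, matching the figure in the table.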

But of course the one thing we really do know about COVID, in this sea of uncertainty, is that it’s much, much more deadly to old people. The IFR for people under 70 is not going to be a good estimate for the overall IFR.

I hashed out some numbers — it looks to me like, using the original March 16 Imperial College estimates, derived from Wuhan, you would derive an infection fatality rate of about 0.47% among people age 20-70. There are about 10.8m Dutch people in that range (I am taking all this from Wikipedia data on the age distribution of the Netherlands) so if 2.7% of those are infected, that’s about 300,000 infections, and 344 deaths in that group is about 0.11%. Lower than the Imperial estimate! But four times lower, not ten times lower.
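The same arithmetic, restricted to the 20-70 age band, can be sketched like this. The 0.47% figure is my own rough derivation from the Imperial estimates as described above, and the variable names are just for illustration.

```python
# Inputs taken from the paragraph above; all are rough, rounded figures.
population_20_to_70 = 10_800_000   # Dutch residents aged 20-70 (Wikipedia data)
seroprevalence = 0.027             # 2.7% from the blood-donor study
deaths_in_band = 344
imperial_ifr_20_to_70 = 0.0047     # about 0.47%, derived from the March 16 estimates

infections_in_band = population_20_to_70 * seroprevalence
observed_ifr = deaths_in_band / infections_in_band

# How many times lower the observed rate is than the Imperial-derived one.
ratio = imperial_ifr_20_to_70 / observed_ifr

print(f"estimated infections, age 20-70: {infections_in_band:,.0f}")
print(f"observed IFR, age 20-70: {observed_ifr:.3%}")
print(f"Imperial estimate is {ratio:.1f} times the observed IFR")
```

The ratio lands close to four, which is where "four times lower, not ten times lower" comes from.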

What about the overall IFR? That, after all, is what Ioannidis’s paper is about. If you count the old people who died, the toll as of April 15 wasn’t 344, it was over 3100. If the 2.7% prevalence rate were accurate as a population-wide estimate, the total number of infected people would be about 460,000, for an IFR of 0.67%, more than seven times higher than the figure Ioannidis reports (though still a little lower than the 0.9% figure in the Imperial paper.) Now we definitely don’t know that the infection rate among old Dutch people is the same as it is in the overall population! But even if you suppose that every single person over 70 in the country is infected, that gets you to a little over 2 million infections, and an IFR of 0.15%. In other words, the number reported by Ioannidis is substantially lower than the theoretical minimum the IFR could actually be. And of course, it’s not the case that everybody over 70 already had COVID-19 in the middle of April. (For one thing, that would make the IFR for over-70s only slightly higher than the IFR overall, which contradicts the one thing about COVID we really know!)
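A short sketch of the whole-population version, again using only the rounded numbers in this post. The 2.1 million used for the extreme scenario is an assumed stand-in for "a little over 2 million", not an exact count.

```python
# Rounded inputs from the post.
total_population = 17_000_000
seroprevalence = 0.027            # assumes the donor rate holds population-wide
deaths_all_ages = 3_100           # "over 3100" as of April 15
reported_ifr = 0.0009             # the 0.09% figure in Ioannidis's table

# Overall IFR if 2.7% of everyone had been infected.
infections_all_ages = total_population * seroprevalence
overall_ifr = deaths_all_ages / infections_all_ages
times_higher = overall_ifr / reported_ifr

# Extreme scenario: every person over 70 infected as well.
# 2.1 million is an assumed value for "a little over 2 million".
max_infections = 2_100_000
floor_ifr = deaths_all_ages / max_infections

print(f"estimated infections, all ages: {infections_all_ages:,.0f}")
print(f"overall IFR: {overall_ifr:.2%}")
print(f"ratio to reported IFR: {times_higher:.1f}")
print(f"IFR floor in the extreme scenario: {floor_ifr:.2%}")
```

Even the floor from the extreme scenario sits well above the reported 0.09%.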

There’s no fraud here, I hasten to say. Ioannidis tells you exactly what he’s doing. But he’s doing the wrong thing.


11 thoughts on “Pandemic blog 23: why one published research finding is misleading”

  1. In addition: The Dutch themselves are saying that they underestimate their COVID-19 deaths. Their excess mortality is higher than their reported COVID-19 deaths.

  2. I think he has done a similar thing with the Scottish death figures. He has 47 which, according to the spreadsheet here, is the number of under-65 deaths up until March 30th:

    The all-ages deaths at that same point seems to be 354. He’s effectively calculating the under-65 or under-70 IFR for some of the data sets, which is fine, but like you say it shouldn’t be presented as the whole population IFR, which will be significantly higher.

  3. Jack Syage says:

    I was about to comment on the Ioannidis paper on MedRxiv and saw the above comment. I agree this paper is flawed and here are my reasons. Most of these studies he uses were conducted before the death rate peak. Deaths represent infections from about 2.5 weeks before whereas antibody measurements are current. So cases have grown by multiples by then. As a check I see the following trend in Table 3: the earliest dates show the lowest IFRs (since growing cases run way ahead of deaths) and latest dates show the highest IFRs (as cases are subsiding and catching up to deaths). So I plotted this and there is a distinct upward dependence for IFR vs. date with a Pearson coeff of 0.61 (pretty strong) and a 2-tailed, paired t-value of a staggering p = 0.00003.

    I suspect continued antibody tests for populations well past the death rate peak will start to converge on a higher value of IFR, e.g., about 1%.

    I have been doing modeling and interested in views: please check out:



  4. Jack Syage says:

    Call this a sanity check refuting the Ioannidis claim or indicating good news. If IFR = 0.1% in say the U.S. then 90,000 deaths means 90M cases. If we assume a final death total of 150,000, then that means nearly 50% of the population will have been infected and would have herd immunity. This would be certainly true for the northeast. Imagine the tourist campaign for NY, Boston, etc. saying “come here to visit, we’re safe!”

  5. anon says:

    When Ioannidis made similar claims in an earlier paper (or earlier version of this paper?), many people pointed out that if IFR really was as low as he claimed, that would mean that everyone in New York had already been infected. It seems that Ioannidis didn’t listen.

  6. 1214876543114786327167890987654 says:

    > And I think he is right that the infection fatality rate, or IFR, is in most places not going to be as high as the 0.9% figure the March 16 Imperial College model used as an estimate.

    > I agree that the IFR now looks lower than it looked two months ago

    I’m really interested in seeing well-argued counterpoints or solid studies that would suggest mortality lower than 1 %. Seroprevalence studies in general do not count, unless they have very convincing random sampling, large N and neutralisation assays for positives.

  7. Rai.Zure says:

    There are now plenty of country-wide prevalence studies from Europe: Austria, Czech Republic, Spain.. and this data unfortunately shows the “population wide” IFR is more likely to be around 1-2% than “similar to a flu”.

    Also.. wondering why nobody emphasizes that flu mortality estimates are almost always CFR, not IFR? Almost anyone trying to compare flu with COVID-19 is comparing apples with oranges?

    Another data point: Taiwan long had very few deaths, but the count recently picked up. Although the case numbers are (luckily!) ridiculously low they allow an interesting estimate as most cases were either imported or on navy ships. Here 349 imported cases (mostly foreign workers) correspond to 4 dead – an IFR of roughly 1.2%. Even worse, the dead themselves were in their 40s-70s and obviously biased to “healthier than average” as they were fit for overseas travel and work.

  8. […] commercial buildings could put returning employees at risk of Legionnaires’ and other illnesses. Why one published research finding is misleading Washing your hands is better than disposable gloves for preventing COVID-19 spread, Public Health […]

  9. […] who favors this position is John Ioannidis, but I find him unconvincing for reasons laid out here, here, and, quite loosely, […]

  10. If John Ioannidis is right, and the IFR is 0.1%, even for just the under 70 set, then we are not only rapidly approaching herd immunity, we are also experiencing thermodynamic miracles all over the USA – in Florida, Texas, (especially) Arizona, and the list goes on.

    tldr: There is no need to even look at his paper to figure out that he is just plain wrong.

  11. moregreens007 says:

    So, basically:

    1. We can reasonably say the IFR of 18-72 people in Netherlands is 0.09%. Yes?

    2. We don’t know overall IFR.

    3. We don’t know IFR of over-70.

    4. We don’t know IFR of 0-18.

    Is this accurate? Presuming it is, in the real world we will never have full information. As a public health or national leader if I were to make a decision based on the above data, and if #2-4 were unavailable, I would find Ioannidis’ study immensely helpful:

    a) I could speculate, reasonably, based on data from around the world about the virus’ behavior in 0-14 age group, that #4 kids group above would have an IFR even lower than 0.09%. Right? This means schools can and should reopen.

    b) I could speculate, again reasonably and as you also suggest, that the IFR of elderly (although 70+ people in Netherlands are fairly healthy, it’s a bicycling and walking culture, but let’s let that be) will be much higher than 0.09%. Right? This means I’d make a special effort to protect those elderly, although I’d give them an option to ‘play with danger’ if they chose to do so. So I wouldn’t mandate isolation.

    c) About the population we do know about, 18-72, I’d be far more bold to allow them to do their thing. Perhaps make it even tighter and say 18-65 (working age).

    As such, this is a useful piece of work. IFR is notoriously impractical to calculate in the real world because it’d require constant random sero testing of all population. So we use informed shorthand.
