What is the Probability of the Lab Leak Hypothesis?

Originally forbidden from public discussion for all of 2020, the lab leak hypothesis as the origin of SARS-CoV-2 (SARS2) has recently gained intellectual support, and President Biden just publicly announced he was investigating the hypothesis for months. Despite no direct evidence appearing for either side, it suddenly seems like it’s closer to 50/50 now, after a year of public discourse implying something like 1% lab leak/99% natural origin. One thing we’re all wondering is, how likely is it that this was a lab leak?

I am not a virologist. My day job is quant trading, and I regularly think about probabilities of one-off events, updating them in real time. So my methods might be alien, but with that, let’s guess the probability of lab leak vs natural origin. I attempted the more reliable solutions first, but given the lack of evidence for either side, most of this is guesswork.

I consider lab leak to include accidental escape of either engineered or nonengineered virus.

My current belief is 77% lab leak, 23% natural. All of the text below is not really a proof, just things to think about when forming your own opinion.

Betting Markets

The first thing that comes to mind is: does there exist a prediction market? Prediction markets in general can be accurate when the outcomes are well defined and people can bet a lot of money on things—any inaccuracy can be “arbitraged” away since there is a financial incentive to trade for people who are good at guessing. For example, in the 2020 US presidential election, after briefly swinging towards Trump after Miami-Dade results, prediction markets were all heavily favoring Biden by the early morning after election day, at a time when it looked like Trump held a significant lead in most of the remaining states. While states like Pennsylvania and Georgia took multiple days for the counter to swing Democrat, the betting markets already knew that would happen less than 12 hours after polls closed. The point is, the market already corrected for the mail-in ballot effect, because if it didn’t, someone who correctly thought about the effect could make a lot of money trading in it, thus pushing the market towards a more accurate truth.

However, as far as I can tell, there is no liquid market for the lab leak hypothesis. Even if there were, the definition of such a contract would matter a lot—maybe in the case that a small group of people in China really did know there was a lab leak, the probability of conceding this could be very low, especially as they would have had ample time in the last 17 months to destroy evidence. So even a contract that says “The WHO thinks SARS2 was a lab leak by 2023” wouldn’t necessarily indicate the true probability of a lab leak, because you might expect there to be bias in which result would be more willingly revealed to the public—a bias which people trading the contract would certainly be aware of.

Even without an official prediction market, some people make bets openly on the internet. So one thing I could do is draw up a contract, put out a market, and solicit people to trade with me. But I think the issue is that the contract is too hard to define here. A fixed event like “Will the Democratic Party win the 2024 presidential election” has widespread agreement about how it resolves, whereas a contract like “Was COVID-19 a lab leak” is very subject to opinion and ambiguity. You could try to define the contract more precisely, e.g.

  • “The WHO thinks SARS2 was a lab leak by 2022”, or
  • “I will ask Matt Yglesias on Dec 31, 2022 if he thought it was more likely (no tie) a lab leak or natural, what will he say”, or
  • “Will the Chinese Communist Party announce a statement agreeing it was a lab leak by 2022”

All 3 of these should have different values. The Chinese government admitting a lab leak is probably the least likely, so the contracts would trade at different betting odds, even though the origin of SARS2 was settled a year and a half ago and nothing about how we frame the question now will change what actually happened.

Between framing ambiguity, the taxes in prop betting, and worrying about counterparty risk, I would start with a very wide market and it wouldn’t be very helpful in answering the original question. So I put betting markets aside for the moment.

Ask a Superforecaster

The next thing is to find people who are really good at predicting things, and see what they think. The only reputable superforecaster I’ve found who actually publicly gave a straight probability is Nate Silver, who on 5/23/2021 assigned 60% chance to lab leak, and 40% to natural origin.

Note that in the tweet, Silver previously had 40% lab leak, and updated to 60% based on the recent WSJ article documenting that three Wuhan Institute of Virology researchers became sick enough to seek hospital care in November 2019.

I am probably missing many other public guesses. I will consider them if pointed out to me.

Poll the Virologists

For a number of reasons, I think polling virologists is futile for uncovering the origins of SARS2.

  • This is a highly politicized topic, so there are many selection biases to correct for, which would seem very difficult and subjective to do:
    • People with which political leanings are more likely to become virologists?
    • What kinds of people are the ones who would speak out publicly, especially if it could cost them their job, i.e. canceled for political leanings?
    • Is there a direct occupational conflict of interest? I.e., a virologist guesses that if they spoke favoring lab leak, more public scrutiny and distrust would occur for labs, possibly defunding them, and causing said virologist to lose their job. So they decide to stay quiet.
  • Are virologists able to accurately quantify beliefs in terms of probabilities?
    • In particular, thinking about the origins of SARS2 without hard evidence requires knowing about priors and Bayesian updating—I’m not sure virologists as a whole are the most equipped on how to think of weighting priors and evidence properly, which superforecasters are better equipped to do.
  • I haven’t really seen anyone throw out numbers, just noncommittal words like “likely” or “extremely unlikely”.
  • Self-organized groups that make a claim tend to consist of people with similar beliefs, and could be a vocal minority.
  • Political correctness aspects of the debate—e.g. things like “In terms of policy, it doesn’t matter which origin hypothesis is correct, so we should publicly support zoonotic hypothesis because that would lead to less anti-Asian-American racism.”
  • Many virologists have already been discredited: they came out publicly in early 2020 to argue strongly that the evidence was overwhelmingly in favor of natural origin, yet the mainstream belief now is that the two hypotheses are close enough in probability that the answer is very unclear.

There are definitely many cases, probably the vast majority of situations, in which to trust experts. If you wanted to find out whether ingesting 1mg of cyanide is fatal, toxicologists would probably have very good published answers. That question seems very easy to settle in a repeated experiment and doesn’t involve much political or selection bias, whereas the situation with SARS2 is filled with both. With that, we turn to more handwavy methods.

Debiasing and the Chinese Government

It would take many books to spell out the details of all the human cognitive biases that could hamper our thinking about the origin of SARS2. I wanted to point out the main, super-important selection bias.

The main bias is that the Chinese government, which effectively controls the investigation into the origins of SARS2, has a huge incentive to cover up a hypothetical lab leak, and has precedent of doing major cover-ups in the past, though not necessarily related to biolabs that we know of. I feel like this might be obvious to some of you, but really, this is a huge effect. Especially if you’ve only lived in Western countries, you might not realize to what extent China censors facts in a 1984-esque way.

Quick history lesson—The United States has done many things in the past that no American alive today should be proud of. They might not be the things most emphasized—Native American genocide, slavery, chemical weapons, shootings of student protestors—however, as we live in a democracy with freedom of the press, you can easily find online articles (wiki pages are linked) or books or documentaries or movies about them. It’s very easy to argue that the bad things done by the modern Chinese government were much worse—the “Great Leap Forward” which killed between 15 million and 55 million people, or the Tiananmen Square Massacre where hundreds to thousands of student protestors were killed by the government. In China, events like these are censored—you can’t search them.

Think about the worst thing an American presidency did in 4 years, and compare it to this (emphasis mine):

The Great Leap Forward (Second Five Year Plan) of the People’s Republic of China (PRC) was an economic and social campaign led by the Chinese Communist Party (CCP) from 1958 to 1962. Chairman Mao Zedong launched the campaign to reconstruct the country from an agrarian economy into a communist society through the formation of people’s communes. Mao decreed increased efforts to multiply grain yields and bring industry to the countryside. Local officials were fearful of Anti-Rightist Campaigns and competed to fulfill or over-fulfill quotas based on Mao’s exaggerated claims, collecting “surpluses” that in fact did not exist and leaving farmers to starve. Higher officials did not dare to report the economic disaster caused by these policies, and national officials, blaming bad weather for the decline in food output, took little or no action. The Great Leap resulted in tens of millions of deaths, with estimates ranging between 15 and 55 million deaths, making the Great Chinese Famine the largest famine in human history

This was a horrific disaster for which no one took responsibility. For 20 years afterwards, the Chinese Communist Party’s official terminology for this period was the “Three Years of Natural Disasters.”

Sound familiar?

This is all to say, when China says something about SARS2 that makes itself look better, e.g. discrediting the lab leak hypothesis, it doesn’t update my belief much at all. So which pieces of evidence should change our belief?

Bayesian Reasoning

Because the evidence surrounding the question is so scarce and opaque, we can’t use presumptions as in a court of law, like “we should assume 99.9% natural origin unless we find proof beyond reasonable doubt of lab leak,” or “we should assume 99.9% lab leak until we find proof of natural origin,” and then spend time debating which side has the burden of proof. There is just not enough evidence out there for either side. If we were trying to predict whether the Democrats or Republicans win the 2024 presidential election, it would be very weird to say, “There is currently a sitting Democratic president, so you need to convince me beyond reasonable doubt that a Republican will win for me to believe that Republicans have any chance of winning.” Instead, we can use polls, historical data, and mathematical models to make educated guesses. The approach for guessing SARS2 origin should look much more like the latter.

We need to use Bayes’ rule: form a prior, and then update based on scarce evidence. The prior is some baseline probability for a novel virus being natural or manmade. But guessing the prior for SARS2 is exceedingly hard.

For many questions, a prior is easily formed by checking the base rate of events. For example, suppose you heard that a friend passed away last weekend, and the last thing you know is they drove off on a weekend trip to camp in the wilderness, and you later found out there was a thunderstorm in the region where they camped. What would you think is more likely: they died of a car accident, or they died from being struck by lightning? You might think that wilderness + thunderstorm = fairly likely to be from lightning. You’d be very wrong. In actuality, you need to compare the base rates: 38,000 Americans die every year from auto-related accidents, while 49 die from being struck by lightning. So even knowing they might have been in an area with a thunderstorm, it’s probably still 99% likely that they died in a car accident on the way to the camp (that is, conditioned on their having died from either a car accident or lightning). The prior is roughly that car accident death is 775x more likely than lightning (38000/49 ≈ 775). Even if you add the evidence that your friend is a very, very careful driver who is 10x less likely to get in car accidents than the average person, the updated odds are that a car accident is 77x more likely than lightning, as opposed to 775x, so it is still overwhelmingly likely to have been a car accident.
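The odds arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the helper name `posterior_from_odds` is made up here, and treating "10x more careful" as a likelihood ratio of 1/10 is an assumption that mirrors the text.

```python
def posterior_from_odds(prior_odds, likelihood_ratio=1.0):
    """Return (odds, probability) for hypothesis A over B after one update.

    prior_odds is P(A) / P(B); likelihood_ratio is P(evidence | A) / P(evidence | B).
    """
    odds = prior_odds * likelihood_ratio
    return odds, odds / (1.0 + odds)


# Base rates quoted in the text (annual US deaths).
car_deaths = 38_000
lightning_deaths = 49
prior_odds = car_deaths / lightning_deaths

# No extra evidence: the base rate alone.
base_odds, base_prob = posterior_from_odds(prior_odds)

# Assumed evidence: a careful driver is 10x less likely to crash.
careful_odds, careful_prob = posterior_from_odds(prior_odds, likelihood_ratio=1 / 10)

print(f"base rate:      {base_odds:.1f} to 1, P(car accident) = {base_prob:.4f}")
print(f"careful driver: {careful_odds:.1f} to 1, P(car accident) = {careful_prob:.4f}")
```

Working in odds rather than probabilities keeps each new piece of evidence a single multiplication, which is why the careful-driver adjustment barely dents the conclusion.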

We want to figure out where a novel pathogen that has never spread en masse in human populations before came from. Unfortunately, we can’t really construct a base rate from looking at historical data, because as far as we know, there are zero known examples of lab leaks of novel pathogens!

               | Known pathogen                                         | Novel pathogen
Natural origin | Viruses causing the common cold                        | 1918 Spanish flu
               | Yersinia pestis (plague)                               | 1976 Ebola virus
               | E. coli                                                | 2003 SARS1
               | Salmonella                                             | 2012 MERS
Lab leak       | 1979 anthrax leak in the Soviet Union                  | None publicly known
               | 1978 smallpox leak in the UK                           |
               | 2003-2004 SARS1 leaks in Taiwan, Singapore, and China  |
               | 2019 brucellosis leak in China                         |

Why are there no known examples? It could be that it has never happened before. But it could also be from selection bias: (1) labs housing pathogens are relatively new in human history, and especially new is our ability to engineer new pathogens, called “gain of function”, and (2) anyone in charge of such a lab during a leak has a strong incentive to deny it—they have a lot to lose from having their lab shut down, and the cost of cover-up is pretty low: a new random disease came out, and you clean up the lab, no one would be any the wiser as to where it came from.

Strictly speaking, our main question concerns only the right side of the table. But thinking about the left side is necessary to establish a base rate. We know that labs around the world store pathogens ranging from very deadly (e.g., ebola) to relatively harmless (e.g., the common cold). Some thoughts:

  • If you (you being an American living in the continental US in all these examples) get the common cold, what was the chance it came from natural spread vs a lab leak?
    • 99.9% chance it was from natural spread, given that such a large percentage of the population catches the common cold. Though arguably, there might be some variant of the common cold that escaped and has a higher infectivity than a typical common cold, and maybe most of the cold cases now are from a lab escape. Maybe.
  • If you get ebola, what is the chance of natural origin vs lab leak?
    • Maybe 75% lab leak? It is much closer to 50-50 than the previous example, as ebola doesn’t often occur in the US, but there are occasionally outbreaks elsewhere. As the US does humanitarian aid and has lots of outgoing tourism, chances aren’t zero that it spread naturally. At the same time, we don’t know how many labs hold ebola from which it could have escaped. It’s a tough question because we need to compare two probabilities that are very small. I honestly don’t know. And my guess is the answer for SARS2 is similar to this question.
    • Given this one is pretty close, it also depends a lot on other factors. For example, do you live right next to a biosafety-level-4 lab? If you did, then I’d guess it’s more like 95% lab leak. Things have escaped US labs, and I think it would be noncontroversial to claim that Chinese labs are lower in carefulness and safety.
  • If you get smallpox, an eradicated disease, what is the chance of natural origin vs lab leak?
    • 99% chance it was a lab leak. Zero cases have been reported to have naturally occurred in the US in decades, though leaks around the world do happen sometimes.

So which of these would a novel coronavirus be similar to? Well, none of them, since these are all known viruses that scientists stored in a lab. As SARS2 is a novel virus, we are guessing both the chance that a lab in Wuhan such as the Wuhan Institute of Virology could leak a pathogen, and the chance that the lab either brought in novel bat coronaviruses for study or engineered a more infectious virus. I mostly believe that one of the latter was likely to have happened, based on the discussion in Nicholas Wade’s recent article on the origins of Covid.

Based on my intuitions on probabilities of lab leaks for the 3 cases above (common cold, ebola, smallpox) and understanding of Wuhan lab involvement in storing and engineering coronaviruses, I assign a prior of 67% lab leak vs 33% natural origin.

Now we need to update the prior on the few pieces of evidence we have:

  • “Three researchers from China’s Wuhan Institute of Virology became sick enough in November 2019 that they sought hospital care”, from the WSJ.
    • This is an obvious update towards lab leak, but by how much? Possibly by a lot, but I don’t know the base rate of how many people typically get sick from the WIV in a random November. If the base rate is 0 or 1, then we should probably update a lot, maybe by 22%? If the base rate is 2 or more, then we should update by almost zero. I’ll guess 50-50 between these two situations, such that we should update towards lab leak by 11%.
      • Math: I’d guess that in the case where the base rate of researchers getting sick is 0 or 1, the ratio P(3 WIV researchers sick | lab leak) to P(3 WIV researchers sick | natural origin) is 4 to 1. Then use Bayes’ rule to get P(lab leak | 3 WIV researchers sick) = 0.89. [Since my prior was 67% to 33%, we compute 0.67 * 4 vs 0.33 * 1, or 2.68 vs 0.33. Then 2.68/(2.68+0.33)=0.89.] Going from 67% to 89% is a +22% update. Since I only believe this story halfway (50% that the base rate of number of researchers going to the hospital in November is 2 or higher), I apply only half the update, or +11%.
      • Note that if my prior were far less confident of lab leak, say 30% lab leak/70% natural origin, the revelation that 3 researchers fell ill should still update my belief by a lot! Using Bayes’ rule results in 63% lab leak, an update of +33%, and believing only half the update means +16.5%, which is still a big update, even bigger than the one from my actual prior. If I thought 30% lab leak/70% natural origin before, I should now think 46.5% lab leak/53.5% natural origin.
      • Note the 4-to-1 ratio of P(3 WIV researchers sick | lab leak) to P(3 WIV researchers sick | natural origin) is a bit arbitrary. I could see an argument for this being lower, like 2:1. I could also easily see this number being higher, maybe 10:1! In the latter case, the Bayesian update is really large—the output of Bayes’ rule on the 30%/70% prior is now 81%(!), for an update of +51%. Believing only half of that, we get that the 30%/70% now becomes 56%/44%.
  • In Jan 2020, China began draconian lockdowns of major cities (“draconian” just means it was far more strict than anything we did in the West).
    • I’m very uncertain about this claim, but I think it’s a small update towards believing lab leak. This is because P(super lockdown | Chinese government knows it was a lab leak where “gain of function” was involved) > P(super lockdown | Chinese government is not sure what happened). A tiny +1% update towards lab leak?
  • China repeatedly claims its internal investigations suggest no evidence of lab leak.
    • From what I argued before, this updates my belief by almost 0.
  • WHO investigation says lab leak is extremely unlikely.
    • Tiny update towards natural origin, maybe 1%? Though again, since China effectively controls investigation into the lab, and this investigation took place months after the fact, I don’t put much weight on it.
  • Chinese vaccines are not as effective as Pfizer/Moderna ones.
    • Tiny update (1%) towards natural origin. If Chinese virologists were engineering a novel virus, maybe they had intricate knowledge of the virus and would know how to inoculate against it? Though it also seems very plausible that the US just has far superior R&D on vaccine development especially in mRNA technology. This signifies that Chinese scientists are not on the cutting edge of understanding viruses compared to their US counterparts, though you don’t need to be on the cutting edge to accidentally release a virus.
  • Variants are now responsible for most cases
    • I don’t know enough about viruses to make a claim as to which direction this should go, but I’m guessing the net effect is small enough to not matter.

In total, these are a +10% adjustment (+11% from WSJ article on researchers becoming sick, +1% from lockdowns, -1% from WHO investigation, -1% from vaccines). So from the evidence mentioned, my belief in lab leak went from 67% to 77%.
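For readers who want to plug in their own prior or likelihood ratio, here is a minimal Python sketch of the update arithmetic above. The function names are invented for illustration, and the "believe only half the update" step is modeled as a simple 50/50 weighting between the full Bayesian posterior and no update at all, which is how the text describes it.

```python
def bayes_posterior(prior, likelihood_ratio):
    """P(H | E), given P(H) and the ratio P(E | H) / P(E | not H)."""
    weighted = prior * likelihood_ratio
    return weighted / (weighted + (1.0 - prior))


def half_believed_shift(prior, likelihood_ratio, credence=0.5):
    """Shift in P(H) when the evidence is trusted only with probability `credence`."""
    return credence * (bayes_posterior(prior, likelihood_ratio) - prior)


# Scenarios from the text: (prior P(lab leak), likelihood ratio).
scenarios = [(0.67, 4), (0.30, 4), (0.30, 10)]
for prior, ratio in scenarios:
    full = bayes_posterior(prior, ratio)
    shift = half_believed_shift(prior, ratio)
    print(f"prior {prior:.0%}, ratio {ratio}:1 -> full posterior {full:.1%}, "
          f"half-believed estimate {prior + shift:.1%}")

# Additive tally in percentage points, mirroring the running total in the text.
wsj_points = round(100 * half_believed_shift(0.67, 4))
adjustments = {"WSJ report": wsj_points, "lockdowns": 1, "WHO investigation": -1, "vaccines": -1}
final_estimate = 67 + sum(adjustments.values())
print(f"total adjustment {sum(adjustments.values()):+d} points -> {final_estimate}% lab leak")
```

Any small differences from the figures quoted in the text come from rounding intermediate values. Note that the final tally simply adds percentage points, as the text does; a stricter treatment would multiply likelihood ratios in odds form instead.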

Meta Thoughts

I ended up with 77% lab leak, 23% natural origin. It’s almost certain at this point that my number is higher than most people’s estimates. The main things that account for this discrepancy are:

  • My prior is based on thought experiments on existing pathogens (ebola, smallpox)
  • I more heavily discount what China, WHO, and virologists say.
  • I update more strongly based on the evidence that 3 researchers fell ill in Nov 2019.

77% is just my current belief, and it will update as new evidence comes in.

I’m interested in seeing what other people’s probabilities are.

Beware Ideas that Are Beneficially Selected

The Murderous Tribe

Imagine two tribes of hunter-gatherers, 50 people each. Tribe One believes that killing is always wrong, while Tribe Two thinks killing is okay–so long as it’s a member of another tribe. During a harsh winter with low food levels, the two tribes venture outside their usual zones and run into each other. Tribe Two kills half of Tribe One and takes some of their food.

Now Tribe One has only 25 people, while Tribe Two still has 50. So the percentage of total population that believes killing is justified went from 50% to 67% (50 out of 75 is 67%).

Okay, well maybe that’s kind of misleading. The belief increasing from 50% to 67% wasn’t the result of 17% of people being convinced it was right. It is because the people who didn’t believe it were selected out of the population. Assuming all else equal, both tribes will eventually increase in population until the total population reaches 100 once again, and the end effect will be as if 17 people converted.
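The two-tribe arithmetic can be checked with a short Python sketch. The one assumption added here is that both tribes regrow in proportion to their surviving sizes until the total is back at 100, which is one reading of "all else equal".

```python
pacifists, violent = 50, 50

# Tribe Two kills half of Tribe One.
pacifists_after = pacifists // 2
share_violent = violent / (violent + pacifists_after)  # 50 out of 75

# Assumed: proportional regrowth back to the original total of 100.
total = pacifists + violent
violent_regrown = share_violent * total
implied_converts = violent_regrown - violent

print(f"share holding the violent belief: {share_violent:.0%}")
print(f"equivalent number of 'converts': {implied_converts:.1f}")
```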

What is going on is that being willing to kill members of other tribes is an evolutionarily beneficial idea.

In our example, we didn’t need to start with two tribes. There could have been 1000 tribes: 50% pacifist, 50% violent. What happens when they repeatedly interact with each other in the long run? Most of the population becomes violent.
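A toy simulation makes the many-tribe claim concrete. Everything about the model below is an assumption made for illustration, not something specified in the text: tribes are paired at random each round, a violent tribe halves any pacifist tribe it meets, same-type encounters change nothing, and all tribes then regrow proportionally back to the original total population.

```python
import random


def simulate(num_tribes=1000, tribe_size=50.0, rounds=20, seed=0):
    """Return the share of total population in violent tribes after each round."""
    rng = random.Random(seed)
    # First half of the tribes are pacifist, second half violent.
    violent = [i >= num_tribes // 2 for i in range(num_tribes)]
    pop = [tribe_size] * num_tribes
    capacity = tribe_size * num_tribes
    shares = []
    for _ in range(rounds):
        order = list(range(num_tribes))
        rng.shuffle(order)
        # Pair tribes off at random; each pair runs into each other once.
        for a, b in zip(order[::2], order[1::2]):
            if violent[a] != violent[b]:
                victim = b if violent[a] else a
                pop[victim] /= 2  # the pacifist tribe loses half its members
        # Assumed: proportional regrowth back to the original total.
        scale = capacity / sum(pop)
        pop = [p * scale for p in pop]
        violent_pop = sum(p for p, v in zip(pop, violent) if v)
        shares.append(violent_pop / capacity)
    return shares


shares = simulate()
print(f"violent share after round 1:  {shares[0]:.1%}")
print(f"violent share after round 20: {shares[-1]:.1%}")
```

Under these assumptions no individual ever changes their mind, yet the violent belief's share of the population only goes up. Different interaction rules (for example, violent tribes also fighting each other) would change the speed of the shift.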

Biological organisms aren’t the only things that evolve via natural selection. Ideas do too.

Propagation of Ideas by Natural Selection

We’d like to think our beliefs are correct. Nearly 100% of people used to believe the Sun went around the Earth. Now we mostly think the opposite. “Earth orbits the Sun” is a factually correct idea that seems to have spread on the merit of its accuracy.

Being correct is one way that an idea could gain traction. Having traits that help it become naturally selected is another. “We should care about our own tribe more than others” does not seem like a factually correct belief, or at least not an obviously correct one. It is popular because it was an evolutionarily advantageous belief: when there were collisions between believers and nonbelievers, those who did believe it were inherently more likely to gain from the collision.

Evolutionarily Advantaged Ideas

Here are four ways to increase the % of population that has a particular belief X:

  • Decrease the population of people who don’t believe X
  • Increase the population of people who believe X
  • Convince people who don’t believe X to believe X
  • Deter new people from believing alternatives to X

Ideas that inherently do one or more of these will be favored in selection. An idea is inherently advantaged if acting out on that idea causes the % of people with that idea to increase. Heliocentrism does not inherently spread, whereas tribalism does–via killing off those who are not tribal. More examples:

  • Any belief that creates advantages in war
    • An emphasis on science & technology. Between two countries all-else-equal, the technology-loving country has an advantage.
    • Nationalism and strong national identities. This should work in similar ways to tribalism.
    • Policies like having a standing army or draft.
  • Racism in the old-fashioned way–straight-up “people of X color are subhuman/shouldn’t exist”. This is essentially the same example as tribalism.
  • Family centrism. This is more of a biological trait than a psychological one, but I’ll mention it here. Suppose 50% of people would sacrifice the lives of two strangers to save their child, and the other 50% would sacrifice their child to save the lives of two strangers. Assuming there is some genetic component to this belief, you’d expect the population to converge to 100% of the population being willing to sacrifice two strangers to save their own child, because that gene would be selected.
  • Growth-oriented ideas
    • “Have lots of children” is an obvious one. If 50% of the population believed everyone should have lots of children, and 50% believed no one should have children, what % of the population will have each belief in 100 years?
    • Mainstream economics. Given that you’re reading this, you are likely living in a wealthier-than-average country, and wealthy countries tend to have strong growth policies.
    • Countries which prioritize growth over sustainability gain a military advantage, in addition to directly increasing the % of population that supports growth.
    • “My country shouldn’t worry about climate change”–A country that worries a lot about climate change needs to sacrifice growth, thus putting it at a disadvantage compared to other countries, and after some time it could lose % population of the world, and also it might have economic troubles that cause ideas from rich countries which don’t care about climate change to seep in.
  • Anti-euthanasia. We take this for granted, but “You should live your life, even if you are suffering” is an evolutionarily advantaged belief. Let’s say there is a disease so permanently crippling and painful that 90% who get it really, really beg to be euthanized (and somehow succeed in convincing their doctors), while the other 10% still experience pain but really, really believe in suffering through the pain. Now if you conduct a poll on “Is this disease so bad you’d want to die? Let’s ask some patients and find out”, you’d find that a large percentage wants to carry on living, because the patients who wanted to die are no longer around to be polled.
  • (Abrahamic) Religion
    • The punishment for apostasy can range from social stigma to death, deterring people from believing competing ideas. There is also the threat of eternal suffering for nonbelief.
    • The first three commandments are about deterring people from thinking about competing ideas.
    • Religions tend to have some form of evangelism.
    • “Be fruitful and multiply” is growth-oriented.
  • Simple, easy-to-explain ideas. It is easy to spread simple ideas, difficult to spread complex ones.
  • Ideas that human brains are particularly good at remembering. E.g., a catchy slogan or song.

In general, I think we should be marginally more skeptical of all of these ideas. They are popular ideas, not necessarily because they are right, but because they have beneficial selection traits. The idea could still be right, just not because “a bunch of other people believe this idea, so it must have a high likelihood of being correct.”

Evolutionarily Disadvantaged Ideas

The converse is that we should be more accepting of evolutionarily disadvantaged ideas, or evolutionary dead-ends. A very basic list is just the opposite of the previous:

  • Ideas that don’t lead to strong militaries, e.g. not focusing so much on science and technology
  • Treating all humans equally. This sounds obvious and easy, but it is really not! Who would value a stranger’s child as equal to their own child?
  • Sustainability-oriented ideas, or even population/economic-shrinking ideas, as opposed to permanent growth.
    • Antinatalism. Already, more people especially in the west are choosing to be childfree.
    • Environmentalism. Note the most radical forms like the Voluntary Human Extinction Movement.
  • Euthanasia
    • More strongly, suicide. Suicide is the most extreme evolutionary dead-end. Yet a lot of people commit suicide every year. Maybe the idea that life sucks/isn’t worth living is more valid than people give it credit for, and a lot of people needlessly suffer their entire lives. It is hard to have a good two-sided discussion between two opposing sides because the people most agreeing with the idea of suicide are dead. Of course, raising the status of this is a social danger because it would cause more people to die of suicide.
  • Anti-religion. Note this mostly applies to the Abrahamic religions. Buddhism is kind of a weird one because it is somewhat antinatalist, so we would have expected it to be selected out of the population.
  • Complex, hard-to-understand, hard-to-remember ideas.

Final Thoughts

To correct for selection, we should marginally lower the acceptance of advantaged ideas and raise the acceptance of disadvantaged ideas. And when considering which ideas are the most popular, we need to make sure we’re not falling prey to selection effects.

A future post will contain a counterargument to all this–why we shouldn’t care about idea selection and just use whatever ideas are easy to propagate.