[Image: COVID Birthday Paradox risk model]

The holidays are upon us, which means a time of celebrations with friends and family. Well, it would mean that in any other year. Unfortunately there is still a pandemic spreading across the country, despite how some of your friends and family are acting on social media. You need to stay home this year, and I brought the math to show why using the COVID Birthday Paradox.

The problem is that we are all notoriously bad at calculating the actual risk we take when participating in our lives. All of us are. We grossly underestimate the external risk, overestimate our ability to personally mitigate it, and otherwise assume “it won’t happen to me”. The COVID Birthday Paradox calculation helps show the real risk of being in the room with a COVID positive person.

All in this together (well… some of us)

A person’s response to the COVID-19 pandemic seems to fall into one of three categories. First, the very safe people who are staying home, having goods delivered, and otherwise limiting contact with anyone outside their bubble. Second, the people who *think* they are being safe while participating in risky activities. Finally, the group that doesn’t seem to care that a pandemic is still very much happening.

I sincerely thank those in the first category for doing their part in slowing the spread of the virus. We honestly can’t help the last group who is pretending none of this is real and going about their life. So I wrote this code and explanation for the second group who believes they are being safe.

I don’t blame this group… entirely

It isn’t entirely their fault. They are constantly hearing seemingly low positive testing rates and assuming that is their total risk when venturing out in the world. They are wrong, but I can see how they got to that conclusion. The news reports a current COVID positive rate of 8%, so you believe you have a 92% chance of staying safe, and that seems like more than enough. I get it, we all take risks every day.

The problem is that we don’t interact with only one other person while dining out or shopping. Each new interaction or person in the room compounds the probability that at least one person has COVID (whether they know it or not).

To make it easier to understand, we can use the famous Birthday Paradox problem in statistics. Global pandemics are new and hard to wrap our minds around. Trying to guess the odds that two people in a room share a birthday is something we all have in common. Do you know how many people you need in a room to have a 90% chance two people share a birthday? The answer might surprise you.

The Birthday Paradox

The birthday problem in statistics is a fascinating thought experiment. You might’ve heard of it but never dove into the math behind it. I know I didn’t until taking a graduate level data science course on statistics. The premise is simple even if the answer isn’t.

The question is how many people need to be in a room for the probability that two of them share a birthday to be over 90%. The math for that calculation is surprisingly complex because there are so many combinations of people in the room. There are a few issues when considering this problem in our heads.

Issue 1: we are self-centered

Think of a room with six people and consider the odds of a shared birthday. Are you only thinking of the five comparisons between yourself and the other people in the room? People are selfish by nature and don’t initially consider all of the other possible matches in the room.

[Image: birthday paradox, the self-centered view]

The real pairing map for a room of six people looks like the one below. There are 15 possible pairings that could have the same birthday even in this tiny room. That is three times as many as the self-centered count of five that we all do in our heads.

[Image: birthday paradox, the real pairing map]
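
If you would rather count the pairings than draw them, here is a minimal Python sketch. The function name `count_pairs` is just a label made up for this illustration; it leans on `itertools.combinations` from the standard library to list every distinct pair.

```python
from itertools import combinations

def count_pairs(room_size):
    """Count every distinct pair of people in a room of the given size."""
    people = range(room_size)
    return sum(1 for _ in combinations(people, 2))

room_size = 6
self_centered = room_size - 1        # only comparing yourself to everyone else
all_pairs = count_pairs(room_size)   # every possible pairing in the room

print(f"Self-centered comparisons: {self_centered}")
print(f"Actual pairings: {all_pairs}")
```

Change `room_size` to try bigger rooms and watch the gap between the two counts widen.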

The disparity between the self-centered count and the real count only magnifies in larger groups of people. In a room with 23 people there are 253 total pairings versus the 22 comparisons to our own birthday. The real count is more than eleven times higher than we realize!

Issue 2: exponents aren’t intuitive

We all took math in school, some of us more classes than others. Even with all of the schooling in the world, some math concepts are just not intuitive to us. Imagining exponential growth does not come easy. We know it happens, but it is always a little shocking how quickly the numbers get out of control.

Ok. I believe you, it is hard to calculate

The straightforward calculation of a shared birthday in the room is a mess. You’d have to calculate the odds that every pair of people in the room might share birthdays. Then you’d have to determine how each pair’s odds relate to the overall odds in a room with that many people. I’m lazy so let’s do it another way.

For the outcome to be true, we only need one of the pairs to have a shared birthday. Since that is the case we could simply calculate the odds that *nobody* in the room shares a birthday and subtract that from 100%. Essentially, if there is a 10% chance that nobody in the room shares a birthday with anyone else then there is a 90% chance a match exists.
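
Here is a rough sketch of that "nobody matches" shortcut in Python. It assumes 365 equally likely birthdays and ignores leap years, and the function name `prob_shared_birthday` is made up for this example.

```python
def prob_shared_birthday(room_size, days=365):
    """Probability that at least two people in the room share a birthday."""
    prob_no_match = 1.0
    for k in range(room_size):
        # person k has to avoid the k birthdays already taken in the room
        prob_no_match *= (days - k) / days
    # a match exists whenever "nobody matches" fails
    return 1 - prob_no_match

for room_size in [5, 10, 23]:
    chance = prob_shared_birthday(room_size)
    print(f"{room_size} people: {chance:.1%} chance of a shared birthday")
```

Working with the "no match" probability keeps the loop to one multiplication per person instead of juggling every pair in the room.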

Spoilers: Fewer people than you think

[Image: birthday paradox probability curve]

Did you already cheat and look up how many people need to be in a room for there to be a 90% chance two people share a birthday? In case you didn’t, the answer is 41 people. Just 41 people. That is why it is referred to as a paradox even though the math is very sound. The result is counterintuitive because you’d rightfully assume you’d need far more people in that room.

Your brain wants to believe the odds of any match are 1/365, which is a very small percent chance. When we encounter a match in the wild it seems very coincidental because we forget just how many other pairs in the room did not match. We tend to remember the one match.
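
If you want to check the room size yourself rather than take it on faith, a small search does the job. This is only a sketch under the same assumptions as the classic problem (365 equally likely birthdays, no leap years), and `smallest_room_for` is a name invented for this example.

```python
def smallest_room_for(target, days=365):
    """Smallest room size where the chance of a shared birthday reaches target."""
    prob_no_match = 1.0
    room_size = 0
    while 1 - prob_no_match < target:
        # one more person walks in and must miss every birthday already present
        prob_no_match *= (days - room_size) / days
        room_size += 1
    return room_size

for target in [0.50, 0.90]:
    people = smallest_room_for(target)
    print(f"{target:.0%} chance of a match needs {people} people")
```

Try other targets to see how few extra people it takes to push the odds even higher.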

How does COVID fit in?

Let’s get back to the COVID Birthday Paradox and why I went through the explanation above. The key step in the birthday problem was to first determine the probability that nobody in the room has a birthday match.

The probability that someone in a room has COVID is basically the same problem. First we calculate the probability that nobody in the room has COVID using the publicly available current positive testing rate. Once we have that, the risk that at least one person in the room has the virus is 100% minus the probability that nobody in the room has COVID.

Simple COVID Birthday Paradox calculation: You are in a room with one other person and the current positive testing rate is 8% where you live. Since there is only one person the calculation that nobody has COVID is easy at 92%. The overall risk in that scenario would be 100% – 92% = 8% probability that someone in that room has COVID.
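
The same arithmetic can be sketched in a few lines of Python. Note the big assumption baked in: every person is treated as an independent coin flip at the positive testing rate, which real households and friend groups are not. The function name `prob_someone_positive` is made up for illustration.

```python
def prob_someone_positive(positive_rate, other_people):
    """Chance that at least one of the other people in the room is positive."""
    # assumption: each person is independently positive at the given rate
    prob_nobody_positive = (1 - positive_rate) ** other_people
    return 1 - prob_nobody_positive

rate = 0.08  # the example positive testing rate from the text
for other_people in [1, 2, 5, 10]:
    risk = prob_someone_positive(rate, other_people)
    print(f"{other_people} other people: {risk:.1%} chance at least one has COVID")
```

With one other person the result is just the 8% rate itself, and every extra person in the room pushes it higher.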

Python code to model scenarios

I could calculate this out by hand with my trusty TI-83 but there is no fun in that. I taught myself to code in python a while back and I plan on shoehorning that into as much of my work as possible. As Abraham Maslow said “I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.” haha

Code to calculate COVID Birthday Paradox

curr_FL_pos_rate = 0.0789  # Florida positive test rate as of 12/22/2020
print('Source: Current Positive Rate in Florida')
print('{x:.2%} --> Current known Positive Test Rate'.format(x=curr_FL_pos_rate))

for people_count in [20, 50, 100, 200]:
    print('\n--------------------')
    print(f'Probability that at least 1 person in a\n room of {people_count} people has COVID-19')
    # i scales the rate down: 10 means the true rate is 1/10 of the reported one,
    # and 1 means the reported rate is used as-is
    for i in [10, 4, 2, 1]:
        rate = curr_FL_pos_rate / i
        # 1 minus the probability that nobody in the room is positive
        covid_probability = 1 - (1 - rate) ** people_count
        label = f" (1/{i})" if i > 1 else ""
        print(f"    {covid_probability:.2%} --> If rate is really {rate:.2%}{label}")

Code Output

Source: Current Positive Rate in Florida
7.89% --> Current known Positive Test Rate

--------------------
Probability that at least 1 person in a
 room of 20 people has COVID-19
    14.65% --> If rate is really 0.79% (1/10)
    32.86% --> If rate is really 1.97% (1/4)
    55.29% --> If rate is really 3.94% (1/2)
    80.67% --> If rate is really 7.89%

--------------------
Probability that at least 1 person in a
 room of 50 people has COVID-19
    32.70% --> If rate is really 0.79% (1/10)
    63.07% --> If rate is really 1.97% (1/4)
    86.63% --> If rate is really 3.94% (1/2)
    98.36% --> If rate is really 7.89%

--------------------
Probability that at least 1 person in a
 room of 100 people has COVID-19
    54.71% --> If rate is really 0.79% (1/10)
    86.36% --> If rate is really 1.97% (1/4)
    98.21% --> If rate is really 3.94% (1/2)
    99.97% --> If rate is really 7.89%

--------------------
Probability that at least 1 person in a
 room of 200 people has COVID-19
    79.49% --> If rate is really 0.79% (1/10)
    98.14% --> If rate is really 1.97% (1/4)
    99.97% --> If rate is really 3.94% (1/2)
    100.00% --> If rate is really 7.89%

What does that mean? This code uses the current Florida positive test result percent as of 12/22/2020 per the FL Department of Health dashboard. The rate was chosen because it represents the percent of the population that might have COVID right now (which is what is important for current risk).

I didn’t want to use the total infected in the population because many of them might have recovered or otherwise no longer be infectious. We only want the probability that someone has COVID right now in our local area to assess risk.

[Image: COVID Birthday Paradox curve at the current positive rate]

Wait. Isn’t that sample biased?

I thought you might say that. The code also calculates the probability in case the current positive test rate overrepresents the total population. It stands to reason that those getting tested either came in contact with the virus or are higher risk to start, so using that figure might be biased sampling.

To account for that concern, I included 3 additional results: 1/2, 1/4, and 1/10. These run the same calculations but assume that for every person tested there are X healthy people in the population that had no need to be tested. For example the (1/4) line presumes that for every 1 person that did get tested there are 3 healthy people who did not get tested. In that example an 8% positive test rate would be modified from 8 out of 100 to 8 out of 400 and use 2%.
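
That adjustment is simple enough to sketch on its own. The helper name `adjusted_rate` is made up here, and it bakes in the same simplification as the text: everyone who skipped testing is assumed to be healthy.

```python
def adjusted_rate(positive_test_rate, untested_healthy_per_tested):
    """Spread the positive tests over tested plus untested (assumed healthy) people."""
    return positive_test_rate / (1 + untested_healthy_per_tested)

reported = 0.08  # the rounded 8% example from the text
for untested in [1, 3, 9]:
    rate = adjusted_rate(reported, untested)
    print(f"{untested} untested healthy per tested person: {rate:.2%} (1/{untested + 1})")
```

The three values of `untested` line up with the 1/2, 1/4, and 1/10 scenarios.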

[Image: COVID Birthday Paradox curve at 1/4 of the current positive rate]

COVID Birthday Paradox Scenarios

You’ll notice that the code also cranks out the risk probability of being in the room with a COVID positive person for several room sizes as well. For those of you out there still going about your life either apathetic or unaware of the pandemic dangers, I included a few common scenarios.

This only shows the probability that another person in the room/building is COVID positive, not your percent chance of catching the virus. It doesn’t take into account that a restaurant where diners are not wearing masks is far riskier than a room with the same number of masked occupants. The scenarios also don’t adjust for situations that are inherently riskier for an airborne pathogen, such as places with poor air circulation or events where singing is involved.

I didn’t just randomly pick these numbers. They represent real world scenarios you might find yourself in this holiday season. These are all indoor situations (where the risk is higher than outdoors) a person could be considering this or next week.

Breakdown by Venue Size

Here is a quick breakdown of the probability at least one person in the building has COVID using the 1/4 current rate (to be conservative). I included some examples too for illustration.

  • 20 person room
    • 33% Probability that at least one person has COVID
    • Examples
      • Family gathering
      • Small party with friends
  • 50 person room
    • 63% Probability that at least one person has COVID
    • Examples
      • Dining out in a restaurant
      • Drinking at a bar
      • Seeing a movie in the theater
  • 100 person room
    • 86% Probability that at least one person has COVID
    • Examples
      • Grocery Store (shopping inside)
      • In person Church service
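
The breakdown above can be reproduced with a short sketch like the one below. The venue labels are just the examples from the list, the helper name `risk_for_room` is invented for this illustration, and each person is assumed to be independently positive at a quarter of the reported Florida rate.

```python
reported_rate = 0.0789            # Florida positive test rate used in the post
assumed_rate = reported_rate / 4  # the conservative 1/4 scenario

# room size -> example venues, taken from the list above
venues = {
    20: "family gathering or small party",
    50: "restaurant, bar, or movie theater",
    100: "grocery store or church service",
}

def risk_for_room(room_size, rate):
    """Probability that at least one person in the room is positive."""
    return 1 - (1 - rate) ** room_size

for room_size, examples in venues.items():
    risk = risk_for_room(room_size, assumed_rate)
    print(f"{room_size:>3} people ({examples}): {risk:.0%}")
```

Swap in your own local rate or room sizes to see how your plans stack up.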

TL;DR

Stay home. It’ll be ok if we have to have one holiday season over Zoom since there will hopefully be many more to come. I know a current positive test rate under 10% seems like a low risk, but you are bad at math. The real risk is significantly higher once you get in a room with more people. In a 50 person room (like dining out at a restaurant) there is a greater than 60% probability someone in the room has COVID, even if the true rate is only a quarter of the reported one.

Get takeout instead and eat at home. For your sake and everyone else’s.

Happy Holidays
