Tuesday, June 23, 2020

Modeling the Spread of COVID-19: Part 2 - What are my chances?

Many universities and schools are planning to go back into session in person in the fall. In a later post, we'll look at why human nature says this is a Very Bad Idea, but for now, let's just get familiar with the statistics involved in understanding the spread of infection. In this post we're not going to make our modeling too complicated. For example, we aren't going to get into R values, incubation times, or the like. We're going to focus on pure probabilities and then see how they play out in our chances of getting infected.

OK, if you've read this far, you've probably got a passing interest either in COVID-19 or probability. Before we delve into the world of disease, let's first think about one of the common examples used to teach and illustrate probability, the roll of a 6-sided die. Since it has 6 sides, if it is a fair die, then your probability of rolling any number, say 1, is 1 chance out of 6, or 1/6 = 0.166666... To roll a one twice in a row is, we can agree, more difficult. Its probability should be lower. One can find the probability by making a table of possible rolls and counting up the possibilities that present themselves. Below, [1,1] stands for a roll of 1 followed by a second roll of 1.
  • [1,1], [1,2], [1,3], [1,4], [1,5], [1,6]
  • [2,1], [2,2], [2,3], [2,4], [2,5], [2,6]
  • [3,1], [3,2], [3,3], [3,4], [3,5], [3,6]
  • [4,1], [4,2], [4,3], [4,4], [4,5], [4,6]
  • [5,1], [5,2], [5,3], [5,4], [5,5], [5,6]
  • [6,1], [6,2], [6,3], [6,4], [6,5], [6,6]
As you can see, the roll [1,1] shows up only once among the 36 possibilities, so the probabilities per roll (1/6) just multiply: (1/6)*(1/6) = (1/36).
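
The counting in the table can also be done by brute force. The following is a minimal sketch in JavaScript; the names outcomes, doubleOnes, pByCounting and pByProduct are ours, chosen for illustration, and it assumes a fair die with independent rolls.

```javascript
// Enumerate all 36 ordered outcomes of two rolls of a fair 6-sided die.
const outcomes = [];
for (let first = 1; first <= 6; first++) {
  for (let second = 1; second <= 6; second++) {
    outcomes.push([first, second]);
  }
}

// Count the outcomes in which both rolls are a one.
const doubleOnes = outcomes.filter(([a, b]) => a === 1 && b === 1).length;

// Probability by counting, and by multiplying the per-roll probabilities.
const pByCounting = doubleOnes / outcomes.length;
const pByProduct = (1 / 6) * (1 / 6);

console.log(doubleOnes, "of", outcomes.length);
console.log(pByCounting.toFixed(4), pByProduct.toFixed(4));
```

Both routes should give the same number, which is the point: for independent rolls, counting outcomes and multiplying per-roll probabilities agree.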

So it is much more probable NOT to roll two ones in a row. How much more? The probability is 1 - (1/36) = (35/36). In the same way, the probability of NOT rolling a one the first time is 1 - (1/6) = (5/6). On the other hand, if we change the question and ask, "what is the probability of rolling a one at least one time?" the odds change. Per roll, the probability is still (1/6), but the ensemble probability is now greater because I have more tries. Let's look at that table again, but mark with an asterisk any combination of rolls that includes a one.
  • [1,1]*, [1,2]*, [1,3]*, [1,4]*, [1,5]*, [1,6]*
  • [2,1]*, [2,2], [2,3], [2,4], [2,5], [2,6]
  • [3,1]*, [3,2], [3,3], [3,4], [3,5], [3,6]
  • [4,1]*, [4,2], [4,3], [4,4], [4,5], [4,6]
  • [5,1]*, [5,2], [5,3], [5,4], [5,5], [5,6]
  • [6,1]*, [6,2], [6,3], [6,4], [6,5], [6,6]
In this case, I have a probability of (11/36) of rolling a one at least once in two rolls. To come up with a closed-form formula to represent this, we have to think a bit backwards. First, "what is the probability of not rolling a one in one roll?" From before, we found that this was 1 - (1/6) = (5/6). Next, "what is the probability of not rolling a one in two rolls?" Because the per-roll probabilities multiply, as we saw above, this is (5/6)*(5/6). Now we can ask, "what is the probability of NOT 'not rolling a one in two rolls'?" This must be 1 - (5/6)*(5/6) = 36/36 - 25/36 = 11/36. So we have a way to combine what are called independent probabilities. The formula for such a prediction reduces (in JavaScript) to:

var pOne = 1/6;                            // Probability of rolling a one
var pNotOne = 1 - pOne;                    // Probability of NOT rolling a one
var nRolls = 2;                            // Number of rolls
var pAtLeastOne = (1 - pNotOne**nRolls);   // Probability of at least one 1 in nRolls rolls

where the ** operator in JavaScript means "raise to a power".   What you see here in the last formula is an expression that can be reused later.
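
As a quick check on the formula, here is a hedged sketch that wraps it in a reusable function and compares it against a brute-force count over the table. The function name pAtLeastOnce and the counter withAOne are hypothetical names introduced here for illustration, and the sketch assumes the tries are independent.

```javascript
// Probability of at least one success in nTries independent tries,
// where each try succeeds with probability pSuccess.
function pAtLeastOnce(pSuccess, nTries) {
  return 1 - (1 - pSuccess) ** nTries;
}

// Brute-force check for two rolls: count outcomes containing at least one 1.
let withAOne = 0;
for (let first = 1; first <= 6; first++) {
  for (let second = 1; second <= 6; second++) {
    if (first === 1 || second === 1) withAOne++;
  }
}

console.log(withAOne + "/36 =", withAOne / 36);
console.log("formula:", pAtLeastOnce(1 / 6, 2));
```

Writing it as a function of the per-try probability and the number of tries is what lets the same expression be reused for something other than dice.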

How does this relate to the spread of disease? When doing a simple model, one might say that you have an X% chance of catching it. A percentage can be represented in terms of a probability. Fifty percent is 50/100 or 1/2 or 0.50. Two percent is 0.02. You get the idea. If you search the literature, you can get a sense of just how probable it is to catch COVID-19 from an encounter with someone who has it. Now let's be clear about this. Such a value is an average from the analysis of a great many encounters of various durations and conditions which involve everyone from the mildly contagious to superspreaders. So such probabilities are very rough at best and are good only for making general inferences.

One such paper, "Modes of contact and risk of transmission in COVID-19 among close contacts", is found on medrxiv.org, a repository for preprints of medically oriented scientific papers. This group of researchers examined 4950 close contacts in Guangzhou, China and estimated risk of infection based on age group, among other things. Close contact settings were broken into groups including "Cruises", "Public Transport", "Healthcare Settings", "Households", and "Multiple Modes". Of these, there were more cases due to multiple modes of contact (12 out of 92 individuals in the category) than in any other group. This is approximately a 13% rate of infection. By age group, the breakdown was as follows:
  • 0-17 years old: 14/783 = 1.8% infection rate
  • 18-44 years old: 51/2338 = 2.2% infection rate
  • 45-59 years old: 29/997 =  2.9% infection rate
  • 60+ years old: 35/824 = 4.2% infection rate
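
The percentages in this list are simply cases divided by close contacts. A small sketch of that arithmetic follows; the array name ageGroups and its field names are made up for illustration, and the counts are the ones quoted in the list above.

```javascript
// Cases and close contacts per age group, as quoted in the list above.
const ageGroups = [
  { label: "0-17", cases: 14, contacts: 783 },
  { label: "18-44", cases: 51, contacts: 2338 },
  { label: "45-59", cases: 29, contacts: 997 },
  { label: "60+", cases: 35, contacts: 824 },
];

// Infection rate as a probability (cases divided by contacts).
const rates = ageGroups.map((g) => ({
  label: g.label,
  pInfection: g.cases / g.contacts,
}));

// Print each rate as a percentage rounded to one decimal place.
for (const r of rates) {
  console.log(r.label + ": " + (100 * r.pInfection).toFixed(1) + "%");
}
```
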
Given these data, we can use the same technique to estimate risk of infection from coming into contact with someone who is infected. Let's say that you are between 18 and 44 years of age and have a 2.2% chance of being infected if you are in close contact with an individual with COVID-19. If you just meet one such person, it's pretty straightforward. But what if you meet 2 or more? While your chance of being infected each time will, on average, remain the same, your overall chance of being infected should increase. You could phrase it like this: "What are my chances of being infected at least once?" We can use our formula from above. Instead of writing it like this:

pAtLeastOne = (1 - pNotOne**nRolls)
 
we can write:

pInfectedOnce = (1 - pNoInfection**nContacts)

In this case, pNoInfection = 1 - pInfection = 1 - 0.022 = 0.978 (or a 97.8% chance of not being infected).   Let's run some numbers to see if this behaves the way we think it should - more contacts equals a greater chance of infection.
  • pInfectedOnce(1) = (1 - 0.978**1) = 0.022
  • pInfectedOnce(2) = (1 - 0.978**2) = 0.0435
  • pInfectedOnce(3) = (1 - 0.978**3) = 0.0646
  • pInfectedOnce(4) = (1 - 0.978**4) = 0.0851
  • pInfectedOnce(5) = (1 - 0.978**5) = 0.105 ...
Remember that a probability of 0.105 is a 10.5% chance of being infected. Plotted against the number of contacts, this function climbs quickly: by the time we have been in close contact with about 32 people, we have a better than 50% chance of contracting COVID-19. [The code to produce this graph is included below].
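
The number of contacts at which the chance crosses a given threshold does not have to be read off a graph; it can be solved for directly by taking logarithms. Below is a rough sketch under the same assumptions as the text (every contact is independent and carries the same 2.2% chance). The function names pInfectedOnce and contactsForRisk are ours, introduced for illustration.

```javascript
const pInfection = 0.022;             // per-contact chance, 18-44 age group
const pNoInfection = 1 - pInfection;  // chance of escaping a single contact

// Chance of being infected at least once in nContacts independent contacts.
function pInfectedOnce(nContacts) {
  return 1 - pNoInfection ** nContacts;
}

// Smallest whole number of contacts whose cumulative chance reaches target.
// Solving 1 - pNoInfection**n >= target for n gives
// n >= log(1 - target) / log(pNoInfection).
function contactsForRisk(target) {
  return Math.ceil(Math.log(1 - target) / Math.log(pNoInfection));
}

// Print the cumulative chance for a few contact counts.
for (const n of [1, 2, 3, 4, 5, 10, 20, 35, 50]) {
  console.log(n, pInfectedOnce(n).toFixed(4));
}
console.log("contacts for 50%:", contactsForRisk(0.5));
```

Changing pInfection to one of the other age-group rates shows how sensitive the threshold is to the per-contact chance.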

All this presumes that 1) the paper's estimation of contagion is correct, 2) we're all in relatively close contact, and 3) no one is taking any precautions. Under what real-world conditions are these likely to be true? My guess: bars and restaurants, frat parties, and the like. Where have we seen such uncontrolled spread of the virus? Where large numbers of people congregate. In Missouri, for example, there were raucous, uncontrolled parties over Memorial Day weekend. Look at the increase in case numbers around June 14, about 3 weeks after Memorial Day. I suspect that this trend will continue with time.

And in PA, there's this:

"PHILADELPHIA (CBS) — Twelve people in Bucks County who attended Memorial Day parties at the Jersey Shore have tested positive for the coronavirus. The Bucks County Health Department discovered this cluster of COVID-19 cases through contact tracing. One positive case led to the 11 others. (emphasis added)"

So it's clear that throwing caution to the wind is a real problem.

What about classes? Well, in theory, classes are supposed to start with social distancing and mask wearing, which could mitigate the potential for contagion. In theory, universities are going to try to apply sanctions to students (or faculty, one would hope) who do not take precautions. In practice, we'll see. Behavior in a class is one thing, but since most time is spent outside of class, it's not clear what steps universities are willing to take to police this without alienating their paying population. In the next installment, we'll look at how taking such precautions reduces the likelihood of transmission based on the evidence in the wild.

Code for use in: http://www.niiler.com/JSvg2/analysis.html to produce graph of probability of infection versus number of contacts.

var p1 = 0.022;    // probability of getting COVID-19 from one close contact
var p2 = 1 - p1;   // probability of NOT getting it from one close contact

var N = 50;        // Number of contacts

var Ncontacts = [];    // Array to store numbers of contacts (for the x-axis)
var pInf = [];         // Array to store the resultant probabilities (for the y-axis)

// Loop through calculating the probability with each number of contacts (i+1)
for (var i = 0; i < N; i++) {
  pInf[i] = (1 - p2**(i+1));
  Ncontacts[i] = i+1;
}

// Labels for the graph
// (optsSeries, optsGraphTitles and drawSeriesPlot are expected to be defined by the analysis page)
optsSeries[0].name = "Model";
optsGraphTitles[0].name = "Number of Contacts";
optsGraphTitles[1].name = "Chance of being infected";
optsGraphTitles[2].name = "";
optsGraphTitles[3].name = "";

// Graph it
drawSeriesPlot([Ncontacts], [pInf], 0,0,'');

