JSvg: Javascript Analysis and Vector Graphics: June 2020

In the last post we discussed the efficacy of mitigating strategies to prevent the spread of COVID-19. Based on the work of Chu et al. (2020), who reviewed a number of other studies, both mask usage and social distancing are relatively effective means of preventing the spread of COVID-19. However, neither is 100% effective, and as such, even when taking such precautions, it is possible to acquire COVID-19 with enough exposure.

In this post, we will attempt to estimate the “true” exposure rate based on recorded rates of COVID-19 in the population. Additionally, we will start looking at how repeated daily exposure to the same group of people can still spread the disease, given that the group is large enough.

There are a number of places one can go to get estimates of both population and rate of infection. Google will now pop up a nice graphic if you search for something like “COVID-19 PA case count”.

Below it are further links to state and county dashboards. The state dashboard currently (June 26) shows 81,374 cases for a population of 12.7 million. This is a case rate of

percentInfected = 81374/12.7e6 = 0.006407 = 0.641%

According to the State of Pennsylvania’s COVID-19 dashboard 78% of these cases are recovered. What does that mean?

“Individuals who have recovered is determined using a calculation, similar to what is being done by several other states. If a case has not been reported as a death, and it is more than 30 days past the date of their first positive test (or onset of symptoms) then an individual is considered recovered.”

This is an attempt to quantify the number of active cases (22% of 81,374 = 17,902) which can then be used as a better estimate of contagion. Instead of considering the percent infected, we consider the percent who are contagious:

percentContagious = 17902/12.7e6 = 0.00141 = 0.141%

That said, it may be better to work on a more local level since there are some parts of the state that have a much lower rate of contagion, and others that have a much higher rate. While there is (sadly) still a fair bit of travel in between, a first order approximation should work with the local data as it is likely to be more relevant. Let’s consider Chester County, PA since that’s where I am. Chester County is not that far outside of Philadelphia, which has been a COVID-19 hotspot.

Let’s calculate the infection rate for Chester County. To do so we need the number of cases and the total population. I’m not sure where the state got its population count for Chester County. According to Google it’s 524,989. If we use the dashboard’s case count of 3,578 and Google’s population count, we get an incidence rate of

percentInfected = 3578/524989 = 0.00681 => 0.681%

which translates to 681 cases per 100,000 population. This is really close to what the above graphic shows so we’ll go with the official count of 685 per 100,000. How many have recovered? Of these, 64.45% have recovered, leaving 35.55% potentially infectious. Thus, we get a contagion rate of

percentContagious = 0.3555*0.00681 = 0.00242 = 0.242%.

We could even get more specific and look at West Chester Borough statistics.

According to the Chester County dashboard, the population of the borough is 20,048, and there have been 126 cases overall with 34 in the last 30 days. This makes the contagious percentage:

percentContagious = 34/20048 = 0.00169 = 0.169%

This is slightly higher than the rate for Pennsylvania at large, but much smaller than Chester County as a whole.
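The three estimates above all follow the same arithmetic, so they can be reproduced with one small helper. This is only a sketch: the function name contagiousFraction and the areas array are made up for illustration, and the inputs are simply the figures quoted in this post.

```javascript
// Hypothetical helper: fraction of a population that is currently contagious.
// activeCases: cases not yet counted as recovered (or cases in the last 30 days)
function contagiousFraction(activeCases, population) {
	return activeCases / population;
}

// Figures quoted in this post (dashboards as of June 26, 2020)
const areas = [
	{ name: "Pennsylvania", active: 0.22 * 81374, population: 12.7e6 },
	{ name: "Chester County", active: 0.3555 * 3578, population: 524989 },
	{ name: "West Chester Borough", active: 34, population: 20048 }
];

for (const area of areas) {
	const frac = contagiousFraction(area.active, area.population);
	console.log(area.name + ": " + (100 * frac).toFixed(3) + "%");
}
```

The printed values may differ from the figures in the text in the last digit, since toFixed() rounds while some of the numbers above are truncated.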

The percent recovered stat is an attempt to quantify potential load on the health-care system, but also the infectiousness moving forward. The 30 day window is not perfect, but it’s a start. In the next post, we’ll look at better estimates of contagion.

Now we have enough information to start building our model at either the state, county, or local level. The model itself will have several assumptions, and its goal is to give us the sense of how day to day contacts within a single group of a certain size may contribute to the spread of COVID-19. We will run the model for each condition – unprotected, with a mask, with social distancing, and with mask and social distancing. The presumption is that ALL group members are following the condition specified. We’ll see later how to partition the group based on protective measures, but that introduces too much complexity for the moment. So the key assumptions are as follows:

* Your chance of encountering an infected person in the group at the start of it all is based on the population stats calculated above, for example 0.00169 if you are considering West Chester Borough
* Your chance of encountering an infected person will only go up from there if members of the group share the infection, and if they don’t, then it will still be the initial percentage.
* Given that you encounter an infected person, your chance of getting infected yourself is 0.022 (2.2%) from the Luo et al (2020) paper. Furthermore, this assumes that the group is made up of 18-44 year olds.
* Your risk is reduced by wearing a mask (odds-ratio = 0.15)
* Your risk is reduced by social distancing by at least a meter (odds-ratio = 0.18).
* A person who is infected today can infect someone tomorrow (not real, but we’ll deal with this in an upcoming post)
* Everyone in the group stays in the group (again, not real – since once someone starts getting sick, they will probably leave the group. We’ll deal with this later also).

The approach we will take to this is called Monte Carlo modeling, so named because of the plethora of casinos in Monte Carlo. The idea is to run the model some number of times and let the results of random number generators be compared with the stated probabilities in order to estimate whether or not people get infected. We then compute the average of the results and the standard deviation (an estimate of the uncertainty) to get a sense of how bad things get.
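To make the "run it many times, then average" idea concrete, here is a minimal sketch. The names monteCarlo and countInfectedOnce are hypothetical, and the toy trial (100 independent people, each with a 2.2% chance) is only an illustration, not the model developed in this post. Math.random() is JavaScript's built-in random number generator.

```javascript
// Run `trial` nRuns times and return the mean and (population) standard deviation
function monteCarlo(trial, nRuns) {
	const results = [];
	for (let i = 0; i < nRuns; i++) {
		results.push(trial());
	}
	const mean = results.reduce((sum, x) => sum + x, 0) / nRuns;
	const variance = results.reduce((sum, x) => sum + (x - mean) ** 2, 0) / nRuns;
	return { mean: mean, sd: Math.sqrt(variance) };
}

// Toy trial (an assumption for illustration): count how many of 100 people
// are infected when each one independently has a 2.2% chance
function countInfectedOnce() {
	let infected = 0;
	for (let i = 0; i < 100; i++) {
		if (Math.random() < 0.022) infected++;
	}
	return infected;
}

const summary = monteCarlo(countInfectedOnce, 1000);
console.log("mean:", summary.mean.toFixed(2), "sd:", summary.sd.toFixed(2));
```

Because the trial is random, the printed numbers change a little from run to run; the standard deviation gives a sense of how much.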

How does such a model deal with chance? Most computer languages have a random number generator function that will spit out a different random number every time it is run. In JavaScript, which we will use here, that function is Math.random(). The function will randomly spit out a number between 0 and 1 (0 included, 1 excluded) with about 16 digits of precision. If you have a probability of 0.022 of catching COVID-19 from an infected friend in your group, the random number generator would have to spit out a number less than 0.022 in order for you to be considered infected. In fact, we would run this test every day for each member of the group. In a group of 20 people, Math.random() would be run 19 times in a given day in order to estimate if the infection spread to any of the 20 people. Every day, and for every person and iteration of the model, we run the function below to get a sense of whether we have added to the number of infected people.

function checkInfections(prob1, prob2, prob3, prob4) {

	// See if individual is infected this round
	// prob1 - No precautions
	// prob2 - With Mask
	// prob3 - With Social Distancing
	// prob4 - With Mask+Social Distancing
	// Additional number infected (in the usual order)
	var N1 = 0;	
	var N2 = 0;
	var N3 = 0;
	var N4 = 0;
	var ran = Math.random();	// Generate the random number (declared locally to avoid an implicit global)
	// Then compare it to each probability in turn
	if (ran < prob1) {
		N1++;
	} 

	if (ran < prob2) {
		N2++;
	} 

	if (ran < prob3) {
		N3++;
	} 		
	if (ran < prob4) {
		N4++;
	} 			
	// Return the number of additional infected in each category
	return [N1, N2, N3, N4];
}
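As a rough illustration of how a check like the one above might be repeated for every person on a given day, the sketch below uses a hypothetical helper, countNewInfections. It is not the code used for the results in this post. The rng parameter is an assumption added here so that the random source can be swapped out for testing; it defaults to Math.random.

```javascript
// Hypothetical helper: one day of checks for a group.
// nPeople: how many people to test today
// probs: [noPrecautions, mask, distancing, maskPlusDistancing]
// rng: function returning a number in [0, 1); defaults to Math.random
function countNewInfections(nPeople, probs, rng) {
	rng = rng || Math.random;
	const added = probs.map(() => 0);
	for (let person = 0; person < nPeople; person++) {
		const ran = rng(); // one random draw per person, shared by all conditions
		for (let c = 0; c < probs.length; c++) {
			if (ran < probs[c]) added[c]++;
		}
	}
	return added;
}

// 19 checks in a group of 20, using 0.022 and the odds-ratios 0.15 and 0.18
const probs = [0.022, 0.022 * 0.15, 0.022 * 0.18, 0.022 * 0.15 * 0.18];
console.log(countNewInfections(19, probs));
```

Using the same random draw for all four conditions, as the function above does, means a person infected under a stricter condition is always infected under a looser one too, which keeps the four curves directly comparable.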

We need to have a way to calculate the initial probabilities of infection that go into this function. This will be slightly different than what was done previously because previously, we assumed that EVERYONE we came into contact with was infected and contagious. This is no longer true. So instead of calculating our probability of infection as:

N = 10; // Number of exposures

p1 = 0.022; // probability of getting COVID-19 from close contact

pN = 1 - p1; // probability of NOT getting it

pInfected = (1 - pN**N);

we now have to do the following:

N = 10; // Number of exposures

p = 0.022; // probability of getting COVID-19 from close contact

propInfected = 0.00169; // proportion infected people in your population

propNinf = 1 - propInfected; // proportion of people NOT infected in population

pExposure = (1 - propNinf**N);

pInfected = p*pExposure;

Wait! Why haven’t we calculated pInfected in the same way? Why not just use our prior result and then multiply it by the probability of encountering an infected person in the population? This is because our earlier calculation was the probability of being infected if encountering N known infected people. We are not in that situation; we are in a situation where we are encountering N people who may or may not be infected. Therefore, we first calculate a probability of exposure due to N encounters and then multiply it by the chance of becoming infected due to a single encounter. So if there are 10 people in our group, using the above math:

propNinf = 0.99831 (99.8% no)

pExposure = 0.016772 (1.68% chance of exposure)

pInfected = 0.0003689 (0.04% chance of infection with no precautions)

Let’s be clear about what this means. This 0.04% chance of being infected if you live life as normal seems pretty small, until you start actually mixing with the usual number of people. If you simply count close contacts (including due to air flow) from your shopping trip (N=30?), your training at the gym (N=20?), your working day (for me N = 100), and your trip to the pub afterwards (N=30), your net number of contacts is now perhaps as high as 150-200. Let’s rerun that calculation on infection probability for 150 contacts.

pExposure = (1 - 0.99831**150) = 0.2241 = (22.41%)

pInfected = 0.022*0.2241 = 0.00493 = (0.493%)
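The 10-contact and 150-contact calculations can be wrapped into a pair of small functions so that other contact counts are easy to try. The function names exposureProbability and infectionProbability are made up for this sketch; the formulas are the ones given above.

```javascript
// Chance that at least one of N contacts is contagious,
// given the contagious proportion of the population
function exposureProbability(propInfected, N) {
	return 1 - (1 - propInfected) ** N;
}

// Chance of becoming infected: exposure chance times the
// per-contact infection probability p (0.022 in this post)
function infectionProbability(p, propInfected, N) {
	return p * exposureProbability(propInfected, N);
}

for (const N of [10, 150]) {
	const pExp = exposureProbability(0.00169, N);
	const pInf = infectionProbability(0.022, 0.00169, N);
	console.log("N = " + N + ": pExposure = " + pExp.toFixed(4) + ", pInfected = " + pInf.toFixed(5));
}
```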

So your chance of exposure goes up to a little over 1 in 5 overall, but the chance of infection is still, in theory, about half a percent. Those are good odds, right? But what if every one of those 150 people has the same odds? What’s the chance that at least one of you will get COVID-19? Note that we can calculate the effect of wearing a mask or social distancing as before:

p2 = 0.15 // Odds-ratio for mask

p3 = 0.18; // Odds-ratio for social distancing

pWithMask = p2*pInfected; // Probability of being infected while wearing a mask

pWithSocialDist = p3*pInfected; // Probability of being infected while social distancing

pWithMaskPlusDist = p2*p3*pInfected; // Probability of being infected with both
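To get a feel for the "at least one of you" question, the sketch below applies the same complement rule used for exposure to a group of 150 people who each have the same individual risk. It assumes the 150 outcomes are independent, which is a simplification made for this example; the helper name atLeastOne is also made up here.

```javascript
const pInfected = 0.00493; // no-precaution value for 150 contacts, from above
const p2 = 0.15;           // odds-ratio for mask
const p3 = 0.18;           // odds-ratio for social distancing

// Chance that at least one person out of groupSize is infected,
// ASSUMING each person has the same independent chance pEach
function atLeastOne(pEach, groupSize) {
	return 1 - (1 - pEach) ** groupSize;
}

const conditions = {
	"no precautions": pInfected,
	"mask": p2 * pInfected,
	"social distancing": p3 * pInfected,
	"mask + distancing": p2 * p3 * pInfected
};

for (const name in conditions) {
	const pEach = conditions[name];
	console.log(name + ": individual = " + (100 * pEach).toFixed(3) +
		"%, at least one of 150 = " + (100 * atLeastOne(pEach, 150)).toFixed(1) + "%");
}
```

Under these assumptions the no-precautions case comes out to roughly a one-in-two chance that somebody in the group is infected, even though each individual's risk is only about half a percent.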

In any event, even if we isolate a group except for going home, carefully getting gas or picking up groceries, or hanging out with family members who may have commitments outside of the home, there is an off chance that someone else gets infected and then returns to the group the next day. Therefore, the exposure rate will go up. Here’s the way to think of it. Let’s call the initial proportion of the population that is infected and contagious propInfected, as before. For West Chester:

propInfected = 0.00169

at the start of it all. Let’s say we have 10 people in the group, and one of them was unlucky enough to get infected outside our group since we last met up. This means that we have the outside population-wide proportion of people who are infected and contagious (0.00169) plus our new source of infection from within our group, which now, if self-contained, has a contagious proportion of 1/10 = 0.10. The new net exposure to infected people is now 0.10 + 0.00169 = 0.10169 (yikes!). Therefore the exposure rate, which controls the other probabilities, will likely go up from day to day.
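The daily update described above is just an addition, but writing it as a function makes it easy to reuse. The name groupExposureProportion is hypothetical, and capping the result at 1 is an assumption added here so that the value can still be used as a proportion when most of the group is infected.

```javascript
// Outside contagious proportion plus the contagious proportion inside the group.
// The cap at 1 is an assumption of this sketch, not something stated in the post.
function groupExposureProportion(propInfected, infectedInGroup, groupSize) {
	return Math.min(1, propInfected + infectedInGroup / groupSize);
}

// One infected member in a group of 10, West Chester starting value
console.log(groupExposureProportion(0.00169, 1, 10).toFixed(5));
```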

So the bottom line in our model is threefold. First, we need to randomly test everyone every day to see if they have succumbed. Second, our risk is based on population statistics of infection to start with, but then also on changes within our group. Finally, we need to run the model a bunch of times (since it is random) to determine just how stable our results are over time.
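Putting these three points together, a stripped-down version of such a model might look like the sketch below. This is not the full code behind the results that follow. The function names, the use of (groupSize - 1) as the number of daily contacts, and the cap on the exposure proportion are all assumptions made for illustration, so the numbers it prints should not be expected to match the graphs.

```javascript
// Sketch of one run for one condition (all names here are hypothetical).
// riskFactor: 1 for no precautions, 0.15 mask, 0.18 distancing, 0.15*0.18 both
// rng: function returning a number in [0, 1), e.g. Math.random
function simulateGroup(groupSize, days, propInfected, p, riskFactor, rng) {
	let infected = 0;
	const history = [];
	for (let day = 0; day < days; day++) {
		// Exposure reflects the outside population plus anyone infected in the group
		const prop = Math.min(1, propInfected + infected / groupSize);
		// ASSUMPTION: each person meets every other group member once per day
		const pExposure = 1 - (1 - prop) ** (groupSize - 1);
		const pToday = riskFactor * p * pExposure;
		let added = 0;
		for (let person = 0; person < groupSize - infected; person++) {
			if (rng() < pToday) added++; // test each uninfected person
		}
		infected += added; // people infected today count from tomorrow on
		history.push(infected / groupSize);
	}
	return history; // fraction of the group infected at the end of each day
}

// Average the daily fractions over many runs and compute the standard deviation
function runModel(nRuns, groupSize, days, propInfected, p, riskFactor, rng) {
	const mean = new Array(days).fill(0);
	const meanSq = new Array(days).fill(0);
	for (let run = 0; run < nRuns; run++) {
		const history = simulateGroup(groupSize, days, propInfected, p, riskFactor, rng);
		for (let d = 0; d < days; d++) {
			mean[d] += history[d] / nRuns;
			meanSq[d] += history[d] ** 2 / nRuns;
		}
	}
	const sd = mean.map((m, d) => Math.sqrt(Math.max(0, meanSq[d] - m * m)));
	return { mean: mean, sd: sd };
}

const result = runModel(200, 50, 45, 0.00169, 0.022, 1, Math.random);
console.log("Day 45, no precautions:", result.mean[44].toFixed(3), "+/-", result.sd[44].toFixed(3));
```

Passing the random number generator in as an argument is a design choice for this sketch: it lets the same code be checked with a fixed, predictable source before being run with Math.random.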

A first run of the model with a group of 50 people and an initial proportion of infected population of 0.00169 (0.169%) leads to the following outcomes.

The solid lines in the graph show averages over the number of runs, while the dotted lines indicate the standard deviations. These results show that with the inputs as noted, after about 35 days, about half of the group is infected if no one takes precautions. Taking precautions clearly makes for a better result.

But what if your exposure to a group is larger in that there are more people? Perhaps during your day, you come into contact with 100 people rather than 50. These results are shown below:

In this case, the spread of results is lower, but the timeline is sped up – the 50% infection mark for no precautions is now at 25 days instead of 35 days. At the end of 45 days, 90.1% of those who don’t take precautions are infected, and about 1% (0.98%) of those who social distance + wear masks are also infected.

Finally, let’s consider the possibility that the CDC is correct in their recent announcement that the national infection rate is ten times higher than what is published in the statistics. If that is true, then instead of an initial proportion of infected/contagious of 0.00169 (0.169%), it could be more like 0.0169 (1.69%). With these inputs our model looks like this:

The timeline is sped up again, with the unprotected reaching a 50% infection rate by day 21-22. At the end of 45 days, the unprotected are 100% infected, and even those both wearing masks and social distancing are, on average, 4.7% infected.

As noted before, there are a number of assumptions here which are not quite true. Likewise, in real groups we don’t have 100% of people doing only one thing. Rather there are likely to be a proportion of people wearing masks, social distancing, or throwing caution to the wind. In the next post, we’ll look at how to deal with these factors in order to get an even better estimate of how COVID-19 is spread. In fact, we’ll be able to see just how much of a problem it can be if you’ve got just that one person who refuses to wear a mask or social distance.

Despite the shortcomings of this model, it does highlight how quickly contagion of this nature can spread. Therefore, these results should be alarming, especially since they are quite sensitive to initial conditions (active infection rate) which we can only guess at. Even with a small initial proportion, the rate of infected in our group increases rapidly, especially for the unprotected.

Full Code below the jump...


Sunday, June 28, 2020

Modeling the Spread of COVID-19: Part 4 - Real Exposure