As COVID-19 began to take off in the U.S. early this year, federal, state, and county governments hurried to contain it. Restrictions on movement, business closures, and similar measures have become a staple of government policy, including stay-at-home directives across much of the U.S. These measures have been met with varying levels of enthusiasm, but they are rooted in an understanding of how epidemic diseases spread. Now, with several months of data, we can begin to test whether these policies have been effective in their primary function and to what extent a controlled “reopening” raises a community’s COVID-19 risk profile.
Directly measuring policy effects is difficult, but Elder Research recently had the opportunity to take this on. During a two-week effort supporting the U.K. government in partnership with the Emergent Alliance, our team set out to accomplish two goals: first, to compile a data set at the U.S. county level suitable for COVID-19 analytics, and second, to examine at the county level to what extent U.S. policies have slowed COVID-19’s spread and to what extent “reopening” reintroduces the risk of disease spread.
This project was delivered in June just prior to a new spike in case reports, and this post is limited to data reported between January 29 and June 8, 2020. Our analysis led to two conclusions:
- Stay-at-home orders were effective at slowing COVID-19.
- Phased re-openings provided a much weaker braking effect on the disease.
The source code and data to replicate this post are available from our GitLab repository, including the code needed to train the model. Original source code for our work with the Emergent Alliance is also available on GitHub.
At the beginning of the project, we considered what data sources were available and at what resolution: state, county, or federal level. How often were these data sets updated? By connecting government action with case counts and other potentially important factors, we aimed to quantify the impact of policy on disease progression. State and local governments adopted varied COVID-19 policies at different times, which makes data resolution important; we assembled a collection of government actions across 3,142 counties, 2,985 of which have reported at least one confirmed COVID-19 case and so are included in our study. By modeling at the county level, rather than by state or country, we can take advantage of the geographic, demographic, and political diversity across the United States. Figure 1 shows a wide range of reported COVID-19 prevalences across the U.S.
Figure 1: Cumulative COVID-19 cases per county through June 8, 2020.
We also observe that the states and counties have enacted restrictions at different times (Figure 2) and with varying severities (Figure 3). Figures 2 and 3 also clearly demonstrate important state-level effects, which we incorporate into our model.
Figure 2: Date of the first enacted policy (e.g., “Stay Home”) per county as of June 8, 2020. Unshaded counties had no recorded policy.
Figure 3: The strictest policy enacted per county as of June 8, 2020. Unshaded counties had no data.
Data Sources and Model Inputs
Government policy alone does not provide enough information to build a useful model; there might be dozens of factors influencing COVID-19’s spread. To incorporate some of the most important factors, our team assembled a diverse collection of data sets, including COVID-19 reporting statistics, policy timings, county demographics, weather readings, and COVID-19 testing statistics. A full data listing, including URLs, is available here; our primary sources include:
- Centers for Disease Control and Prevention COVID-19 case data
- National Weather Service, Automated Surface Observing System (ASOS)
- U.S. Census Bureau American Community Survey (ACS)
- COVID-19 tracking data
- U.S. Land Area Data
- Public Health England U.S. summary statistics
From these sources we chose the week-over-week growth in COVID-19 cases per county as our target variable:
y(t) = N(t) − N(t − 7 days), where N(t) is the cumulative number of confirmed cases reported in a county through day t.
This quantity, y(t), describes the disease’s growth rate instead of the total number of cases. It can take on any value y(t) ≥ 0, where y(t) = 0 implies the disease has been completely stalled, as no new cases were recorded in the prior week. We are interested in whether government policy drives y(t) → 0. Modeling the weekly difference in confirmed cases also offers the advantage of automatically removing any day-of-week seasonality in the case reporting data. This is important given the strong weekly seasonality in case reports we have written about previously.
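A minimal sketch of computing a week-over-week difference like this one, assuming the input is a daily series of cumulative confirmed cases for a single county. The function name, the sample data, and the choice to clip negative differences (which can arise from downward data revisions) to zero are assumptions of this illustration, not documented steps of the original pipeline.

```python
from typing import List, Optional


def weekly_growth(cumulative: List[int], lag: int = 7) -> List[Optional[int]]:
    """Return the difference between each day's cumulative count and the count `lag` days earlier.

    The first `lag` entries are None because no earlier observation exists.
    Negative differences are clipped to 0 (an assumption of this sketch).
    """
    growth: List[Optional[int]] = []
    for t, today in enumerate(cumulative):
        if t < lag:
            growth.append(None)
        else:
            growth.append(max(today - cumulative[t - lag], 0))
    return growth


# Hypothetical cumulative confirmed cases for one county over ten days.
cases = [0, 1, 1, 3, 6, 10, 15, 21, 28, 28]
y = weekly_growth(cases)
```

Because every value compares the same weekday one week apart, a reporting dip that recurs on, say, Sundays affects both terms and largely cancels.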
Given our collected data and target variable, we constructed an Analytics Base Table (ABT) that includes the target, a collection of covariates, and governmental policy indicators. Our policy indicators are binned into five groups: “no policy,” “stay home” orders, plus three “reopening” phases:
- Phase 1: Lower-Risk Workplaces – Retail may reopen with capacity restrictions, child care facilities, manufacturing and logistics, offices, limited hospitality and personal services, restaurants (take-away only), and public places (parks, beaches, etc.)
- Phase 2: Medium-Risk Workplaces – Movie theaters, bars, dine-in restaurants, gyms, religious services, and more personal and hospitality services.
- Phase 3: High-Risk Areas – Essentially move back to pre-quarantine society, although concerts, conventions, and sports arenas may have capacity limits.
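The policy bins above, together with a counter of how long each policy has been active, could be encoded along these lines. This is a sketch under stated assumptions: the column names, the `days_in_policy` counter, and the rule that the counter resets to zero whenever the policy changes are all hypothetical choices for illustration.

```python
from typing import Dict, List

# The five policy groups; the exact label strings are assumed for this sketch.
POLICY_LEVELS = ["None Issued", "Stay Home", "Phase 1", "Phase 2", "Phase 3"]


def encode_policy(timeline: List[str]) -> List[Dict[str, int]]:
    """Turn a daily policy timeline into indicator columns plus a days-active counter."""
    rows: List[Dict[str, int]] = []
    days_active = 0
    previous = None
    for status in timeline:
        if status not in POLICY_LEVELS:
            raise ValueError(f"unknown policy level: {status}")
        # Count consecutive days under the same policy; reset on any change.
        days_active = days_active + 1 if status == previous else 0
        row = {level: int(level == status) for level in POLICY_LEVELS}
        row["days_in_policy"] = days_active
        rows.append(row)
        previous = status
    return rows


# Hypothetical six-day timeline for one county.
timeline = ["None Issued", "None Issued", "Stay Home", "Stay Home", "Stay Home", "Phase 1"]
rows = encode_policy(timeline)
```

Keeping the indicator and the day counter as separate columns is what allows a model to include both a policy effect and a policy-by-time interaction.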
Our final ABT includes the following elements:
As is our standard practice, we passed a random sample of 1,000 counties (71,637 observations, nearly one-third of our data) through the Boruta feature-selection algorithm to verify the candidate features’ utility in a predictive model. As shown in Figure 4, all candidate features are acceptable from Boruta’s permutation-test perspective. Each predictor is at least potentially useful (confirmed) to the model, with the possible exception of the relative humidity (tentative).
Figure 4: Outcome of the Boruta feature selection algorithm. Candidate features are labeled as Confirmed or Tentative based upon their performance. Model performance with randomized baseline features is captured by shadowMin, shadowMean, and shadowMax.
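The shadow-feature idea behind Boruta can be sketched in a few lines. This is a simplified illustration, not the Boruta algorithm itself: Boruta scores features with random-forest importance and applies a statistical test to the hit counts, whereas this sketch swaps in absolute Pearson correlation as the importance score and simply counts hits. The function names and the synthetic data are hypothetical.

```python
import random
from typing import Dict, List


def abs_corr(x: List[float], y: List[float]) -> float:
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def shadow_hits(features: Dict[str, List[float]], target: List[float],
                n_iter: int = 20, seed: int = 0) -> Dict[str, int]:
    """Count how often each feature outscores the best shuffled (shadow) feature."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_iter):
        shadow_scores = []
        for values in features.values():
            shuffled = values[:]
            rng.shuffle(shuffled)  # shuffling breaks any real link to the target
            shadow_scores.append(abs_corr(shuffled, target))
        threshold = max(shadow_scores)  # analogous to shadowMax
        for name, values in features.items():
            if abs_corr(values, target) > threshold:
                hits[name] += 1
    return hits


# Synthetic data: one informative feature and one pure-noise feature.
data_rng = random.Random(1)
signal = [float(i) for i in range(200)]
noise = [data_rng.random() for _ in range(200)]
target = [2.0 * s + data_rng.gauss(0, 5) for s in signal]
hits = shadow_hits({"signal": signal, "noise": noise}, target)
```

A feature that beats the shadows in nearly every iteration plays the role of a "confirmed" feature; one that wins only some of the time resembles a "tentative" one.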
A final check confirms that, while our data set contains several correlated features, none of these correlations are strong enough to warrant excluding an input from the model.
Regarding Mobility Data
We specifically excluded popular mobility data sets from this model because these measures are likely to be strongly impacted by policy; directly including them in our model would risk masking these policy effects. (See, e.g., the excellent book Statistical Rethinking, which includes a relevant discussion of post-treatment bias.) Investigating other approaches was not possible at this time, but including this data in a principled way could boost a future version of the model.
Modeling and Validation
Given our compiled data set, we chose to fit a multilevel count regression model to predict the week-over-week growth in confirmed COVID-19 cases. The model is implemented using the R package glmmTMB, and the counts are modeled using a negative-binomial distribution, which allows for overdispersion. We also chose a multilevel model to better account for state- and county-level influences on the outcome, considering each locality in the context of the whole.
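To see what "allows for overdispersion" means in practice, compare the variance implied by a Poisson model with that of a negative binomial at the same mean. The post does not say which negative-binomial variance form was used, so the quadratic form below (variance = mean + mean² / dispersion) is an assumption, and the numbers are purely illustrative.

```python
def poisson_variance(mean: float) -> float:
    """A Poisson model forces the variance to equal the mean."""
    return mean


def negbin_variance(mean: float, dispersion: float) -> float:
    """Quadratic negative-binomial variance (assumed form): mean + mean^2 / dispersion."""
    return mean + mean ** 2 / dispersion


# Hypothetical county-week with an expected 50 new cases.
expected_cases = 50.0
print(poisson_variance(expected_cases))      # variance tied to the mean
print(negbin_variance(expected_cases, 2.0))  # much larger variance is allowed
```

As the dispersion parameter grows, the extra term shrinks and the negative binomial approaches the Poisson, so the Poisson model is effectively a limiting special case.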
Our fitted model parameters are displayed below in Figure 5; following Gelman (2007), we have standardized our inputs in such a way that both discrete and continuous predictors are presented on similar scales. Notice that, as a consequence of our modeling architecture, coefficients are reported as multiplicative factors instead of additive ones. In this framework, an Incidence Rate Ratio (IRR) of 1 implies no effect, since multiplying by 1 does not change the result. In contrast, a value IRR > 1 implies an increasing rate of COVID-19 cases, and IRR < 1 implies a decreasing case rate.
Figure 5: Fitted population-level coefficients from our COVID-19 model. Incidence Rate Ratios (IRRs) > 1 imply a multiplicative effect (COVID-19 grows faster), and IRRs < 1 imply the opposite.
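The multiplicative reading of an IRR follows from the log link used in count regression: a coefficient b on the log scale corresponds to a factor exp(b) on the expected count. The sketch below illustrates that relationship, together with one common way to put inputs on similar scales (centering and dividing by two standard deviations). Treating that rescaling as the standardization used here is an assumption, and all names and numbers are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Dict, List


def standardize_two_sd(values: List[float]) -> List[float]:
    """Center a continuous input and divide by two standard deviations (assumed scheme)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / (2 * s) for v in values]


def irr(log_coefficient: float) -> float:
    """Convert a log-scale coefficient to an incidence rate ratio."""
    return math.exp(log_coefficient)


def expected_count(baseline: float, coefficients: Dict[str, float],
                   inputs: Dict[str, float]) -> float:
    """Expected count under a log link: baseline times the product of exp(b * x)."""
    linear = sum(coefficients[name] * inputs[name] for name in coefficients)
    return baseline * math.exp(linear)


# Hypothetical: a policy indicator whose IRR is 0.5 halves a baseline of 100 cases.
halved = expected_count(100.0, {"phase_1": math.log(0.5)}, {"phase_1": 1.0})
scaled = standardize_two_sd([1.0, 2.0, 3.0, 4.0, 5.0])
```

After this rescaling a continuous input has a standard deviation of 0.5, which is roughly comparable to a balanced binary indicator and makes the coefficients easier to compare side by side.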
To help validate our model for inference, we fit the same model under five-fold cross validation, grouped by county so that each county only appears in a single fold. Encouragingly, the model coefficients are quite stable across the folds as shown in Figure 6.
Figure 6: Fitted population-level coefficients under five-fold cross validation. Point estimates and error bars from each fold are overlaid atop one another and in many cases are indistinguishable.
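Grouping the folds by county keeps all of a county's daily observations together, so a model is never evaluated on a county it has partly seen during fitting. A minimal sketch of one way to build such folds follows; the greedy balancing rule, the function name, and the toy row labels are assumptions for illustration rather than the procedure actually used.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def grouped_folds(groups: Sequence[str], n_folds: int = 5) -> List[List[int]]:
    """Assign row indices to folds so that each group lands in exactly one fold."""
    rows_by_group: Dict[str, List[int]] = defaultdict(list)
    for idx, name in enumerate(groups):
        rows_by_group[name].append(idx)

    folds: List[List[int]] = [[] for _ in range(n_folds)]
    # Place the largest groups first, each into the currently smallest fold.
    ordered = sorted(rows_by_group, key=lambda name: (-len(rows_by_group[name]), name))
    for name in ordered:
        smallest = min(range(n_folds), key=lambda i: len(folds[i]))
        folds[smallest].extend(rows_by_group[name])
    return folds


# Hypothetical rows: one county label per daily observation.
counties = ["Wake"] * 3 + ["Arlington"] * 2 + ["Albemarle"] * 2 + ["Anne Arundel"]
folds = grouped_folds(counties, n_folds=2)
```

Each fold then serves once as the held-out set while the model is refit on the remaining folds, and the resulting coefficient estimates can be compared across refits.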
Our model coefficients capture average behaviors across the U.S., but the model can also make inferences at the county level each day. Figure 7 shows COVID-19 case trajectories and model fits for four counties in the eastern U.S. This figure illustrates several different policy timelines: Arlington County had just moved into a Phase 1 posture in June, while Wake County was well into Phase 2. None of these counties had yet made it to Phase 3.
Figure 7: Week-over-week growth in COVID-19 cases [y(t)] over time. The model fit is overlaid in white, and the ground-truth observations are displayed as black lines. Policy status is shaded into the background.
At the risk of visually overfitting, we see two distinct disease trajectories in Figure 7: Arlington and Anne Arundel counties demonstrate an initial growth in COVID-19 case rates that continues during the stay-at-home order before beginning to level off. Albemarle and Wake counties show a much slower initial spread of the disease followed by a large acceleration during Phase 1 of reopening.
Confident that we have achieved a reasonably stable model fit, and that our model describes the data, we ask how to interpret the model coefficients. Referring again to Figure 5, coefficients related to policy (“Stay Home” through “None Issued : Days”) are shown to be important in the model because they have IRR values different from 1. These coefficients generally lean in the direction of disease suppression. On the other hand, several COVID-testing and demographics inputs, including total population and minority representation, are associated with increasing case counts. The positive association with testing isn’t surprising: without tests we won’t capture many confirmed cases. Similarly, a county’s total population sets the scale for the target. The result that minority representation is linked to increasing rates of COVID-19 is consistent with other sources, but it still raises questions given the other demographic and socioeconomic factors “controlled for” in the model.
Focusing even more tightly on policy measures and starting from the left of Figure 5, we see that the IRRs for the Phase 1 through 3 variables are all less than 1. On average, these policies are correlated with a slower increase in COVID-19 reports, approximately cutting the growth rate in half. Taking this interpretation too far can be misleading, though, as shown by the “Stay Home” coefficient, which correlates with an increase in the number of COVID-19 cases week over week.
How can we explain stay-at-home orders being related to higher numbers of new cases? One answer is that this coefficient is linked to the rapid growth of case counts early on, the reason stay-at-home orders were enacted in the first place. Because the coefficient only captures the average effect of the policy, and because one might expect a policy to take some weeks to show results, it carries some weight from the prior period.
Studying how policy variables interact with time (“Stay Home : Days” through “None Issued : Days”) provides further evidence for this line of thinking and demonstrates how policy can change the trajectory of the disease. Interactions are notoriously tricky to interpret, but these terms roughly quantify how the COVID-19 case rate changes over time while a given policy is active. That is, does COVID-19 accelerate the longer a policy is active or does it decelerate? We find a reasonably strong effect: stay-at-home orders correlate with decreasing rates of COVID-19 over time, while, on the other extreme, taking no action (“None Issued”) is correlated with increasing COVID-19 rates.
Figure 8 demonstrates this idea by plotting the effects of these coefficients with all other variables held at their mean values.
Figure 8: Interaction plot showing how the target, y(t), changes with the number of days a given policy has been in effect. Note that the various panels are plotted on different scales for clarity, and that some values are extrapolated: not all policies are guaranteed to have been in place for 125 days.
Phases 1–3 appear to keep the COVID-19 growth rates roughly constant (that is, the number of new cases per week is roughly stable or slightly decreasing), while more substantial effects are seen in the other two policy levels. Stay-at-home measures are correlated, on average, with decreasing the number of new COVID cases week over week, while taking no action is correlated with the observed increase in COVID-19 cases.
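A small worked example shows how a policy's main effect and its interaction with time combine. With multiplicative coefficients, the overall factor after d days under a policy is the policy IRR times the per-day IRR raised to the power d, so a policy can start above 1 and still fall below 1 over time. The IRR values below are hypothetical and expressed per day for readability; they are not the fitted coefficients, which are reported on a standardized scale.

```python
import math


def policy_multiplier(irr_policy: float, irr_per_day: float, days: int) -> float:
    """Combined multiplicative effect after `days` under a policy."""
    return irr_policy * irr_per_day ** days


# Hypothetical values: the policy starts out associated with more cases
# (IRR 1.5) but each additional day shrinks the rate by 2% (IRR 0.98).
irr_policy, irr_per_day = 1.5, 0.98

# First whole day on which the combined multiplier drops below 1.
crossover_day = math.ceil(math.log(irr_policy) / -math.log(irr_per_day))

for day in (0, 10, 20, crossover_day, 60):
    print(day, round(policy_multiplier(irr_policy, irr_per_day, day), 3))
```

With these illustrative numbers the multiplier falls below 1 about three weeks in, which mirrors the pattern described above: a main effect above 1 paired with a time interaction below 1.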
Our model incorporates a number of demographic, weather, and testing variables. Accounting for their influence, this analysis suggests that stay-at-home orders have been effective at slowing the spread of COVID-19, with the number of new COVID-19 cases decreasing week over week on average. This analysis also indicates that a phased approach to reopening offers a weaker slowing of the disease.
This study was concluded in mid-June, just as many counties were moving into later stages of reopening and case counts began to rise again. Only 3% of our data represents Phase 3 policy actions, and refitting our model with even one additional week’s data suggests that Phase 3 might be associated with positive COVID case growth. Continued monitoring will prove essential as conditions, policy, and adherence to policy change over time.
NOTE: This project included major contributions from Elder Research staff, including Ramon Perez, Carlos Blancarte, Andrew Stewart, Mike Thurber, and Wayne Folta, and from Publicis Sapient AI Labs staff Carl Norman and Mark Smith.