Thursday, June 11, 2020

Public database of non-pharmaceutical interventions for COVID-19

This week we published the new paper "The effect of large-scale anti-contagion policies on the COVID-19 pandemic", which tries to estimate the effect of over 1,700 policies on the day-over-day growth rate of infections. We compiled subnational data across six countries (China, South Korea, Italy, Iran, France, and the USA) and tried to break down the effects of different overlapping policies.

One result of the paper is that, during the period of study, we estimate that there would have been roughly 500 million more COVID-19 infections by early April, across these six countries, in the absence of policy. We get to this number by linking up some reduced-form econometric tools from GDP-growth modeling with some SIR/SEIR models from epidemiology. If you don't feel like reading a long paper, you can just watch these GIFs of China and Italy (more are here). Left is what actually happened (area of red circles = cumulative confirmed cases), right is a "no policy" simulation:


Or if you are more serious about research but still don't want to read the paper, we have two videos explaining the paper here -- one is a presentation at the HELP Seminar, the other is a 3 min summary that involves a lot of figurative and literal arm-waving.

The hardest part of the study was actually putting together an original standardized data set of all 1,717 non-pharmaceutical interventions (NPIs). We have posted all the data here (alongside the code). We are hoping that people will use this new data set for other projects, and we welcome feedback and comments, especially if you think we missed something (much of this was collected by hand, so it is a data set that will certainly improve over time). The data set is detailed in the Supplementary Information of the paper. If you use the data, please just cite the article as the source.

We constructed the data set to be at the same spatial resolution as the most finely resolved case data we could obtain, which was the second administrative level for Italy and China (see figs above), and the first admin level for everyone else. The policies were actually deployed at a variety of admin levels, depending on the country and policy. Here's the table from the appendix summarizing the policies in the data set:

We also worked very hard to standardize policy definitions as much as possible across different countries, e.g. we tried to make "business closure" or "no gathering" mean something similar across different countries, but there is substantial nuance and some things could not be standardized, so please make sure to consult the documentation.

Note that there is also variation in the intensity of policies in the data set. This comes from the fact that many policies are encoded as composites (i.e. there are multiple policies that make up a single policy definition) and also because many policies are not deployed uniformly across an entire administrative unit, so we computed the population-weighted "exposure" to each policy (e.g. if only half the population of a US state adopted a policy, since many are set at the county level, this policy got coded as one-half). Here's a plot of policy intensity over time for the US sample (other plots are available here, but weren't in the paper because we ran out of our figure allotment):

Non-pharmaceutical interventions over time in the USA
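
To make the population-weighted "exposure" idea concrete, here is a minimal sketch in Python. It is not the code from the paper's repository: the county names, populations, and the function name `population_weighted_exposure` are all made up for illustration, and it assumes a policy is simply on or off in each county.

```python
# Hypothetical county-level records for one US state:
# (county name, population, whether the policy is active there).
# All names and numbers are invented for illustration only.
counties = [
    ("County A", 600_000, True),
    ("County B", 300_000, False),
    ("County C", 100_000, True),
]


def population_weighted_exposure(records):
    """Return the share of the total population living where the policy is active."""
    total_pop = sum(pop for _, pop, _ in records)
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    exposed_pop = sum(pop for _, pop, active in records if active)
    return exposed_pop / total_pop


# 700,000 of 1,000,000 residents live in counties with the policy.
exposure = population_weighted_exposure(counties)
print(exposure)  # 0.7
```

In this toy state the policy would be coded as 0.7 rather than 0 or 1, which is the kind of fractional intensity that shows up in the plot above.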

Several folks (including a journal referee and someone at the DOD) asked if we planned to keep collecting this data and making it public. In principle, we would love to do that, but the entire team did the whole project pro bono and has since had to go back to our normal jobs. That said, if someone reading this wants to support or otherwise enable expansion of this data for the public good, you know where to find me, since I won't be going anywhere for a while...

Let us know if you use the data. We'd be excited to hear if it's useful.
