Thursday, June 11, 2020

Public database of non-pharmaceutical interventions for COVID-19

This week we published the new paper The effect of large-scale anti-contagion policies on the COVID-19 pandemic, which tries to estimate the effect of over 1,700 policies on the day-over-day growth rate of infections. We compiled subnational data across six countries (China, South Korea, Italy, Iran, France, and the USA) and tried to break down the effects of different overlapping policies.

One result of the paper is that during the period of study, we estimate that there would have been roughly 500 million more COVID-19 infections by early April, across these six countries, in the absence of policy. We get to this number by linking up some reduced-form econometric tools from GDP-growth modeling with some SIR/SEIR models from epidemiology. If you don't feel like reading a long paper, you can just watch these GIFs of China and Italy (more are here). Left is what actually happened (area of red circles = cumulative confirmed cases), right is a "no policy" simulation:


Or if you are more serious about research but still don't want to read the paper, we have two videos explaining the paper here -- one is a presentation at the HELP Seminar, the other is a 3 min summary that involves a lot of figurative and literal arm-waving.

The hardest part of the study was actually putting together an original standardized dataset of all 1,717 NPIs. We have posted all the data here (alongside the code). We are hoping that people will use this new data set for other projects and welcome feedback and/or comments, especially if you think we missed something (much of this was collected by hand, so it is a dataset that will certainly improve over time). The dataset is detailed in the Supplementary Information of the paper. If you use the data, please just cite the article as the source.

We constructed the data set at the same spatial resolution as the most finely resolved case data we could obtain, which was the second administrative level for Italy and China (see figs above), and the first admin level for everyone else. The policies were actually deployed at a variety of admin levels, depending on the country and policy. Here's the table from the appendix summarizing the policies in the data set:


We also worked very hard to standardize policy definitions as much as possible across different countries, e.g. we tried to make "business closure" or "no gathering" mean something similar across different countries, but there is substantial nuance and some things could not be standardized, so please make sure to consult the documentation.

Note that there is also variation in the intensity of policies in the data set. This comes from the fact that many policies are encoded as composites (e.g. there are multiple policies that make up a single policy definition) and also because many policies are not deployed uniformly across an entire administrative unit, so we computed the population-weighted "exposure" to each policy (e.g. if only half the population of a US state adopted a policy, since many are set at the county level, this policy got coded as one-half). Here's a plot of policy intensity over time for the US sample (other plots are available here, but weren't in the paper because we ran out of our figure allotment):


Non-pharmaceutical interventions over time in the USA
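
For anyone who wants to see the population-weighted "exposure" idea in concrete terms, here is a minimal sketch in Python. It is not the code from our replication package; the function name `policy_exposure` and the county populations are made up for illustration, and it assumes each sub-unit either has the policy in effect or does not.

```python
def policy_exposure(units):
    """Population-weighted exposure to a policy within one administrative unit.

    units: list of (population, policy_in_effect) pairs, one per sub-unit
    (e.g. the counties that make up a US state).
    Returns the share of the unit's population covered by the policy.
    """
    total = sum(pop for pop, _ in units)
    if total == 0:
        raise ValueError("administrative unit has zero population")
    covered = sum(pop for pop, active in units if active)
    return covered / total


# Hypothetical state with three counties (populations are made up).
counties = [(500_000, True), (300_000, False), (200_000, False)]
print(policy_exposure(counties))  # 0.5
```

Here half of the state's population lives in the one county that adopted the policy, so the state-level policy variable is coded as one-half.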


Several folks (including a journal referee and someone at the DOD) asked if we planned to keep collecting this data and making it public. In principle, we would love to do that, but the entire team did the whole project pro bono and have since had to go back to our normal jobs. That said, if someone reading this wants to support/otherwise enable expansion of this data for the public good, you know where to find me, since I won't be going anywhere for a while...

Let us know if you use the data; we'd be excited to hear if it's useful.

Sunday, March 8, 2020

COVID-19 reduces economic activity, which reduces pollution, which saves lives.


COVID-19 is a massive global economic and health challenge, having caused >3500 global deaths as of this writing (Mar 8) and untold economic and social disruption.  This disruption is only likely to increase in coming days in regions where the epidemic is just beginning. Strangely, this disruption could also have unexpected health benefits -- and these benefits could be quite large in certain parts of the world.  Below I calculate that the reductions in air pollution in China caused by this economic disruption likely saved twenty times more lives in China than have currently been lost directly due to infection with the virus in that country.

None of my calculations support any idea that pandemics are good for health. The effects I calculate just represent health benefits from the air pollution changes wrought by the economic disruption, and do not account for the many other short- or long-term negative consequences of this disruption on health or other outcomes; these harms likely vastly exceed any health benefits from reduced air pollution.  [Edit: added this paragraph 3/16 to emphasize the lessons from my findings, in case readers don't make it further..]

Okay, on to the calculation.

A few weeks ago, NASA published striking satellite images of the massive reduction in air pollution (specifically, NO2) over China resulting from the economic slow-down in that country following its aggressive response to COVID-19.

Source: NASA
Separate analyses indeed found that ground-based concentrations of key pollutants -- namely PM2.5 -- fell substantially across much of the country.  These reductions were not uniform.  In northern cities such as Beijing, where much of wintertime pollution comes from winter heating, reductions were absent.  But in more southern cities such as Shanghai and Wuhan where wintertime pollution is mainly from cars and smaller industry, pollution declines appeared to be dramatic.

Given the huge amount of evidence that breathing dirty air contributes heavily to premature mortality, a natural -- if admittedly strange -- question is whether the lives saved from this reduction in pollution caused by economic disruption from COVID-19 exceed the death toll from the virus itself.  Even under very conservative assumptions, I think the answer is a clear "yes".

Here are the ingredients to doing this calculation [Edit: see additional details I added at end of this post, if really wanting to nerd out].  First we need to know how much the economic disruption caused by COVID-19 in China reduced air pollution in the country.  The estimates already discussed suggest about a 10ug/m3 reduction in PM across China in Jan-Feb of 2020 relative to the same months in the previous 2 years (I get this from eyeballing the figure mid-way down the blog post).  To confirm these results in independent data, I downloaded hourly PM2.5 data from US government sensors in four main cities in China (Beijing, Shanghai, Chengdu, and Guangzhou).  I calculated the daily average (and trailing 7-day mean) of PM2.5 since Jan 2016, and then again compared Jan-Feb 2020 to all other Jan-Feb periods in these cities.

I find results that are highly consistent with the above analysis.  Below is a plot of the data in each city. The blue line is the 7-day average in 2020, and the thick red line is the same average over the years 2016-19 (thin red lines are each individual year in that span).  Across these four cities, I find an average daily reduction of 15-17ug/m3 PM2.5 in Jan-Feb 2020 relative to the average in the previous four years.  This average difference is highly statistically significant.  As in the above analysis, Beijing appears to be the outlier, with no decline in PM relative to past years when analyzed on its own.

PM2.5 concentrations in four Chinese cities in Jan-Feb 2016-2019 (red lines) vs same period in 2020 (blue lines). All cities except Beijing show substantial overall reductions during the period. 
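
For readers who want to redo the smoothing, here is a rough sketch of how hourly sensor readings could be turned into daily averages and a trailing 7-day mean. It is an illustration rather than the script I ran: the function names are made up, the readings in the example are synthetic, and it assumes the daily series has no missing days.

```python
from collections import defaultdict
from datetime import datetime


def daily_means(hourly):
    """hourly: iterable of (datetime, pm25) readings for one city.
    Returns a list of (date, mean_pm25) sorted by date."""
    by_day = defaultdict(list)
    for ts, value in hourly:
        if value is not None:  # skip missing sensor readings
            by_day[ts.date()].append(value)
    return [(day, sum(v) / len(v)) for day, v in sorted(by_day.items())]


def trailing_mean(daily, window=7):
    """Mean of each day and the previous window-1 days.
    Assumes consecutive days; days without a full window are skipped."""
    out = []
    for i in range(window - 1, len(daily)):
        chunk = [v for _, v in daily[i - window + 1 : i + 1]]
        out.append((daily[i][0], sum(chunk) / window))
    return out


# Synthetic readings (made up): 8 days, constant within each day.
hourly = [(datetime(2020, 1, d, h), 10.0 * d) for d in range(1, 9) for h in range(24)]
smoothed = trailing_mean(daily_means(hourly))
print(smoothed)
```

With eight days of input, only the last two days have a full 7-day window behind them, so the smoothed series has two entries.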

To be very conservative, let’s then assume COVID-19 reduced PM2.5 on average by 10ug/m3 in China, that this effect only lasted 2 months, and that only some urban areas (and no rural areas) experienced this change (see below).

Next we need to know how these changes in air pollution translated into reductions in premature mortality.  Doing this requires knowing three numbers:
  1. The change in the age specific mortality rate per unit change in PM2.5.  For this we use estimates from He et al 2016, who studied the effects of air quality improvements around the Beijing Olympics on mortality rates -- a quasi-experimental setting on air quality not unlike that induced by COVID-19.  He et al found large reductions in mortality for children under five and for adults over 65, with monthly under-5 mortality increasing 2.9% for every 1ug/m3 increase in PM2.5, and about 1.4% for people over 70 years old (note:  they measure impacts using PM10, and to convert to PM2.5 we use the common assumption in the literature that PM2.5 = 0.7*PM10 and that all impacts of PM10 are through the PM2.5 component). Estimates from the He et al paper are broadly consistent with other quasi-experimental studies on the effects of air pollution on infant health, including estimates from the US, Turkey, and Africa (see Fig 5b here).  As a more conservative approach, we also calculate mortality using 1% increase in mortality per 1ug increase in PM2.5 for both kids and older people.  In all cases, we very conservatively assume that no one between the ages of 5 and 70 is affected.  
  2. The baseline age-specific mortality rates for kids and older people, which we take from Global Burden of Disease estimates.  I note a large apparent discrepancy between the GBD numbers, which give an under-5 mortality rate of 205 per 100,000 live births in China in the most recent year, and the WB/Unicef estimate, which puts the number at 860 per 100,000 live births (4x higher!).   I use the much smaller GBD estimate in my calculations to be conservative, but someone should reconcile these...  
  3. The total population affected by the changes in air pollution.  For this we take the total Chinese population as estimated by the UN, and conservatively assume that only 50% of the population are affected by the change in air quality.  Since roughly 60% of Chinese live in urban areas, this is like assuming there is no effect in rural areas and some urban areas remain unaffected. 

Putting these numbers together [see table below for details] yields some very large reductions in premature mortality.  Using the He et al 2016 estimates of the impact of changes in PM on mortality, I calculate that having 2 months of 10ug/m3 reductions in PM2.5 likely has saved the lives of 4,000 kids under 5 and 73,000 adults over 70 in China.  Using even more conservative estimates of 10% reduction in mortality per 10ug change, I estimate about 1,400 under-5 lives saved and about 51,400 over-70 lives saved.  Even under these more conservative assumptions, the lives saved due to the pollution reductions are roughly 20x the number of lives that have been directly lost to the virus (based on March 8 estimates of 3,100 Chinese COVID-19 deaths, taken from here). 

What's the lesson here?  It seems clearly incorrect and foolhardy to conclude that pandemics are good for health. Again I emphasize that the effects calculated above are just the health benefits of the air pollution changes, and do not account for the many other short- or long-term negative consequences of social and economic disruption on health or other outcomes; these harms could exceed any health benefits from reduced air pollution.  But the calculation is perhaps a useful reminder of the often-hidden health consequences of the status quo, i.e. the substantial costs that our current way of doing things exacts on our health and livelihoods.

Might COVID-19 help us see this more clearly?  In my narrow academic world -- in which most conferences and academic meetings have now been cancelled due to COVID-19, and where many folks are cancelling talk invitations and not taking flights -- it might be a nice opportunity to re-think our production function with regard to travel.  For most (all?) of us academics, flying on airplanes is by far the most polluting thing we do.  Perhaps COVID-19, if we survive it, will help us find less polluting ways to do our jobs.  More broadly, the fact that disruption of this magnitude could actually lead to some large (partial) benefits suggests that our normal way of doing things might need disrupting.

[Thanks to Sam Heft-Neal for helping me double check these calculations; errors are my own].

----------------------------------------------------------------------------------------------
Edited 3/9 to add additional details below on calculations and assumptions. 

Here are the actual values and math I used, either using He et al 2016 estimates of the PM/mortality relationship or the more conservative 1% change in mortality per 1 ug/m3 PM2.5 estimate.  To calculate the % change in mortality rate from the observed change in PM2.5, I multiply (a) by (c).  This number is then multiplied by the baseline monthly mortality rate in (d) to get the total change in the monthly mortality rate, which is then multiplied by the total affected population in each age group (e*f) and by 2, for the two months of reduced pollution, to get (g).

Column definitions:
(a) reduction in monthly PM2.5 (ug/m3)
(b) % change in monthly mortality rate per 10 ug/m3 PM10
(c) % change in monthly mortality rate per 1 ug/m3 PM2.5
(d) baseline monthly mortality rate per 100,000 for each age group
(e) population in each age group (100,000s)
(f) fraction of population affected
(g) two-month reduction in mortality

age group | (a) | (b) | (c) | (d)   | (e)   | (f) | (g)

Using He et al 2016 PM10 estimates in (b)
under 5   | -10 | 20  | 2.9 | 17.1  | 839.3 | 0.5 | -4097
over 70   | -10 | 10  | 1.4 | 523.8 | 981.1 | 0.5 | -73421
total     |     |     |     |       |       |     | -77518

Using 1%/ug PM2.5 estimates in (c)
under 5   | -10 |     | 1.0 | 17.1  | 839.3 | 0.5 | -1434
over 70   | -10 |     | 1.0 | 523.8 | 981.1 | 0.5 | -51395
total     |     |     |     |       |       |     | -52829


To get the estimates in (a), I observe daily data for four Chinese cities back to Jan 1 2016.  I define a COVID-19 treatment dummy equal to one if the year==2020, and then run a panel regression of daily PM2.5 on the treatment dummy plus city-by-day-of-year fixed effects.  The sample is all days in Jan-early March.  Basically this calculates the daily PM2.5 difference between 2020 and 2016-19, averaged across all sites and days.  I can run this on the daily data or on the 7-day running mean, and with city-by-d.o.y fixed effects or city + d.o.y fixed effects and I get answers that are all between 15-18 ug/m3 reductions in PM2.5.  We very conservatively round this number down to 10 in column (a). 
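
For the curious, here is a minimal sketch of the city-by-day-of-year fixed-effects regression described above, written in plain Python. It is not the code I actually ran: the function name is made up, and the panel is synthetic and noise-free, with a 15 ug/m3 drop in 2020 built in by construction. With a single treatment dummy, absorbing the fixed effects amounts to demeaning both the dummy and PM2.5 within each city/day-of-year cell and then taking the usual OLS slope.

```python
from collections import defaultdict


def treatment_effect(rows):
    """rows: iterable of (city, day_of_year, year, pm25).

    Returns the OLS coefficient on a dummy for year == 2020 after
    absorbing city-by-day-of-year fixed effects, computed by demeaning
    the dummy and PM2.5 within each (city, day_of_year) cell.
    """
    cells = defaultdict(list)
    for city, doy, year, pm25 in rows:
        cells[(city, doy)].append((1.0 if year == 2020 else 0.0, pm25))

    num = den = 0.0
    for obs in cells.values():
        mean_t = sum(t for t, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for t, y in obs:
            num += (t - mean_t) * (y - mean_y)
            den += (t - mean_t) ** 2
    if den == 0:
        raise ValueError("no within-cell variation in the treatment dummy")
    return num / den


# Synthetic, noise-free panel (all values made up): two cities, 60 days,
# 2016-2020, with PM2.5 set 15 ug/m3 lower in 2020 by construction.
baseline = {"city_A": 60.0, "city_B": 45.0}
panel = [
    (city, doy, year, base + 0.1 * doy - (15.0 if year == 2020 else 0.0))
    for city, base in baseline.items()
    for doy in range(1, 61)
    for year in range(2016, 2021)
]
print(round(treatment_effect(panel), 6))  # -15.0
```

On real data the estimate would of course come with noise and a standard error, which this sketch does not compute.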

What are some of the key assumptions in my overall analysis?  There are lots:
  • This is only the partial effect of air pollution; it is by no means the overall effect of COVID-19 on mortality.  Indeed, the broader disruption caused by COVID-19 could cause many additional deaths that are not directly attributable to being infected with the virus -- e.g. due to declines in households' economic wellbeing, or to the difficulty in accessing health services for non-COVID illnesses.  Again, I am absolutely not saying that pandemics are good for health.  
  • The key assumption in using the He et al 2016 estimates (or the 1%/1ug conservative summary estimate from quasi-experimental studies) is that changes in outdoor PM2.5 concentrations are a sufficient statistic for measuring health impacts.  One worry is that, instead of being at work, people are staying at home, and that home indoor air pollution is worse than what they would have been exposed to otherwise.  While it is true that indoor air pollution can be incredibly high in homes that rely on biomass burning for cooking and heating, existing evidence suggests that even in cold regions, urban Chinese residents probably have better air quality inside their home than outside it.  E.g., see here.  So the key question to me is whether other behaviors changed in response to COVID-19 that made individuals' exposures look a lot different than they did in the Beijing setting in He et al 2016.  Maybe there is a case to be made there, but I'm not sure what it is.  There's a prima facie case that the Beijing setting looks a lot like what we're worried about here:  an acute 2-month event that led to large but temporary changes in air pollution.  
  • Is mortality from COVID-19 interacting with mortality from PM?  One possibility is that there are enough COVID-19 infections to actually make people more susceptible to the negative impacts of air pollution.  But this would increase rather than decrease deaths from PM!  The other possibility is that the people who have very sadly passed away from COVID would have been those most likely to pass away from PM exposure.  But even if this were 100% true, it would only account for ~5% of the overall predicted mortality.  
  • We are pinning the entire PM2.5 difference between Jan-Feb 2020 and Jan-Feb 2016-19 on COVID-19.  (Note that we estimate a 17ug/m3 difference, and have rounded that down to 10 ug/m3, so in effect are only pinning about 60% of the change on COVID).  Without more careful analysis, we can't really know whether this is fair or not.  Maybe something else very big was going on at exactly the same time in China?  Seems unlikely but my analysis has nothing to say about that. 
  • My estimates are a prediction of mortality impacts, not a measurement.  They are not proof that anything has happened.  In a few years, there will likely be enough data to actually try to measure what the overall effect was of the COVID-19 epidemic on health outcomes.  You could compare changes in mortality in high-exposure locations to changes in mortality in lower-exposure locations, for instance.  But this study is not yet possible, as the epidemic is still underway and the comprehensive all-cause mortality data are not yet (to my knowledge) available.