Monday, October 19, 2020

The GDP-temperature relationship - some thoughts on Newell, Prest, and Sexton 2018

Quantifying the potential effects of climate on aggregate economic outcomes like GDP is a key part of understanding the broader economic impacts of a changing climate.  Lots of studies now use modern panel econometrics to study how past fluctuations in temperature have affected GDP around the world, including the landmark 2012 paper by Dell, Jones, and Olken (henceforth DJO), our 2015 paper (henceforth BHM), and a number of more recent papers that either analyze new data on this topic (e.g. subnational GDP or income data, see here or here or here) or revisit earlier results. 

Newell, Prest, and Sexton 2018 (draft here, henceforth NPS) is one recent paper that revisits our earlier work in BHM. Because we've gotten a lot of questions about this paper, Sol and I wanted to share our views of what we think we can learn from NPS.  In short, our view is that their approach is not suitable to the question they are trying to ask, and that their conclusions as stated in their abstract (at least in their current draft) are directly contradicted by their results.  This is important because their conclusions appear to shed important light on the aggregate economic impacts of unmitigated climate change. Unfortunately we do not believe that this is the case. 

NPS seek to take a data-driven approach to resolving a number of empirical issues that come up in our earlier work.  These include: (1) Does temperature affect the level or growth rate of GDP? (2) What is the "right" set of fixed effects or time trends to include as controls in analyzing the effect of temperature on GDP? (3) And what is the "right" functional form to describe the relationship between temperature and GDP changes?  These choices -- particularly (1) and (3) -- are certainly influential in the findings of earlier studies.  If temperature affects growth rates rather than levels, this can imply huge effects of future temperature increases on economic output, as small impacts on growth aggregate over time.  If temperature has a globally nonlinear effect on output, as we argue in BHM, this suggests that both wealthy and poor countries can be affected by changes in temperature -- and not only poor countries, as suggested in earlier analyses.  Resolving these questions is key to understanding the impacts of a warming climate, and it's great that papers like NPS are taking them on.

However, we have some serious qualms with the approach that NPS take to answer these questions.  NPS wish to use cross-validation to select which set of the above choices performs "best" in describing historical data.  Cross-validation is a technique in which available data are split between disjoint training and testing datasets, and candidate models are trained on the training data and then evaluated on the held-out test data using a statistic of interest (typically RMSE or r-squared in these sorts of applications).  The model with the lowest RMSE or highest r-squared on test data is then chosen as the preferred model. 
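For readers less familiar with the mechanics, here is a minimal sketch of the procedure on simulated data (a toy linear model of our own construction, not NPS's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(size=200)  # toy data-generating process

# Split into disjoint train/test sets (80/20)
idx = rng.permutation(len(x))
train, test = idx[:160], idx[160:]

# Fit a candidate model on the training data only
slope, intercept = np.polyfit(x[train], y[train], deg=1)

# Evaluate on the held-out test data
pred = slope * x[test] + intercept
rmse = np.sqrt(np.mean((y[test] - pred) ** 2))
print(f"test RMSE: {rmse:.2f}")
```

In a real application one would repeat this over many candidate models (and typically over many folds) and pick the one with the lowest test RMSE.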

Inference, not prediction.  This technique works great when your goal is prediction.  But what if your goal is causal inference -- i.e., in our case, isolating variation in temperature from other correlated factors that might also affect GDP?  It's not clear at all that models that perform best on a prediction task will also yield the right causal results.  For instance, prices for hotel rooms tend to be high when occupancy rates are high, but only a foolish hotel owner would raise prices to increase occupancy (h/t Susan Athey, from whom I stole this example).  A good predictive model can get the causal story wrong.

This is clearly relevant in the panel studies of temperature on GDP.  In existing work, great care is taken to choose controls that account for a broad range of time-invariant and time-varying factors that might be correlated with both temperature and GDP.  These typically take the form of unit or time fixed effects and/or time trends. Again the goal in including these is not to better predict GDP but to isolate variation in temperature that is uncorrelated with other factors that could affect GDP, in order to identify the causal effect of temperature on GDP.  The chosen set of controls constitute the paper's "identification strategy", and in this fixed effect setup unfortunately there is no clear data-driven approach -- including cross-validation -- for selecting these controls.  The test for readers in these papers is not:  do the authors predict GDP really well?  It is instead: do the authors plausibly isolate the role of temperature from other confounding factors?

Growth vs level effects.  The main question NPS are asking is whether the causal effect of a temperature shock has "level effects", where the economy is hurt in one year but catches up in the next year, or "growth effects", where the economy is permanently smaller.  This would require an identification strategy, which is what most of the literature has focused on following the method artfully outlined by DJO:  isolate the role of temperature from other confounding factors using choices about fixed effects that most plausibly achieve this unconfounding, and then distinguish growth from level effects by looking at the sum of contemporaneous and lagged effects.  If the sum is zero, this is evidence of level effects, and if it's not zero, evidence of growth effects.   The current manuscript does not have an identification strategy for measuring these lagged effects, and instead is using goodness of fit metrics to draw causal inferences about the magnitude of these effects.  The tool they are using is again not commensurate with the question they are asking. 
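To illustrate the logic of that test, here is a small simulation of our own (a toy setup, not DJO's or NPS's code) in which temperature has a pure level effect: the contemporaneous and lagged coefficients are individually large, but their sum is roughly zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
temp = rng.normal(size=n + 1)

# A pure *level* effect: a hot year lowers output that year and output
# recovers the next, so growth falls on impact and rebounds one year later.
beta = -0.5
growth = beta * temp[1:] - beta * temp[:-1] + 0.1 * rng.normal(size=n)

# Regress growth on contemporaneous and lagged temperature (OLS via lstsq)
X = np.column_stack([np.ones(n), temp[1:], temp[:-1]])
coefs, *_ = np.linalg.lstsq(X, growth, rcond=None)
b_contemp, b_lag = coefs[1], coefs[2]
print(b_contemp, b_lag, b_contemp + b_lag)  # sum near zero => level effect
```

Under a growth effect there would be no offsetting lag term, and the summed coefficients would stay away from zero. Note the regression here is only informative because the simulated temperature is exogenous by construction; in real data that is exactly what the identification strategy has to deliver.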

Errors in interpretation. These conceptual issues aside, the authors' conclusions in the abstract and introduction of their paper about the results from their cross validation exercise do not appear consistent with their actual findings as reported in the tables.  The authors conclude that "the best performing models have non-linear temperature effects on GDP levels", but the authors demonstrate no statistically distinguishable differences in results between levels and growth models in their main tables (Tables 2 and 3, and A1-A4), nor between linear and non-linear models.  This directly contradicts the statement in their abstract.

To be precise, the authors state in the abstract "The best-performing models have non-linear temperature effects on GDP levels."  But then on page 27 they clearly state: "The MCS ["model confidence sets", or the set of best performing models whose performance is statistically indistinguishable from one another], however, does not discern among temperature functional forms or growth and level effects." This is in reference to Table 2, reproduced below; models in the MCS are denoted with asterisks, and a range of both growth and levels models have asterisks, meaning their performance cannot be statistically distinguished from one another. 

So, again, the paper's abstract is not consistent with its own stated results.  They do find that the model with quadratic time trends (as used by BHM) is outperformed by models without country time trends -- but again, the purpose of those time trends is to remove potential confounding variation, not to perfectly predict GDP. See here for a simple look at the data on whether individual countries' growth rates have differential time trends that might make you worried about time-trending unobservables at a country level [spoiler: yes they do].

Errors in implementation.  Even if cross validation were the right tool here, the authors make some non-standard choices in how the CV is implemented, which we again believe make the results very hard to interpret.  Instead of first splitting the data between disjoint train and test sets, they first transform the data by regressing out the fixed effects, and then split the residualized data into train and test.  But the remaining variation in the residualized data will be very different depending on which set of fixed effects has been regressed out, and this will directly affect the resulting estimates of the RMSE.  It is thus no surprise that models with region-year FE have lower RMSE than models with year FE (in models with no time trends) -- region-year FEs almost mechanically take out more of the variation.  But this means you can't meaningfully compare RMSEs across different fixed effects in the way that they are doing -- you are literally looking at two different datasets with two different outcomes. You can in principle only compare functional forms within a given choice of FE.

Imagine the following: you have outcome Y, predictor X, and covariates W and Z.  Z is a confound, correlated with both X and Y.  W is correlated with X but not with Y.

In version 1 you partial Z out of both X and Y, and generate residualized values Y_1 and X_1.

In version 2 you partial W out of both X and Y, and generate residualized values Y_2 and X_2. 

This is in effect what NPS do, and then they want us to compare Y_1 = f(X_1) versus Y_2 = f(X_2).  But this clearly doesn't make sense, because Y_2 and X_2 still have the confounding variation of Z in them, and Y_1 and Y_2 are no longer measuring the same thing. So comparing the predictive power of f(X_1) vs f(X_2) is not meaningful.  It is also not how cross validation is supposed to work - instead, we should be comparing predictive performance on the same Y variables. 
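A small simulation makes this concrete (variable names follow the Y/X/W/Z setup above; the data and coefficients are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
z = rng.normal(size=n)            # confound: affects both X and Y
w = rng.normal(size=n)            # correlated with X but not with Y
x = z + w + rng.normal(size=n)
y = 1.0 * x + 2.0 * z + rng.normal(size=n)  # true causal effect of X is 1.0

def residualize(v, c):
    """Residuals of v after regressing out covariate c (with intercept)."""
    C = np.column_stack([np.ones(len(c)), c])
    beta, *_ = np.linalg.lstsq(C, v, rcond=None)
    return v - C @ beta

# Version 1: partial the confound Z out of both X and Y
y1, x1 = residualize(y, z), residualize(x, z)
# Version 2: partial W (not a confound) out of both X and Y
y2, x2 = residualize(y, w), residualize(x, w)

# The two "outcomes" are different variables with different variances,
# so test-set RMSEs of f(x1) and f(x2) are not comparable...
print(np.var(y1), np.var(y2))
# ...and version 2 still carries the confounding variation of Z:
slope1 = np.cov(y1, x1)[0, 1] / np.var(x1)  # close to the true effect, 1.0
slope2 = np.cov(y2, x2)[0, 1] / np.var(x2)  # biased upward by leftover Z
print(slope1, slope2)
```

The residualized outcome in version 2 has several times the variance of version 1, and its estimated slope is badly biased -- which is exactly why comparing predictive fit across the two residualizations tells you nothing about which set of controls is right.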

Further hazards in cross validation. While we don't agree that CV can be used to select the fixed effects in this setting, we do agree with NPS that cross validation could in principle be used to identify the "right" functional form that relates temperature to GDP (conditional on chosen controls).  E.g., is the relationship linear? Quadratic? Something fancier?  This works because the causal part has already been taken care of by the fixed effects (or so one hopes), and so what's left is to best describe the functional form of how x relates to y.  Cross validation for this purpose has been successfully implemented in settings in which temperature explains a lot of the variation in the outcome -- e.g. in studies of agricultural productivity (see Schlenker and Roberts 2009 for an early example in this literature).

But unfortunately when it comes to GDP, while temperature has a strong and statistically significant relationship to GDP in past work, it does not explain a lot of the overall interannual variation in GDP;  GDP growth is noisy and poorly predicted even by the hotshot macro guys.  In this low r-squared environment, selecting functional form by cross validation can be difficult and perhaps hazardous.  It's too easy to overfit to noise in the training set.

To see this, consider the following simulation in which we specify a known cubic relationship between y and x and then try to use cross validation to recover the "right" order of polynomial, studying our ability to do so as we crank up the noise in y.  We do this a bunch of times, each time training the regression model on 80% of the data and testing on 20%. We calculate RMSE on the test data and compute the average % reduction in RMSE relative to a model with only an intercept.  As shown in the figure below, we can mostly reject the linear model but have a lot of trouble picking out the cubic model from anything else non-linear, particularly in settings where the overall explained variation in y is small.  Given that temperature explains <5% of the variation in GDP growth rates (analogous to the far right grouping of bars in each plot), cross validation is going to really struggle to pick the "right" functional form.
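A stripped-down sketch of this kind of exercise (not the exact simulation behind our figure; the sample sizes and noise levels here are illustrative) looks like:

```python
import numpy as np

rng = np.random.default_rng(3)

def cv_best_degree(noise_sd, n=300, n_sims=200, degrees=(1, 2, 3, 4, 5)):
    """Share of simulations in which each polynomial degree wins 80/20 CV."""
    wins = {d: 0 for d in degrees}
    for _ in range(n_sims):
        x = rng.uniform(-2, 2, n)
        y = x**3 - 2 * x + noise_sd * rng.normal(size=n)  # known cubic truth
        idx = rng.permutation(n)
        tr, te = idx[:int(0.8 * n)], idx[int(0.8 * n):]
        rmses = {}
        for d in degrees:
            coefs = np.polyfit(x[tr], y[tr], d)
            pred = np.polyval(coefs, x[te])
            rmses[d] = np.sqrt(np.mean((y[te] - pred) ** 2))
        wins[min(rmses, key=rmses.get)] += 1
    return {d: wins[d] / n_sims for d in degrees}

low = cv_best_degree(noise_sd=0.5)    # temperature explains most of y
high = cv_best_degree(noise_sd=10.0)  # low r-squared, as with GDP growth
print("low noise: ", low)
print("high noise:", high)
```

With low noise the cubic wins most of the time; with high noise the "winner" scatters across the non-linear degrees (and occasionally even the linear one), which is the overfitting-to-noise problem described above.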

To be clear, the point we're making in this section is just about functional form, not about growth versus levels.  Even for this more narrow task in which cross validation is potentially appropriate, it does not end up being a useful tool because the model overfits to noise in the training set.

This exercise does illustrate an important point, however:  right now the data are consistent with a bunch of potential functional forms that we can't easily distinguish.  We argue in BHM that there is pretty consistent evidence that the temperature/growth relationship is non-linear and roughly quadratic at the global level, but we certainly can't rule out higher order polynomials and we say so in that paper.  

Wrapping up. So where does this leave us?  Techniques like cross validation certainly still have utility for other approaches to causal inference questions (e.g. in selecting suitable controls for treated units in synthetic control settings), and there might be opportunities to apply those approaches in this domain.  Similarly, we fully agree with NPS's broader point that using data-driven approaches to make key analytical decisions in climate impacts (or any other empirical) paper should be our goal.  But in this particular application, we don't think that the specific approach taken by NPS has improved our understanding of climate/economy linkages.  We look forward to working with them and others to continue to make progress on these issues. 

Friday, September 11, 2020

Indirect mortality from recent wildfires in CA

[This post is joint with Sam Heft-Neal]

Wildfire activity on the US West Coast has been unprecedented in the last month, with fires burning larger and faster than ever experienced.  Multiple communities in the paths of these fires have been entirely destroyed, and wildfire smoke has blanketed huge swaths of the West Coast for weeks.

While media coverage of fire impacts has justifiably focused on the lives immediately lost to the fires, the total costs in terms of human lives and health are likely far larger, due to the immense amount of smoke that has been inhaled over the last 3 weeks by the very large number of people living on the West Coast.

How large might these effects be?  We know a lot about how exposure to certain air pollutants -- in particular, PM2.5 -- affects a range of health outcomes, including mortality.  Recent wildfire activity has led to a massive increase in PM2.5 above normal levels.  As anyone who lives in CA or has watched the news knows, air quality has been terrible, and the monitoring data of course bear this out. Below is a plot of the deviation of daily PM2.5 in 2020 from the previous 5yr average, beginning Aug 1 and going through Sep 10th 2020.  This is based on station data from EPA AirNow, and we average over each reporting station in each location on a given day. 

Difference in PM2.5 on each day, 2020 relative to 2015-19 average. Data are from EPA AirNow, and represent averages over all stations reporting in each area.

For most locations in CA, PM2.5 was 10-50ug/m3 higher than normal on many days, which is a massive increase above average (which in the Bay Area is typically ~10ug).  For simplicity, we are going to attribute all the excess PM2.5 in Aug/Sept 2020 (relative to the 2015-19 average) to wildfires. While this is probably not completely correct, it's not outlandish: the huge observed spikes were definitely due to the wildfires.  Before about Aug 15, PM2.5 was running somewhat below recent averages across much of CA; fires kicked up in mid-August, and PM2.5 levels shot through the roof.

What are the health consequences of this change in exposure? The best paper on the short-run mortality and morbidity consequences of PM exposure is a recent (very impressive) paper by Deryugina et al published in the American Economic Review.  Using very detailed Medicare data, they estimate that a +1ug PM2.5 increase on one day increases deaths over the next three days by 0.7 per 1 million people aged 65+, and increases ER visits among the elderly by 2.7 per million people.  We emphasize that these results are for all PM2.5, not just PM2.5 from smoke; the literature is still undecided on whether the latter has differential health effects, so we will just assume here that the effects are similar.

For simplicity, let's say total PM2.5 values (excess PM2.5 since Aug 1, summed over days) were 300ug/m3 above normal across CA due to the wildfires. This will be a little too high for LA, but will be way too low for most of NorCal and the central valley. 

There are about 6 million people aged 65+ in CA. Applying the Deryugina et al estimates to the change in PM and the exposed population, we arrive at 1200 excess deaths (deaths that would not have happened otherwise) and 4800 additional ER visits among the elderly.  If we use the Deryugina et al estimates of how much one day of additional PM2.5 increases mortality over the next month (not just next three days), the estimated number of deaths rises to 3000.  This is just in CA alone!  And just for people aged 65+.  Oregon and Washington are being hit very hard right now too, and non-elderly are also surely affected.  So this is likely a substantial lower bound on total health costs. 
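The arithmetic behind these numbers is simple enough to lay out explicitly (inputs as quoted above; the unrounded totals are ~1,260 deaths and ~4,860 ER visits, which the text rounds down):

```python
# Back-of-envelope: excess deaths and ER visits among CA residents 65+,
# applying the Deryugina et al. 3-day estimates quoted above. The 300 ug/m3
# figure is the assumed cumulative excess PM2.5 (summed over days) since Aug 1.
excess_pm = 300                    # ug/m3-days of excess PM2.5
pop_65plus_millions = 6            # CA residents aged 65+
deaths_per_ug_per_million = 0.7    # deaths over 3 days per +1ug on one day
er_visits_per_ug_per_million = 2.7

excess_deaths = excess_pm * deaths_per_ug_per_million * pop_65plus_millions
excess_er = excess_pm * er_visits_per_ug_per_million * pop_65plus_millions
print(round(excess_deaths), round(excess_er))
```

Swapping in the one-month (rather than 3-day) mortality response, which Deryugina et al find to be roughly twice as large, scales the death figure up accordingly.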

There are a number of caveats to this calculation that are important to consider. In most cases they suggest these numbers above could be conservative. Note that we will never know the "true" number because we don't observe the counterfactual -- i.e what would have happened in the absence of the dramatic decline in air quality.  But the purpose of the Deryugina et al paper is to get us as close as we could possibly get to that counterfactual.  Some caveats:

  • Again, we attribute all the change in PM2.5 in 2020 (relative to 5-yr avg) as due to wildfire.  While this is probably not fully correct, the big spikes were definitely from wildfires.  Further analysis will help clarify this. 
  • PM2.5 from wildfire smoke might not have the same effect as overall PM2.5 (only some of which comes from wildfire).  The science isn't clear on this yet.
  • We are in the midst of the COVID pandemic, and this has as-yet-unknown implications for our understanding of how air pollution affects health outcomes. Early evidence seems to suggest that poor air quality could worsen COVID-related outcomes, and if that's the case then our numbers above could be lower bounds.
  • Numbers above only look at elderly mortality and ER visits, and it's likely that other groups are very substantially affected (e.g. the very young, those with pre-existing cardiovascular or respiratory conditions).  Again, the above numbers are likely a dramatic lower bound on overall health consequences for that reason.
  • We don't know a ton about health responses at super high exposures.  Does it tail off?  Does it get way worse?  You can find papers telling you both.  Right now we're assuming response is linear, which is consistent with a lot of the literature. 
  • We also don't have a full picture of "displacement" - i.e. if someone passes away due to air pollution exposure, maybe they were already extremely sick and so would have passed away in the near term anyway.  Deryugina et al explore this to some degree by looking at 3-day (and longer) responses to a single day of pollution increase.  They find effects of 1-day exposure over the next month to be about twice as big as the 3-day estimates we use, so again this could suggest our numbers are a lower bound. 
  • Effects could also amplify over time, as it's probably the case that exposure to 7 days in a row of terrible air is worse than exposure to 7 days spread out.  Our numbers do not consider this, nor do we know of good estimates in the literature that would help with this. Please point them out if so!

These overall effects can be in large part attributable to climate change, which has dramatically increased the likelihood and severity of wildfire.  Understanding this pathway of climate impacts is, we think, underappreciated, and something that we're working on (along with lots of others!).

Monday, August 31, 2020

Seminar talk on Spatial First Differences

Hannah Druckenmiller and I wrote a paper on a new method we developed that allows you to do cross-sectional regressions while eliminating a lot (or all) of the omitted variables bias. We are really excited about it, and it seems to be performing well in a bunch of cases.

Since people seem to like watching TV more than reading math, here's a recent talk I gave explaining the method to the All New Zealand Economics Series. (Since we don't want to discriminate against the math-reading-folks-who-don't-want-to-read-the-whole-paper, we'll also post a blog post explaining the basics...)

If you want to try it out, code is here.

Thursday, June 11, 2020

Public database of non-pharmaceutical interventions for COVID-19

This week we published the new paper The effect of large-scale anti-contagion policies on the COVID-19 pandemic, which tries to estimate the effect of over 1,700 policies on the day-over-day growth rate of infections. We compiled subnational data across six countries (China, South Korea, Italy, Iran, France, and the USA) and tried to break down the effects of different overlapping policies.

One result of the paper is that, during the period of study, we estimate there would have been roughly 500 million more COVID-19 infections by early April, across these six countries, in the absence of policy. We get to this number by linking up some reduced-form econometric tools from GDP-growth modeling with SIR/SEIR models from epidemiology. If you don't feel like reading a long paper, you can just watch these GIFs of China and Italy (more are here).  Left is what actually happened (area of red circles = cumulative confirmed cases), right is a "no policy" simulation:


Or if you are more serious about research but still don't want to read the paper, we have two videos explaining the paper here -- one is a presentation at the HELP Seminar, the other is a 3 min summary that involves a lot of figurative and literal arm-waving.

The hardest part of the study was actually putting together an original standardized dataset of all 1,717 NPIs. We have posted all the data here (alongside the code). We are hoping that people will use this new data set for other projects and welcome feedback and/or comments, especially if you think we missed something (much of this was collected by hand, so it is a dataset that will certainly improve over time). The dataset is detailed in the Supplementary Information of the paper. If you use the data, please just cite the article as the source.

We constructed the data set to be at the same spatial resolution as the most finely resolved case data we could obtain, which was the second administrative level for Italy and China (see figs above), and the first admin level for everyone else. The policies were actually deployed at a variety of admin levels, depending on the country and policy. Here's the table from the appendix summarizing the policies in the data set:

We also worked very hard to standardize policy definitions as much as possible across different countries, e.g. we tried to make "business closure" or "no gathering" mean something similar across different countries, but there is substantial nuance and some things could not be standardized, so please make sure to consult the documentation.

Note that there is also variation in the intensity of policies in the data set. This comes from the fact that many policies are encoded as composites (e.g. there are multiple policies that make up a single policy definition) and also because many policies are not deployed uniformly across an entire administrative unit, so we computed the population-weighted "exposure" to each policy (e.g. if only half the population of a US state adopted a policy, since many are set at the county level, this policy got coded as one-half). Here's a plot of policy intensity over time for the US sample (other plots are available here, but weren't in the paper because we ran out of our figure allotment):

 non-pharmaceutical interventions over time in the USA (click to enlarge) 
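The population-weighted exposure coding described above can be sketched in a few lines (the sub-units and populations here are entirely made up for illustration):

```python
# Hypothetical sketch: a state-level policy variable equals the population
# share of sub-units (e.g. counties) that have adopted the policy.
counties = [
    {"pop": 500_000, "adopted": True},
    {"pop": 300_000, "adopted": False},
    {"pop": 200_000, "adopted": True},
]

total_pop = sum(c["pop"] for c in counties)
exposure = sum(c["pop"] for c in counties if c["adopted"]) / total_pop
print(exposure)  # 0.7: 70% of the state's population is under the policy
```

So a policy adopted by counties covering half a state's population gets coded as one-half for that state, as described above.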

Several folks (including a journal referee and someone at the DOD) asked if we planned to keep collecting this data and making it public. In principle, we would love to do that, but the entire team did the whole project pro bono and have since had to go back to our normal jobs. That said, if someone reading this wants to support or otherwise enable expansion of this data for the public good, you know where to find me, since I won't be going anywhere for a while...

Let us know if you use the data -- we'd be excited to hear whether it's useful.

Sunday, March 8, 2020

COVID-19 reduces economic activity, which reduces pollution, which saves lives.

COVID-19 is a massive global economic and health challenge, having caused >3500 global deaths as of this writing (Mar 8) and untold economic and social disruption.  This disruption is only likely to increase in coming days in regions where the epidemic is just beginning. Strangely, this disruption could also have unexpected health benefits -- and these benefits could be quite large in certain parts of the world.  Below I calculate that the reductions in air pollution in China caused by this economic disruption likely saved twenty times more lives in China than have currently been lost directly due to infection with the virus in that country.

None of my calculations support any idea that pandemics are good for health. The effects I calculate just represent health benefits from the air pollution changes wrought by the economic disruption, and do not account for the many other short- or long-term negative consequences of this disruption on health or other outcomes; these harms likely vastly exceed any health benefits from reduced air pollution.  [Edit: added this paragraph 3/16 to emphasize the lessons from my findings, in case readers don't make it further..]

Okay, on to the calculation.

A few weeks ago, NASA published striking satellite images of the massive reduction in air pollution (specifically, NO2) over China resulting from the economic slow-down in that country following its aggressive response to COVID-19.

Source: NASA
Separate analyses indeed found that ground-based concentrations of key pollutants -- namely PM2.5 -- fell substantially across much of the country.  These reductions were not uniform.  In northern cities such as Beijing, where much of wintertime pollution comes from winter heating, reductions were absent.  But in more southern cities such as Shanghai and Wuhan where wintertime pollution is mainly from cars and smaller industry, pollution declines appeared to be dramatic.

Given the huge amount of evidence that breathing dirty air contributes heavily to premature mortality, a natural -- if admittedly strange -- question is whether the lives saved from this reduction in pollution caused by economic disruption from COVID-19 exceeds the death toll from the virus itself.  Even under very conservative assumptions, I think the answer is a clear "yes".

Here are the ingredients to doing this calculation [Edit: see additional details I added at end of this post, if really wanting to nerd out].  First we need to know how much the economic disruption caused by COVID-19 in China reduced air pollution in the country.  The estimates already discussed suggest about a 10ug/m3 reduction in PM across China in Jan-Feb of 2020 relative to the same months in the previous 2 years (I get this from eyeballing the figure mid-way down the blog post).  To confirm these results in independent data, I downloaded hourly PM2.5 data from US government sensors in four main cities in China (Beijing, Shanghai, Chengdu, and Guangzhou).  I calculated the daily average (and trailing 7-day mean) of PM2.5 since Jan 2016, and then again compared Jan-Feb 2020 to all other Jan-Feb periods in these cities.

I find results that are highly consistent with the above analysis.  Below is a plot of the data in each city. The blue line is the 7-day average in 2020, and the thick red line is the same average over the years 2016-19 (thin red lines are each individual year in that span).  On average across these four cities, I find an average daily reduction of 15-17ug/m3 PM2.5 across both Jan-Feb 2020 relative to the average in the previous four years.  This average difference is highly statistically significant.  As in the above analysis, Beijing appears to be the outlier, with no decline in PM relative to past years when analyzed on its own.

PM2.5 concentrations in four Chinese cities in Jan-Feb 2016-2019 (red lines) vs same period in 2020 (blue lines). All cities except Beijing show substantial overall reductions during the period. 

To be very conservative, let’s then assume COVID-19 reduced PM2.5 on average by 10ug/m3 in China, that this effect lasted only 2 months, and that only some urban areas (and no rural areas) experienced this change (see below).

Next we need to know how these changes in air pollution translated into reductions in premature mortality.  Doing this requires knowing three numbers:
  1. The change in the age specific mortality rate per unit change in PM2.5.  For this we use estimates from He et al 2016, who studied the effects of air quality improvements around the Beijing Olympics on mortality rates -- a quasi-experimental setting on air quality not unlike that induced by COVID-19.  He et al found large reductions in mortality for children under five and for adults over 65, with monthly under-5 mortality increasing 2.9% for every 1ug/m3 increase in PM2.5, and about 1.4% for people over 70 years old (note:  they measure impacts using PM10, and to convert to PM2.5 we use the common assumption in the literature that PM2.5 = 0.7*PM10 and that all impacts of PM10 are through the PM2.5 component). Estimates from the He et al paper are broadly consistent with other quasi-experimental studies on the effects of air pollution on infant health, including estimates from the US, Turkey, and Africa (see Fig 5b here).  As a more conservative approach, we also calculate mortality using 1% increase in mortality per 1ug increase in PM2.5 for both kids and older people.  In all cases, we very conservatively assume that no one between the ages of 5 and 70 is affected.  
  2. The baseline age-specific mortality rates for kids and older people, which we take from Global Burden of Disease estimates.  I note a large apparent discrepancy between the GBD numbers, which give an under-5 mortality rate of 205 per 100,000 live births in China in the most recent year, and the WB/Unicef estimate, which puts the number at 860 per 100,000 live births (4x higher!).   I use the much smaller GBD estimate in my calculations to be conservative, but someone should reconcile these...  
  3. The total population affected by the changes in air pollution.  For this we take the total Chinese population as estimated by the UN, and conservatively assume that only 50% of the population are affected by the change in air quality.  Since roughly 60% of Chinese live in urban areas, this is like assuming there is no effect in rural areas and some urban areas remain unaffected. 

Putting these numbers together [see table below for details] yields some very large reductions in premature mortality.  Using the He et al 2016 estimates of the impact of changes in PM on mortality, I calculate that 2 months of 10ug/m3 reductions in PM2.5 likely saved the lives of 4,000 kids under 5 and 73,000 adults over 70 in China.  Using even more conservative estimates of a 10% reduction in mortality per 10ug change, I estimate 1,400 under-5 lives saved and 51,700 over-70 lives saved.  Even under these more conservative assumptions, the lives saved due to the pollution reductions are roughly 20x the number of lives that have been directly lost to the virus (based on March 8 estimates of 3,100 Chinese COVID-19 deaths, taken from here).

What's the lesson here?  It seems clearly incorrect and foolhardy to conclude that pandemics are good for health. Again I emphasize that the effects calculated above are just the health benefits of the air pollution changes, and do not account for the many other short- or long-term negative consequences of social and economic disruption on health or other outcomes; these harms could exceed any health benefits from reduced air pollution.  But the calculation is perhaps a useful reminder of the often-hidden health consequences of the status quo, i.e. the substantial costs that our current way of doing things exacts on our health and livelihoods.

Might COVID-19 help us see this more clearly?  In my narrow academic world -- in which most conferences and academic meetings have now been cancelled due to COVID-19, and where many folks are cancelling talk invitations and not taking flights -- it might be a nice opportunity to re-think our production function with regard to travel.  For most (all?) of us academics, flying on airplanes is by far the most polluting thing we do.  Perhaps COVID-19, if we survive it, will help us find less polluting ways to do our jobs.  More broadly, the fact that disruption of this magnitude could actually lead to some large (partial) benefits suggests that our normal way of doing things might need disrupting.

[Thanks to Sam Heft-Neal for helping me double check these calculations; errors are my own].

Edited 3/9 to add additional details below on calculations and assumptions. 

Here are the actual values and math I used, either using the He et al 2016 estimates of the PM/mortality relationship or the more conservative estimate of a 1% change in mortality per 1 ug/m3 of PM2.5.  To calculate the % change in the mortality rate from the observed change in PM2.5, I multiply (a) by (c).  This number is then multiplied by the baseline monthly mortality rate in (d) to get the total change in the mortality rate, which is then multiplied by the total affected population in each age group (e*f) and by the two-month duration to get the totals in (g).

Column key:
(a) reduction in monthly PM2.5 (ug/m3)
(b) % change in monthly mortality rate per 10 ug/m3 PM10
(c) % change in mortality per 1 ug/m3 PM2.5
(d) baseline monthly mortality rate per 100,000, by age group
(e) population in each age group (100,000s)
(f) share of population affected
(g) two-month reduction in mortality

Using He et al 2016 PM10 estimates in (b):

| age group | (a) | (b) | (c) | (d)   | (e)   | (f) | (g)     |
|-----------|-----|-----|-----|-------|-------|-----|---------|
| under 5   | -10 | 20  | 2.9 | 17.1  | 839.3 | 0.5 | -4,097  |
| over 70   | -10 | 10  | 1.4 | 523.8 | 981.1 | 0.5 | -73,421 |

Using the 1%/ug PM2.5 estimate in (c):

| age group | (a) | (b) | (c) | (d)   | (e)   | (f) | (g)     |
|-----------|-----|-----|-----|-------|-------|-----|---------|
| under 5   | -10 | --  | 1.0 | 17.1  | 839.3 | 0.5 | -1,434  |
| over 70   | -10 | --  | 1.0 | 523.8 | 981.1 | 0.5 | -51,395 |


To get the estimates in (a), I use daily PM2.5 data for four Chinese cities going back to Jan 1, 2016.  I define a COVID-19 treatment dummy equal to one if the year==2020, and then run a panel regression of daily PM2.5 on the treatment dummy plus city-by-day-of-year fixed effects.  The sample is all days from January to early March.  Basically, this calculates the daily PM2.5 difference between 2020 and 2016-19, averaged across all sites and days.  I can run this on the daily data or on the 7-day running mean, and with city-by-d.o.y. fixed effects or city + d.o.y. fixed effects, and I get answers that are all between 15 and 18 ug/m3 reductions in PM2.5.  We very conservatively round this number down to 10 in column (a). 
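The fixed-effects comparison described above can be sketched with synthetic data. Everything below is invented for illustration -- the cities, the noise model, and the "true" effect of -17 ug/m3 (the real analysis uses observed PM2.5 readings) -- and rather than calling a panel-regression package, I absorb the city-by-day-of-year fixed effects by demeaning within each city-by-d.o.y. cell (Frisch-Waugh) and regressing the residuals on the treatment dummy.

```python
import numpy as np

# Synthetic panel: 4 cities x 60 days-of-year x 5 years (2016-2020).
rng = np.random.default_rng(0)
cities = ["A", "B", "C", "D"]
years = [2016, 2017, 2018, 2019, 2020]
true_effect = -17.0  # assumed "COVID-19 treatment" effect on PM2.5 (ug/m3)

rows = []
for c_idx, c in enumerate(cities):
    for yr in years:
        for d in range(1, 61):
            # city-by-day-of-year baseline level, plus noise
            baseline = 80 + 10 * c_idx + 5 * np.sin(d / 10)
            pm = baseline + (true_effect if yr == 2020 else 0) + rng.normal(0, 8)
            rows.append((f"{c}-{d}", int(yr == 2020), pm))

cell = np.array([r[0] for r in rows])          # city-by-d.o.y. cell labels
D = np.array([r[1] for r in rows], dtype=float)  # treatment dummy (year==2020)
y = np.array([r[2] for r in rows])               # daily PM2.5

# Absorb the fixed effects by demeaning within each cell, then regress
# residualized PM2.5 on the residualized dummy.
for g in np.unique(cell):
    m = cell == g
    D[m] -= D[m].mean()
    y[m] -= y[m].mean()
beta = (D @ y) / (D @ D)
print(round(beta, 1))  # recovers something close to the true -17
```

With city-by-d.o.y. fixed effects, the coefficient is just the 2020-vs-prior-years PM2.5 difference averaged across cells, which is why the estimate lands near the assumed effect.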

What are some of the key assumptions in my overall analysis?  There are lots:
  • This is only the partial effect of air pollution; it is by no means the overall effect of COVID-19 on mortality.  Indeed, the broader disruption caused by COVID-19 could cause many additional deaths that are not directly attributable to being infected with the virus -- e.g. due to declines in households' economic wellbeing, or to the difficulty in accessing health services for non-COVID illnesses.  Again, I am absolutely not saying that pandemics are good for health.  
  • The key assumption in using the He et al 2016 estimates (or the 1%/1ug conservative summary estimate from quasi-experimental studies) is that changes in outdoor PM2.5 concentrations are a sufficient statistic for measuring health impacts.  One worry is that, instead of being at work, people are staying at home, and that indoor air pollution at home is worse than what they would have been exposed to otherwise.  While it is true that indoor air pollution can be incredibly high in homes that rely on biomass burning for cooking and heating, existing evidence suggests that even in cold regions, urban Chinese residents probably have better air quality inside their homes than outside them.  E.g. see here.  So the key question to me is whether other behaviors changed in response to COVID-19 that made individuals' exposures look a lot different than they did in the Beijing setting in He et al 2016.  Maybe there is a case to be made there, but I am not sure what it is.  There's a prima facie case that the Beijing setting looks a lot like what we're worried about here:  an acute 2-month event that led to large but temporary changes in air pollution.  
  • Is mortality from COVID-19 interacting with mortality from PM?  One possibility is that there are enough COVID-19 infections to actually make people more susceptible to the negative impacts of air pollution.  But this would increase rather than decrease deaths from PM!  The other possibility is that the people who have very sadly passed away from COVID would have been those most likely to pass away from PM exposure.  But even if this were 100% true, it would only account for ~5% of the overall predicted mortality.  
  • We are pinning the entire PM2.5 difference between Jan-Feb 2020 and Jan-Feb 2016-19 on COVID-19.  (Note that we estimate a 17ug/m3 difference and have rounded that down to 10 ug/m3, so in effect we are only pinning about 60% of the change on COVID.)  Without more careful analysis, we can't really know whether this is fair or not.  Maybe something else very big was going on at exactly the same time in China?  That seems unlikely, but my analysis has nothing to say about it. 
  • My estimates are a prediction of mortality impacts, not a measurement.  They are not proof that anything has happened.  In a few years, there will likely be enough data to actually try to measure the overall effect of the COVID-19 epidemic on health outcomes.  You could compare changes in mortality in high-exposure locations to changes in mortality in lower-exposure locations, for instance.  But this study is not yet possible, as the epidemic is still underway and comprehensive all-cause mortality data are not yet (to my knowledge) available.