Monday, October 19, 2020

The GDP-temperature relationship - some thoughts on Newell, Prest, and Sexton 2018

Quantifying the potential effects of climate on aggregate economic outcomes like GDP is a key part of understanding the broader economic impacts of a changing climate.  Lots of studies now use modern panel econometrics to study how past fluctuations in temperature have affected GDP around the world, including the landmark 2012 paper by Dell, Jones, and Olken (henceforth DJO), our 2015 paper (henceforth BHM), and a number of more recent papers that either analyze new data on this topic (e.g. subnational GDP or income data, see here or here or here) or revisit earlier results. 

Newell, Prest, and Sexton 2018 (draft here, henceforth NPS) is one recent paper that revisits our earlier work in BHM. Because we've gotten a lot of questions about this paper, Sol and I wanted to share our views of what we think we can learn from NPS.  In short, our view is that their approach is not suitable to the question they are trying to ask, and that their conclusions as stated in their abstract (at least in their current draft) are directly contradicted by their results.  This is important because their conclusions appear to shed important light on the aggregate economic impacts of unmitigated climate change. Unfortunately we do not believe that this is the case. 

NPS seek to take a data-driven approach to resolving a number of empirical issues that come up in our earlier work.  These include: (1) Does temperature affect the level or growth rate of GDP? (2) What is the "right" set of fixed effects or time trends to include as controls in analyzing the effect of temperature on GDP? (3) And what is the "right" functional form to describe the relationship between temperature and GDP changes?  These choices -- particularly (1) and (3) -- are certainly influential in the findings of earlier studies.  If temperature affects growth rates rather than levels, this can imply huge effects of future temperature increases on economic output, as small impacts on growth aggregate over time.  If temperature has a globally nonlinear effect on output, as we argue in BHM, this suggests that both wealthy and poor countries can be affected by changes in temperature -- and not only poor countries, as suggested in earlier analyses.  Resolving these questions is key to understanding the impacts of a warming climate, and it's great that papers like NPS are taking them on.

However, we have some serious qualms with the approach that NPS take to answer these questions.  NPS wish to use cross-validation to select which set of the above choices performs "best" in describing historical data.  Cross-validation is a technique in which available data are split between disjoint training and testing datasets, and candidate models are trained on the training data and then evaluated on the held-out test data using a statistic of interest (typically RMSE or r-squared in these sorts of applications).  The model with the lowest RMSE or highest r-squared on test data is then chosen as the preferred model. 

Inference, not prediction.  This technique works great when your goal is prediction.  But what if your goal is causal inference, i.e., in our case, isolating variation in temperature from other correlated factors that might also affect GDP?  It's not clear at all that models that perform the best on a prediction task will also yield the right causal results.  For instance, prices for hotel rooms tend to be high when occupancy rates are high, but only a foolish hotel owner would raise prices to increase occupancy (h/t Susan Athey who I stole this example from).  A good predictive model can get the causal story wrong. 

This is clearly relevant in the panel studies of temperature on GDP.  In existing work, great care is taken to choose controls that account for a broad range of time-invariant and time-varying factors that might be correlated with both temperature and GDP.  These typically take the form of unit or time fixed effects and/or time trends. Again the goal in including these is not to better predict GDP but to isolate variation in temperature that is uncorrelated with other factors that could affect GDP, in order to identify the causal effect of temperature on GDP.  The chosen set of controls constitute the paper's "identification strategy", and in this fixed effect setup unfortunately there is no clear data-driven approach -- including cross-validation -- for selecting these controls.  The test for readers in these papers is not:  do the authors predict GDP really well?  It is instead: do the authors plausibly isolate the role of temperature from other confounding factors?

Growth vs level effects.  The main question NPS are asking is whether the causal effect of a temperature shock has "level effects", where the economy is hurt in one year but catches up in the next year, or "growth effects", where the economy is permanently smaller.  This would require an identification strategy, which is what most of the literature has focused on following the method artfully outlined by DJO:  isolate the role of temperature from other confounding factors using choices about fixed effects that most plausibly achieve this unconfounding, and then distinguish growth from level effects by looking at the sum of contemporaneous and lagged effects.  If the sum is zero, this is evidence of level effects, and if it's not zero, evidence of growth effects.   The current manuscript does not have an identification strategy for measuring these lagged effects, and instead is using goodness of fit metrics to draw causal inferences about the magnitude of these effects.  The tool they are using is again not commensurate with the question they are asking. 
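
To make the summing logic concrete, here is a minimal sketch in Python. The function name `gdp_ratio`, the 2% baseline growth rate, and the coefficient values are all made up for illustration; none of them come from DJO, BHM, or NPS. The sketch traces GDP after a one-time temperature shock, relative to a no-shock path, given a contemporaneous and a lagged effect on the growth rate.

```python
def gdp_ratio(betas, years=6, shock_year=2, base_growth=0.02):
    """GDP relative to a no-shock baseline after a one-time temperature shock.

    betas[k] is the (hypothetical) change in the growth rate k years after
    the shock; betas[0] is the contemporaneous effect.
    """
    shocked, baseline = 1.0, 1.0
    for t in range(years):
        lag = t - shock_year
        effect = betas[lag] if 0 <= lag < len(betas) else 0.0
        shocked *= 1 + base_growth + effect
        baseline *= 1 + base_growth
    return shocked / baseline


# Illustrative numbers only: a 1-point hit to growth in the shock year.
level_case = gdp_ratio([-0.01, 0.01])  # lag undoes the hit: coefficients sum to 0
growth_case = gdp_ratio([-0.01, 0.0])  # no rebound: coefficients sum to -0.01

print(f"level effect:  GDP ends at {level_case:.4f} of baseline")
print(f"growth effect: GDP ends at {growth_case:.4f} of baseline")
```

When the coefficients sum to zero, GDP returns almost exactly to the baseline path; when they do not, the gap persists in every later year.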

Errors in interpretation. These conceptual issues aside, the authors' conclusions in the abstract and introduction of their paper about the results from their cross validation exercise do not appear consistent with their actual findings as reported in the tables.  The authors conclude that "the best performing models have non-linear temperature effects on GDP levels", but the authors demonstrate no statistically distinguishable differences in results between levels and growth models in their main tables (Table 2 and 3, and A1-A4), nor between linear and non-linear models.  This directly contradicts the statement in their abstract. 

To be precise, the authors state in the abstract "The best-performing models have non-linear temperature effects on GDP levels."  But then on page 27 they clearly state: "The MCS ["model confidence sets", or the set of best performing models whose performance is statistically indistinguishable from one another], however, does not discern among temperature functional forms or growth and level effects." This is in reference to Table 2, reproduced below; models in the MCS are denoted with asterisks, and a range of both growth and levels models have asterisks, meaning their performance cannot be statistically distinguished from one another. 

So, again, the paper's abstract is not consistent with its own stated results.  They do find that the model with quadratic time trends (as used by BHM) is outperformed by models without country time trends -- but again, the purpose of those time trends is to remove potential confounding variation, not to perfectly predict GDP. See here for a simple look at the data on whether individual countries' growth rates have differential time trends that might make you worried about time-trending unobservables at a country level [spoiler: yes they do].

Errors in implementation.  Even if cross validation was the right tool here, the authors make some non-standard choices in how the CV is implemented which we again believe make the results very hard to interpret.  Instead of first splitting the data between disjoint train and test sets, they first transform the data by regressing out the fixed effects, and then split the residualized data into train and test.  But the remaining variation in the residualized data will be very different based on what set of fixed effects have been regressed out, and this will directly affect resulting estimates of the RMSE.  It is thus not a surprise that models with region-year FE have lower RMSE than models with year FE (in models with no time trends) -- region-year FE's almost mechanically take out more of the variation.  But this means you can't meaningfully compare RMSE's across different fixed effects in the way that they are doing -- you are literally looking at 2 different datasets with 2 different outcomes. You can only in principle compare functional forms within a given choice of FE.  

Imagine the following: you have outcome Y, predictor X, and covariates W and Z.  Z is a confound: correlated with both X and Y.  W is correlated with X but not with Y.

In version 1 you partial Z out of both X and Y, and generate residualized values Y_1 and X_1.

In version 2 you partial W out of both X and Y, and generate residualized values Y_2 and X_2. 

This is in effect what NPS do, and then they want us to compare Y_1 = f(X_1) versus Y_2 = f(X_2).  But this clearly doesn't make sense, because Y_2 and X_2 still have the confounding variation of Z in them, and Y_1 and Y_2 are no longer measuring the same thing. So comparing the predictive power of f(X_1) vs f(X_2) is not meaningful.  It is also not how cross validation is supposed to work - instead, we should be comparing predictive performance on the same Y variables. 
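
A small simulation can illustrate the problem. This is a sketch under assumptions of our own choosing, not a reproduction of anything in NPS: the data-generating process, the coefficient values, the sample size, and the helper names (`ols`, `residualize`, `rmse`) are all hypothetical. Here W shifts X but has no effect of its own on Y, Z affects both X and Y, and the true effect of X on Y is 1.

```python
import random


def ols(y, x):
    """Intercept and slope from a one-regressor least-squares fit."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def residualize(y, x):
    """Partial x out of y: return the residuals of y regressed on x."""
    a, b = ols(y, x)
    return [yi - a - b * xi for yi, xi in zip(y, x)]


def rmse(v):
    return (sum(e * e for e in v) / len(v)) ** 0.5


rng = random.Random(0)
n = 20000
Z = [rng.gauss(0, 1) for _ in range(n)]  # confound: moves both X and Y
W = [rng.gauss(0, 1) for _ in range(n)]  # moves X, no effect of its own on Y
X = [z + w + rng.gauss(0, 1) for z, w in zip(Z, W)]
Y = [x + 2 * z + rng.gauss(0, 1) for x, z in zip(X, Z)]  # true effect of X is 1

results = {}
for name, control in (("Z", Z), ("W", W)):
    y_r, x_r = residualize(Y, control), residualize(X, control)
    slope = ols(y_r, x_r)[1]
    fit_rmse = rmse(residualize(y_r, x_r))
    results[name] = (slope, rmse(y_r), fit_rmse)
    print(f"partial out {name}: slope={slope:.2f}, "
          f"spread of residualized Y={rmse(y_r):.2f}, RMSE={fit_rmse:.2f}")
```

In this setup the two versions are scored against outcomes with very different spreads, so their RMSEs are not on a common footing, and the slope in the version that partials out W still carries the confounding variation from Z.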

Further hazards in cross validation. While we don't agree that CV can be used to select the fixed effects in this setting, we do agree with NPS that cross validation could in principle be used to identify the "right" functional form that relates temperature to GDP (conditional on chosen controls), e.g. is the relationship linear? quadratic? something fancier?  This works because the causal part has already been taken care of by the fixed effects (or so one hopes), and so what's left is to best describe the functional form of how x relates to y.  Cross validation for this purpose has been successfully implemented in settings in which temperature explains a lot of the variation in the outcome -- e.g. in studies of agricultural productivity (see Schlenker and Roberts 2009 for an early example in this literature).  

But unfortunately when it comes to GDP, while temperature has a strong and statistically significant relationship to GDP in past work, it does not explain a lot of the overall interannual variation in GDP;  GDP growth is noisy and poorly predicted even by the hotshot macro guys.  In this low r-squared environment, selecting functional form by cross validation can be difficult and perhaps hazardous.  It's too easy to overfit to noise in the training set.

To see this, consider the following simulation in which we specify a known cubic relationship between y and x and then try to use cross validation to recover the "right" order of polynomial, and study our ability to do so as we crank up the noise in y.  We do this a bunch of times, each time training the regression model on 80% of the data and testing on 20%. We calculate RMSE on the test data and compute average % reduction in RMSE relative to a model with only an intercept.  As shown in the figure below, we can mostly reject the linear model but have a lot of trouble picking out the cubic model from anything else non-linear, particularly in settings where the overall explained variation in y is small.  Given that temperature explains <5% of the variation in GDP growth rates (analogous to the far right grouping of bars in each plot), cross validation is going to really struggle to pick the "right" functional form. 
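
For readers who want to try this kind of exercise themselves, here is a minimal sketch in plain Python. It is not the code behind the figure: the cubic coefficients, the range of x, the sample size, the number of repetitions, and the three noise levels are all assumptions chosen for illustration (the largest noise level leaves the cubic explaining a bit under 5% of the variance in y), and the helper names are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_poly(xs, ys, degree):
    """Least-squares polynomial coefficients, constant term first."""
    k = degree + 1
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


def rmse(coefs, xs, ys):
    """Root mean squared prediction error of a fitted polynomial."""
    sq = [(y - sum(c * x ** i for i, c in enumerate(coefs))) ** 2
          for x, y in zip(xs, ys)]
    return math.sqrt(sum(sq) / len(sq))


def cv_gain(noise_sd, degrees=(1, 2, 3, 4, 5), n=200, reps=30, seed=0):
    """Average % reduction in test RMSE relative to an intercept-only model."""
    rng = random.Random(seed)
    gains = {d: 0.0 for d in degrees}
    for _ in range(reps):
        xs = [rng.uniform(-1, 1) for _ in range(n)]
        # Assumed "true" cubic; the coefficients are arbitrary.
        ys = [0.5 * x - x ** 2 + 1.5 * x ** 3 + rng.gauss(0, noise_sd) for x in xs]
        cut = int(0.8 * n)  # draws are i.i.d., so the first 80% is a random split
        xtr, ytr, xte, yte = xs[:cut], ys[:cut], xs[cut:], ys[cut:]
        base = rmse(fit_poly(xtr, ytr, 0), xte, yte)
        for d in degrees:
            gains[d] += 100 * (1 - rmse(fit_poly(xtr, ytr, d), xte, yte) / base) / reps
    return gains


results = {sd: cv_gain(sd) for sd in (0.1, 1.0, 4.0)}
for sd, gains in results.items():
    print(f"noise sd {sd}:", {d: round(g, 1) for d, g in gains.items()})
```

Comparing the printed rows across noise levels shows how the gaps between polynomial orders shrink as the signal is swamped by noise.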

To be clear, the point we're making in this section is just about functional form, not about growth versus levels.  Even for this more narrow task in which cross validation is potentially appropriate, it does not end up being a useful tool because the model overfits to noise in the training set.

This exercise does illustrate an important point, however:  right now the data are consistent with a bunch of potential functional forms that we can't easily distinguish.  We argue in BHM that there is pretty consistent evidence that the temperature/growth relationship is non-linear and roughly quadratic at the global level, but we certainly can't rule out higher order polynomials and we say so in that paper.  

Wrapping up. So where does this leave us?  Techniques like cross validation certainly still have utility for other approaches to causal inference questions (e.g. in selecting suitable controls for treated units in synthetic control settings), and there might be opportunities to apply those approaches in this domain.  Similarly, we fully agree with NPS's broader point that using data-driven approaches to make key analytical decisions in climate impacts (or any other empirical) paper should be our goal.  But in this particular application, we don't think that the specific approach taken by NPS has improved our understanding of climate/economy linkages.  We look forward to working with them and others to continue to make progress on these issues. 

Friday, September 11, 2020

Indirect mortality from recent wildfires in CA

[This post is joint with Sam Heft-Neal]

Wildfire activity on the US West Coast has been unprecedented in the last month, with fires burning larger and faster than ever experienced.  Multiple communities in the paths of these fires have been entirely destroyed, and wildfire smoke has blanketed huge swaths of the West Coast for weeks.

While media coverage on fire impacts has justifiably focused on the lives that have immediately been lost to the fire, the total costs in terms of human lives and health are likely far larger, due to the immense amount of smoke that has been inhaled over the last 3 weeks by the very large number of people living on the West Coast.  

How large might these effects be?  We know a lot about how exposure to certain air pollutants -- in particular, PM2.5 -- affects a range of health outcomes, including mortality.  Recent wildfire activity has led to a massive increase in PM2.5 above normal levels.  As anyone who lives in CA or has watched the news knows, air quality has been terrible, and the monitoring data of course bear this out. Below is a plot of the deviation of daily PM2.5 in 2020 from the previous 5yr average, beginning Aug 1 and going through Sep 10th 2020.  This is based on station data from EPA AirNow, and we average over each reporting station in each location on a given day. 

Difference in PM2.5 on each day, 2020 relative to 2015-19 average. Data are from EPA AirNow, and represent averages over all stations reporting in each area.

For most locations in CA, PM2.5 was 10-50ug/m3 higher than normal on many days, which is a massive increase above average (which in the Bay Area is typically ~ 10ug).  For simplicity, we are going to attribute all the excess PM2.5 in Aug/Sept 2020 (relative to 2015-19 average) to wildfires. While this is probably not completely correct, it's not outlandish: the huge observed spikes were definitely due to the wildfires.  Before about Aug 15, PM2.5 was running somewhat below recent averages across much of CA; fires kicked up in mid-August, and PM2.5 levels shot through the roof. 

What are the health consequences of this change in exposure? The best paper on the short run mortality and morbidity consequences of PM exposure is a recent (very impressive) paper by Deryugina et al published in the American Economic Review.  Using very detailed Medicare data, they estimate that a +1ug PM2.5 increase on one day increases deaths over the next three days by 0.7 per 1 million people aged 65+, and increases ER visits among the elderly by 2.7 per million people.  We emphasize that this result is for all PM2.5, not just PM2.5 from smoke; the literature is still undecided whether the latter has differential health effects, so we will just assume here that the effects are similar. 

For simplicity, let's say total PM2.5 values (excess PM2.5 since Aug 1, summed over days) were 300ug/m3 above normal across CA due to the wildfires. This will be a little too high for LA, but will be way too low for most of NorCal and the central valley. 

There are about 6 million people aged 65+ in CA. Applying the Deryugina et al estimates to the change in PM and the exposed population, we arrive at 1200 excess deaths (deaths that would not have happened otherwise) and 4800 additional ER visits among the elderly.  If we use the Deryugina et al estimates of how much one day of additional PM2.5 increases mortality over the next month (not just next three days), the estimated number of deaths rises to 3000.  This is just in CA alone!  And just for people aged 65+.  Oregon and Washington are being hit very hard right now too, and non-elderly are also surely affected.  So this is likely a substantial lower bound on total health costs. 
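
The back-of-envelope arithmetic for the three-day estimates can be written out as a short Python sketch. The variable names are ours; the inputs are the round numbers quoted above. The products come to about 1,260 deaths and 4,860 ER visits, which the text rounds down to 1200 and 4800.

```python
# Inputs quoted in the post (all are round numbers, not precise estimates).
excess_pm = 300         # ug/m3 of excess PM2.5 since Aug 1, summed over days
elderly_millions = 6    # people aged 65+ in CA, in millions
deaths_per_ug = 0.7     # 3-day deaths per million aged 65+ per +1 ug for one day
er_visits_per_ug = 2.7  # ER visits per million elderly per +1 ug for one day

# Linear dose-response: effect per ug x total excess ug x exposed population.
excess_deaths = excess_pm * deaths_per_ug * elderly_millions
excess_er_visits = excess_pm * er_visits_per_ug * elderly_millions

print(round(excess_deaths), round(excess_er_visits))
```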

There are a number of caveats to this calculation that are important to consider. In most cases they suggest these numbers above could be conservative. Note that we will never know the "true" number because we don't observe the counterfactual -- i.e what would have happened in the absence of the dramatic decline in air quality.  But the purpose of the Deryugina et al paper is to get us as close as we could possibly get to that counterfactual.  Some caveats:

  • Again, we attribute all the change in PM2.5 in 2020 (relative to 5-yr avg) to wildfire.  While this is probably not fully correct, the big spikes were definitely from wildfires.  Further analysis will help clarify this. 
  • PM2.5 from wildfire smoke might not have the same effect as overall PM2.5 (only some of which comes from wildfire).  The science isn't clear on this yet.
  • We are in the midst of the COVID pandemic, and this has as-yet-unknown implications for our understanding of how air pollution affects health outcomes. Early evidence seems to suggest that poor air quality could worsen COVID-related outcomes, and if that's the case then our numbers above could be lower bounds.
  • Numbers above only look at elderly mortality and ER visits, and it's likely that other groups are very substantially affected (e.g. the very young, those with pre-existing cardiovascular or respiratory conditions).   Again, the numbers above are likely a dramatic lower bound on overall health consequences for that reason.
  • We don't know a ton about health responses at super high exposures.  Does it tail off?  Does it get way worse?  You can find papers telling you both.  Right now we're assuming response is linear, which is consistent with a lot of the literature. 
  • We also don't have a full picture of "displacement" - i.e. if someone passes away due to air pollution exposure, maybe they were already extremely sick and so would have passed away in the near term anyway.  Deryugina et al explore this to some degree by looking at 3-day (and longer) responses to a single day of pollution increase.  They find effects of 1-day exposure over the next month to be about twice as big as the 3-day estimates we use, so again this could suggest our numbers are a lower bound. 
  • Effects could also amplify over time, as it's probably the case that exposure to 7 days in a row of terrible air is worse than exposure to 7 days spread out.  Our numbers do not consider this, nor do we know of good estimates in the literature that would help with this. Please point them out if you know of any!

These overall effects can be in large part attributable to climate change, which has dramatically increased the likelihood and severity of wildfire.  Understanding this pathway of climate impacts is, we think, underappreciated, and something that we're working on (along with lots of others!).

Monday, August 31, 2020

Seminar talk on Spatial First Differences

Hannah Druckenmiller and I wrote a paper on a new method we developed that allows you to do cross-sectional regressions while eliminating a lot of (or all) omitted variables bias. We are really excited about it and it seems to be performing well in a bunch of cases.

Since people seem to like watching TV more than reading math, here's a recent talk I gave explaining the method to the All New Zealand Economics Series. (Since we don't want to discriminate against the math-reading-folks-who-don't-want-to-read-the-whole-paper, we'll also post a blog post explaining the basics...)

If you want to try it out, code is here.

Thursday, June 11, 2020

Public database of non-pharmaceutical interventions for COVID-19

This week we published the new paper The effect of large-scale anti-contagion policies on the COVID-19 pandemic, which tries to estimate the effect of over 1,700 policies on the day-over-day growth rate of infections. We compiled subnational data across six countries (China, South Korea, Italy, Iran, France, and the USA) and tried to break down the effects of different overlapping policies.

One result of the paper is that during the period of study, we estimate that there would be roughly 500 million more COVID-19 infections by early April, across these six countries, in the absence of policy. We get to this number by linking up some reduced-form econometric tools from GDP-growth modeling with some SIR/SEIR models from epidemiology. If you don't feel like reading a long paper, you can just watch these GIFs of China and Italy (more are here).  Left is what actually happened (area of red circles = cumulative confirmed cases), right is a "no policy" simulation:


Or if you are more serious about research but still don't want to read the paper, we have two videos explaining the paper here -- one is a presentation at the HELP Seminar, the other is a 3 min summary that involves a lot of figurative and literal arm-waving.

The hardest part of the study was actually putting together an original standardized dataset of all 1,717 NPIs. We have posted all the data here (alongside the code). We are hoping that people will use this new data set for other projects and welcome feedback and/or comments, especially if you think we missed something (much of this was collected by hand, so it is a dataset that will certainly improve over time). The dataset is detailed in the Supplementary Information of the paper. If you use the data, please just cite the article as the source.

We constructed the data set to be at the same spatial resolution as the most finely resolved case data we could obtain, which was the second administrative level for Italy and China (see figs above), and the first admin level for everyone else. The policies were actually deployed at a variety of admin levels, depending on the country and policy. Here's the table from the appendix summarizing the policies in the data set:

We also worked very hard to standardize policy definitions as much as possible across different countries, e.g. we tried to make "business closure" or "no gathering" mean something similar across different countries, but there is substantial nuance and some things could not be standardized, so please make sure to consult the documentation.

Note that there is also variation in the intensity of policies in the data set. This comes from the fact that many policies are encoded as composites (e.g. there are multiple policies that make up a single policy definition) and also because many policies are not deployed uniformly across an entire administrative unit, so we computed the population-weighted "exposure" to each policy (e.g. if only half the population of a US state adopted a policy, since many are set at the county level, this policy got coded as one-half). Here's a plot of policy intensity over time for the US sample (other plots are available here, but weren't in the paper because we ran out of our figure allotment):

Non-pharmaceutical interventions over time in the USA
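
As a rough illustration of the population-weighted "exposure" coding described above, here is a minimal Python sketch. The function name `policy_intensity` and the county populations are hypothetical; the actual dataset construction is documented in the Supplementary Information.

```python
def policy_intensity(units):
    """Population-weighted share of an administrative unit under a policy.

    units: list of (population, has_policy) pairs for the sub-units,
    e.g. the counties within a US state.
    """
    total = sum(pop for pop, _ in units)
    covered = sum(pop for pop, has_policy in units if has_policy)
    return covered / total


# Hypothetical state with three counties; the populations are made up.
counties = [(500_000, True), (300_000, False), (200_000, False)]
intensity = policy_intensity(counties)
print(intensity)  # half the state's population is covered, so this prints 0.5
```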

Several folks (including a journal referee and someone at the DOD) asked if we planned to keep collecting this data and making it public. In principle, we would love to do that, but the entire team did the whole project pro bono and have since had to go back to our normal jobs. That said, if someone reading this wants to support/otherwise enable expansion of this data for the public good, you know where to find me, since I won't be going anywhere for a while...

Let us know if you use the data; we'll be excited to hear if it's useful.

Sunday, March 8, 2020

COVID-19 reduces economic activity, which reduces pollution, which saves lives.

COVID-19 is a massive global economic and health challenge, having caused >3500 global deaths as of this writing (Mar 8) and untold economic and social disruption.  This disruption is only likely to increase in coming days in regions where the epidemic is just beginning. Strangely, this disruption could also have unexpected health benefits -- and these benefits could be quite large in certain parts of the world.  Below I calculate that the reductions in air pollution in China caused by this economic disruption likely saved twenty times more lives in China than have currently been lost directly due to infection with the virus in that country.

None of my calculations support any idea that pandemics are good for health. The effects I calculate just represent health benefits from the air pollution changes wrought by the economic disruption, and do not account for the many other short- or long-term negative consequences of this disruption on health or other outcomes; these harms likely vastly exceed any health benefits from reduced air pollution.  [Edit: added this paragraph 3/16 to emphasize the lessons from my findings, in case readers don't make it further..]

Okay, on to the calculation.

A few weeks ago, NASA published striking satellite images of the massive reduction in air pollution (specifically, NO2) over China resulting from the economic slow-down in that country following its aggressive response to COVID-19.

Source: NASA
Separate analyses indeed found that ground-based concentrations of key pollutants -- namely PM2.5 -- fell substantially across much of the country.  These reductions were not uniform.  In northern cities such as Beijing, where much of wintertime pollution comes from winter heating, reductions were absent.  But in more southern cities such as Shanghai and Wuhan where wintertime pollution is mainly from cars and smaller industry, pollution declines appeared to be dramatic.

Given the huge amount of evidence that breathing dirty air contributes heavily to premature mortality, a natural -- if admittedly strange -- question is whether the lives saved from this reduction in pollution caused by economic disruption from COVID-19 exceed the death toll from the virus itself.  Even under very conservative assumptions, I think the answer is a clear "yes".

Here are the ingredients to doing this calculation [Edit: see additional details I added at end of this post, if really wanting to nerd out].  First we need to know how much the economic disruption caused by COVID-19 in China reduced air pollution in the country.  The estimates already discussed suggest about a 10ug/m3 reduction in PM across China in Jan-Feb of 2020 relative to the same months in the previous 2 years (I get this from eyeballing the figure mid-way down the blog post).  To confirm these results in independent data, I downloaded hourly PM2.5 data from US government sensors in four main cities in China (Beijing, Shanghai, Chengdu, and Guangzhou).  I calculated the daily average (and trailing 7-day mean) of PM2.5 since Jan 2016, and then again compared Jan-Feb 2020 to all other Jan-Feb periods in these cities.

I find results that are highly consistent with the above analysis.  Below is a plot of the data in each city. The blue line is the 7-day average in 2020, and the thick red line is the same average over the years 2016-19 (thin red lines are each individual year in that span).  On average across these four cities, I find an average daily reduction of 15-17ug/m3 PM2.5 across Jan-Feb 2020 relative to the average in the previous four years.  This average difference is highly statistically significant.  As in the above analysis, Beijing appears to be the outlier, with no decline in PM relative to past years when analyzed on its own.

PM2.5 concentrations in four Chinese cities in Jan-Feb 2016-2019 (red lines) vs same period in 2020 (blue lines). All cities except Beijing show substantial overall reductions during the period. 

To be very conservative, let’s then assume COVID-19 reduced PM2.5 on average by 10ug/m3 in China, that this effect only lasted 2 months, and that only some urban areas (and no rural areas) experienced this change (see below).

Next we need to know how these changes in air pollution translated into reductions in premature mortality.  Doing this requires knowing three numbers:
  1. The change in the age specific mortality rate per unit change in PM2.5.  For this we use estimates from He et al 2016, who studied the effects of air quality improvements around the Beijing Olympics on mortality rates -- a quasi-experimental setting on air quality not unlike that induced by COVID-19.  He et al found large reductions in mortality for children under five and for adults over 65, with monthly under-5 mortality increasing 2.9% for every 1ug/m3 increase in PM2.5, and about 1.4% for people over 70 years old (note:  they measure impacts using PM10, and to convert to PM2.5 we use the common assumption in the literature that PM2.5 = 0.7*PM10 and that all impacts of PM10 are through the PM2.5 component). Estimates from the He et al paper are broadly consistent with other quasi-experimental studies on the effects of air pollution on infant health, including estimates from the US, Turkey, and Africa (see Fig 5b here).  As a more conservative approach, we also calculate mortality using 1% increase in mortality per 1ug increase in PM2.5 for both kids and older people.  In all cases, we very conservatively assume that no one between the ages of 5 and 70 is affected.  
  2. The baseline age-specific mortality rates for kids and older people, which we take from Global Burden of Disease estimates.  I note a large apparent discrepancy between the GBD numbers, which give an under-5 mortality rate of 205 per 100,000 live births in China in the most recent year, and the WB/Unicef estimate, which puts the number at 860 per 100,000 live births (4x higher!).   I use the much smaller GBD estimate in my calculations to be conservative, but someone should reconcile these...  
  3. The total population affected by the changes in air pollution.  For this we take the total Chinese population as estimated by the UN, and conservatively assume that only 50% of the population are affected by the change in air quality.  Since roughly 60% of Chinese live in urban areas, this is like assuming there is no effect in rural areas and some urban areas remain unaffected. 

Putting these numbers together [see table below for details] yields some very large reductions in premature mortality.  Using the He et al 2016 estimates of the impact of changes in PM on mortality, I calculate that having 2 months of 10ug/m3 reductions in PM2.5 likely has saved the lives of 4,000 kids under 5 and 73,000 adults over 70 in China.  Using even more conservative estimates of 10% reduction in mortality per 10ug change, I estimate 1400 under-5 lives saved and 51400 over-70 lives saved.  Even under these more conservative assumptions, the lives saved due to the pollution reductions are roughly 20x the number of lives that have been directly lost to the virus (based on March 8 estimates of 3100 Chinese COVID-19 deaths, taken from here). 

What's the lesson here?  It seems clearly incorrect and foolhardy to conclude that pandemics are good for health. Again I emphasize that the effects calculated above are just the health benefits of the air pollution changes, and do not account for the many other short- or long-term negative consequences of social and economic disruption on health or other outcomes; these harms could exceed any health benefits from reduced air pollution.  But the calculation is perhaps a useful reminder of the often-hidden health consequences of the status quo, i.e. the substantial costs that our current way of doing things exacts on our health and livelihoods.

Might COVID-19 help us see this more clearly?  In my narrow academic world -- in which most conferences and academic meetings have now been cancelled due to COVID-19, and where many folks are cancelling talk invitations and not taking flights -- it might be a nice opportunity to re-think our production function with regard to travel.  For most (all?) of us academics, flying on airplanes is by far the most polluting thing we do.  Perhaps COVID-19, if we survive it, will help us find less polluting ways to do our jobs.  More broadly, the fact that disruption of this magnitude could actually lead to some large (partial) benefits suggests that our normal way of doing things might need disrupting.

[Thanks to Sam Heft-Neal for helping me double check these calculations; errors are my own].

Edited 3/9 to add additional details below on calculations and assumptions. 

Here are the actual values and math I used, either using He et al 2016 estimates of the PM/mortality relationship or the more conservative 1% change in mortality per 1 ug/m3 PM2.5 estimate.  To calculate the % change in mortality rate from the observed change in PM2.5, I multiply (a) by (c).  This number is then multiplied by the baseline monthly mortality rate in (d) to get the total change in the mortality rate, which is then multiplied by the total affected population in each age group (e*f).

Columns:
(a) reduction in monthly PM2.5 (ug/m3)
(b) % change in monthly mortality rate per 10 ug/m3 PM10
(c) % change in mortality per 1 ug/m3 PM2.5
(d) baseline monthly mortality rate per 100,000 for each age group
(e) population in each age group (100,000s)
(f) share of population affected
(g) two-month reduction in mortality

age group   (a)    (b)    (c)    (d)      (e)      (f)    (g)

Using He et al 2016 PM10 estimates in (b)
under 5     -10    20     2.9    17.1     839.3    0.5    -4097
over 70     -10    10     1.4    523.8    981.1    0.5    -73421

Using 1%/ug PM2.5 estimates in (c)
under 5     -10    n/a    1.0    17.1     839.3    0.5    -1434
over 70     -10    n/a    1.0    523.8    981.1    0.5    -51395
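For readers who want to retrace the arithmetic, here is a minimal Python sketch of the calculation described above: (a) times (c) gives the % change in the mortality rate, which is applied to the baseline rate (d) and scaled by the affected population (e times f) over two months. The function and variable names are my own, and the inputs are the rounded values printed in the table, so the results only approximately match column (g).

```python
MONTHS = 2  # the post's calculation covers two months of reduced pollution


def two_month_mortality_change(a, c, d, e, f, months=MONTHS):
    """Change in deaths over `months` months, from the table's columns.

    a: change in monthly PM2.5 (ug/m3)
    c: % change in mortality per 1 ug/m3 PM2.5
    d: baseline monthly mortality rate per 100,000
    e: population of the age group, in 100,000s
    f: share of the population affected (0 to 1)
    """
    pct_change = a * c                    # (a) x (c): % change in mortality rate
    rate_change = pct_change / 100 * d    # change in monthly deaths per 100,000
    return rate_change * e * f * months   # scale by affected population and months


# Rounded inputs as printed in the table: (a, c, d, e, f) and the published (g).
TABLE = {
    ("He et al 2016", "under 5"): ((-10, 2.9, 17.1, 839.3, 0.5), -4097),
    ("He et al 2016", "over 70"): ((-10, 1.4, 523.8, 981.1, 0.5), -73421),
    ("1%/ug", "under 5"): ((-10, 1.0, 17.1, 839.3, 0.5), -1434),
    ("1%/ug", "over 70"): ((-10, 1.0, 523.8, 981.1, 0.5), -51395),
}

for (estimate, age_group), (inputs, published) in TABLE.items():
    computed = two_month_mortality_change(*inputs)
    print(f"{estimate:14s} {age_group:8s} computed {computed:9.0f}  published {published}")
```

The 1%/ug rows come out within a few deaths of the published values. The He et al rows land within about 2%, presumably because (c) is shown rounded to one decimal place.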


To get the estimates in (a), I observe daily data for four Chinese cities back to Jan 1 2016.  I define a COVID-19 treatment dummy equal to one if the year==2020, and then run a panel regression of daily PM2.5 on the treatment dummy plus city-by-day-of-year fixed effects.  The sample is all days in Jan-early March.  Basically this calculates the daily PM2.5 difference between 2020 and 2016-19, averaged across all sites and days.  I can run this on the daily data or on the 7-day running mean, and with city-by-d.o.y fixed effects or city + d.o.y fixed effects and I get answers that are all between 15-18 ug/m3 reductions in PM2.5.  We very conservatively round this number down to 10 in column (a). 
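As a rough illustration of what that regression does (not the code actually used for the post), the sketch below absorbs city-by-day-of-year fixed effects by demeaning within each city/day-of-year cell and then computes the OLS coefficient on the 2020 dummy. The data are entirely synthetic: the city names, baseline levels, and the built-in -17 effect are made up so that the estimator has something known to recover.

```python
from collections import defaultdict


def fe_treatment_effect(records):
    """OLS coefficient on a year==2020 dummy with city-by-day-of-year fixed effects.

    records: iterable of (city, year, day_of_year, pm25) tuples.
    Fixed effects are absorbed by demeaning within each (city, day_of_year) cell.
    """
    groups = defaultdict(list)
    for city, year, doy, pm25 in records:
        treated = 1.0 if year == 2020 else 0.0
        groups[(city, doy)].append((treated, pm25))

    num = den = 0.0
    for obs in groups.values():
        mean_d = sum(d for d, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for d, y in obs:
            num += (d - mean_d) * (y - mean_y)
            den += (d - mean_d) ** 2
    return num / den


def synthetic_records(effect=-17.0):
    """Made-up panel (NOT real pollution data): 4 cities, 2016-2020, days 1-60."""
    records = []
    for c, city in enumerate(["city_A", "city_B", "city_C", "city_D"]):
        for year in range(2016, 2021):
            for doy in range(1, 61):
                baseline = 60.0 + 10.0 * c + 0.2 * doy  # invented level and seasonality
                shift = effect if year == 2020 else 0.0
                records.append((city, year, doy, baseline + shift))
    return records


estimate = fe_treatment_effect(synthetic_records())
print(f"estimated 2020 difference: {estimate:.2f} ug/m3")
```

With a balanced panel like this one, the coefficient is simply the average, over city/day cells, of the 2020 value minus the 2016-19 mean for that cell, which is the comparison described in the paragraph above.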

What are some of the key assumptions in my overall analysis?  There are lots:
  • This is only the partial effect of air pollution; it is by no means the overall effect of COVID-19 on mortality.  Indeed, the broader disruption caused by COVID-19 could cause many additional deaths that are not directly attributable to being infected with the virus -- e.g. due to declines in households' economic wellbeing, or to the difficulty in accessing health services for non-COVID illnesses.  Again, I am absolutely not saying that pandemics are good for health.  
  • The key assumption in using the He et al 2016 estimates (or the 1%/1ug conservative summary estimate from quasi-experimental studies) is that changes in outdoor PM2.5 concentrations are a sufficient statistic for measuring health impacts.  One worry is that, instead of being at work, people are staying at home, and that home indoor air pollution is worse than what they would have been exposed to otherwise.  While it is true that indoor air pollution can be incredibly high in homes that rely on biomass burning for cooking and heating, existing evidence suggests that even in cold regions, urban Chinese residents probably have better air quality inside their home than outside it.  E.g. see here.  So the key question to me is whether other behaviors changed in response to COVID-19 that made individuals' exposures look a lot different than they did in the Beijing setting in He et al 2016.  Maybe there is a case to be made there, but I am not sure what it is.  There's a prima facie case that the Beijing setting looks a lot like what we're worried about here:  an acute 2-month event that led to large but temporary changes in air pollution.  
  • Is mortality from COVID-19 interacting with mortality from PM?  One possibility is that there are enough COVID-19 infections to actually make people more susceptible to the negative impacts of air pollution.  But this would increase rather than decrease deaths from PM!  The other possibility is that the people who have very sadly passed away from COVID would have been those most likely to pass away from PM exposure.  But even if this were 100% true, it would only account for ~5% of the overall predicted mortality.  
  • We are pinning the entire PM2.5 difference between Jan-Feb 2020 and Jan-Feb 2016-19 on COVID-19.  (Note that we estimate a 17ug/m3 difference, and have rounded that down to 10 ug/m3, so in effect are only pinning about 60% of the change on COVID).  Without more careful analysis, we can't really know whether this is fair or not.  Maybe something else very big was going on at exactly the same time in China?  Seems unlikely but my analysis has nothing to say about that. 
  • My estimates are a prediction of mortality impacts, not a measurement.  They are not proof that anything has happened.  In a few years, there will likely be enough data to actually try to measure what the overall effect was of the COVID-19 epidemic on health outcomes.  You could compare changes in mortality in high-exposure locations to changes in mortality in lower-exposure locations, for instance.  But this study is not yet possible, as the epidemic is still underway and the comprehensive all-cause mortality data are not yet (to my knowledge) available. 

Tuesday, August 27, 2019

On the efficacy of the "sniff test" for understanding climate impacts

Economists (and other high-statured people) often pride themselves on having good noses.  When encountering a research finding, they can often apply a quick "sniff test" to evaluate whether the results of this finding are likely true or false.

We encounter the sniff test frequently when we discuss our estimates of the potential impacts of future warming on domestic or global economic output. Economic orthodoxy has long held that 3-4C of warming will reduce global GDP by a few percentage points over the next century relative to a world where temperature was held fixed.  These estimates are based largely on earlier cross-sectional studies relating average output to average temperature, and have provided critical input to the construction of "damage functions" used in benchmark integrated assessment models.  Output from these models, in turn, has played a pivotal role in policy-making around the social cost of carbon (e.g. see here).

Newer analyses by your G-FEED bloggers and others using panel-based methods have found much larger potential effects of warming on output (e.g. see here): for example, a ~20% loss of global GDP in a +4C world, relative to a world where temperature had stayed fixed.  Because these estimates are so much larger than previous estimates, they have frequently failed the sniff test, both publicly and in referee reports we've received.  Some people have told us the results just seem "too big".  No way can climate change make us 20% poorer than we would have been without it.

Rather than adjudicating the methods underlying these newer results, I want to evaluate this sniff test itself by simply looking at some historical data.  This is in the spirit of Sol's last post: before we argue so strongly for a particular approach or result based on our priors, how bout we just look at some data.  Is there historical evidence for some regions performing 20% better than others over century or shorter time scales?

Exhibit 1:  let's just look at income growth in US states.  Below is a plot of real per capita income growth since 1980 in selected US states, with the y-axis normalized to show the % change in income relative to 1980.  Even in this relatively short 35-year time span (a third of a century, for those scoring at home), there is huge variation in performance: income in Massachusetts grew >130% over this period, while incomes in Nevada grew only 30%.  Even the difference between coastal-elite Massachusetts and California over this 35-year period exceeds 20%.  So variation in performance of around 20% is clearly within our very recent historical experience in this country, on a time scale much shorter than a century.
Data: Bureau of Labor Statistics

Ah, you say, but this is just the US, and we know we have rampant inequality in this country, etc etc.  This variation can't be representative of the rest of the world, right?

Okay:  below is the plot for a small sample of OECD countries, using real GDP/cap from World Bank WDI.  Again, plotting data since only 1980, there's been a ton of variation in observed growth in per capita incomes, with neighboring high-income countries varying by way more than 20% in how much they've grown.  The plot is of course way more stark if you include developing countries, with some countries only growing a few percentage points over the period, and others (e.g. China) growing by >1000% (and screwing up the axes of any plot you try to make...).  So:  massive variation at the country level within just a few decades.
Data: World Bank World Development Indicators

Ah, you say, this is just ~35 years of data -- way too short a time scale to look at these long run trends.  Can't possibly be true for longer time scales, right? Don't countries converge on longer time scales?

Okay:  below is a plot of real per capita growth rates over multiple centuries, based on the Maddison data, for a random selection of countries with data back to 1800.  At this time scale, you see absolutely gargantuan differences in changes in per capita incomes.  German incomes grew nearly 5000% since 1800, Portuguese incomes grew 2000%, while South African incomes grew only about 500%.
Data: Maddison Project
These century scale differences in income over time are roughly two orders of magnitude larger than what our climate impact estimates would predict to be the difference between unmitigated climate change, and a world where warming didn't happen.

So in light of these historical data, can we please do away with the vaunted "sniff test"?  Our estimates of climate impacts are small relative to the variation in historical performance we've seen, and that's true whether you look within countries over recent decades, or (even more so) across countries over very long time periods.  None of this, of course, tells us whether climate change will reduce economic output by 20%;  it just tells us that we cannot reject a 20% estimate out of hand for being "way too big".

But this raises the question:  why are our noses so bad?  Why is the sniff test so un-efficacious? Perhaps we continue to under-appreciate the power of compounding.  A key result from our earlier paper (building on key earlier work by Dell, Jones, and Olken, and shown by multiple other groups as well, see here, here, here), is that changes in temperature can affect the growth rate of GDP.  And small effects on the growth rate can have large effects on the level of income over long time scales.

E.g. consider an economy growing at 2% for a century (the US, roughly).  Knocking only a quarter of a percent off the growth rate -- i.e. growing at 1.75% instead of 2% -- leads to an economy that's >20% poorer than it would have been after 100 years.  Very small growth effects can lead to large level effects over long time scales.
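The arithmetic behind that claim is easy to check. A minimal Python sketch (the function name is mine; the growth rates and horizon are the ones in the example above):

```python
def relative_income(baseline_growth, reduced_growth, years):
    """Income under the reduced growth rate as a fraction of baseline income."""
    return ((1 + reduced_growth) / (1 + baseline_growth)) ** years


ratio = relative_income(0.02, 0.0175, 100)   # 2% vs 1.75% annual growth, 100 years
shortfall_pct = (1 - ratio) * 100
print(f"{shortfall_pct:.1f}% poorer after 100 years")
```

The shortfall comes out at a bit under 22%, consistent with the ">20%" figure. Note that it depends almost entirely on the gap between the two growth rates and the horizon, not on the 2% baseline itself.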

Another analogy (h/t to my colleague and co-author Noah Diffenbaugh) is the continued success of managed mutual funds, despite the clear evidence that the small management fees charged by these funds lead to huge reductions in your accumulated investment over longer time scales, relative to investing in index funds with no fees.  Just as small fees eat away at your total portfolio earnings, small hits to the growth rate cumulate into large level effects down the line.  But the mutual fund industry in the US is $18 trillion (!!).

So let's continue to argue about methodological issues and about how best to understand the response of aggregate output to warming.  More posts from us on that soon.  But, please, can we stop applying the sniff test to these results?  Our noses work less well than we think.

Monday, August 26, 2019

Do GDP growth rates have trends? Evidence from GDP growth data

Over the years, there have been a handful of critiques of some of our prior research on GDP growth centering on the concern that we model country-specific time trends in growth rates. This concern seems to be experiencing a resurgence, due in part to an RFF working paper by Richard Newell, Brian Prest, and Steven Sexton which claims that including trends in GDP growth regressions is "overfitting" (I'll probably post about this paper in greater depth later). In addition, Richard Rosen recently posted a PNAS comment asserting that this approach was obviously flawed, and Richard Tol has been tweeting something similar.

So is it obviously wrong to account for trends in growth rates when looking at GDP growth data?  One way to settle this would be to look at some GDP growth data.

Grabbing the GDP growth data from our 2015 paper, I write a single (very boring) command that regresses GDP per capita growth rates for the USA on time and a constant (for the period 1960-2013).
reg TotGDPgrowthCap year if iso == "USA"
The result shows that GDP growth for the USA has been declining by a steady 0.043% per year per year (p-value = 0.018):

So it certainly seems that growth is unlikely to be constant over time. (Remember that a trend in growth rates means that log(income) over time is not linear, but curved.) Newell et al. advocate for treating the above time series as if it is centered around a constant (not a downward trend), which seems at odds with the data itself, the macro-economic theory of convergence, as well as the findings of the Newell et al analysis itself (more on that in the later post). Nowhere in the Newell et al. paper, the Rosen comment, or the Tol tweets did the authors look at the data and ask if there was statistical evidence of a trend.
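To spell out the parenthetical about curvature: suppose, as a simplification, that the growth rate is measured as the change in log income and follows an exact linear trend. The symbols below (g_t for growth in year t, y_t for income, alpha and beta for the intercept and slope of the trend) are my own notation, not taken from the paper.

```latex
g_t = \log y_t - \log y_{t-1} = \alpha + \beta t
\quad\Longrightarrow\quad
\log y_T = \log y_0 + \sum_{t=1}^{T} g_t
         = \log y_0 + \alpha T + \frac{\beta}{2}\, T (T + 1)
```

The T(T+1) term is quadratic in time, so any nonzero beta bends the path of log income. A negative beta, like the US estimate above, bends it downward.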

The modeling approach in our previous work goes further than this, however. For example, in our 2015 paper we allow each country to exhibit a growth rate that has a (potentially) quadratic country-specific time trend.  This modeling choice was not arbitrary, but actually resulted from looking at our data fairly carefully.

First, countries sometimes have trends in growth that are not linear. As shown above, the trend in USA growth is extremely linear, so in a quadratic model the coefficient on the quadratic term will basically be zero and have no effect on the model (it's precisely -2.89e-06 if you try it with the USA data). But in other cases, the trend is clearly not linear. Take China for example: economic policies in China have changed dramatically over the last several decades, and the rate of these changes has itself not been steady, so the rate of changes in growth rates might itself change (implying curvature in a trend). This turns out to be reasonable intuition, and the quadratic term in a China-GDP time series regression proves to be nonzero and significant.

Second, different countries have different trends in growth, so it's important that trends in growth are country-specific. This too is pretty obvious if one looks at GDP growth data. For example, India's growth rate has been increasing over time (0.1% per year per year, p-value < 0.001) while the USA's growth rate has been declining (-0.043% per year per year, p-value = 0.018). A simple F-test strongly rejects the hypothesis that they are the same trend (p-value < 0.001). Newell et al. and others seem to advocate for models that assume trends are the same across all (or many) countries without testing if there is statistical evidence that these trends are different. Halvard Buhaug made a similar error in his critique of some of Marshall and David's work on climate and conflict (Kyle Meng and I demonstrated the single-command F-test Buhaug should have used in a PNAS paper).  Here are plots from four major economies demonstrating the diversity of trends that show up very clearly in the data:

Taken together, the data suggest that there is strong evidence that (1) there are nonzero trends in GDP growth, (2) these trends are not the same across countries, (3) these trends are not linear in many cases.
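For readers who want to see the mechanics of a test for point (2), here is a hedged Python sketch of the standard F statistic comparing a model with one common linear trend (and country-specific intercepts) against a model with country-specific trends. It is not the command used for this post, the function names are mine, and the two series are synthetic: the slopes echo the magnitudes quoted above, but the intercepts and the deterministic wiggle standing in for noise are invented.

```python
def trend_sums(years, growth):
    """Return (Sxx, Sxy, Syy) for one country after removing its own means."""
    n = len(years)
    xm, ym = sum(years) / n, sum(growth) / n
    sxx = sum((x - xm) ** 2 for x in years)
    sxy = sum((x - xm) * (y - ym) for x, y in zip(years, growth))
    syy = sum((y - ym) ** 2 for y in growth)
    return sxx, sxy, syy


def f_stat_equal_trends(series):
    """F statistic for H0: all countries share one linear trend (own intercepts)."""
    sums = [trend_sums(yrs, g) for yrs, g in series]
    n_obs = sum(len(yrs) for yrs, _ in series)
    n_countries = len(series)
    # Unrestricted model: each country gets its own slope.
    rss_u = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in sums)
    # Restricted model: one common slope, country-specific intercepts.
    b = sum(sxy for _, sxy, _ in sums) / sum(sxx for sxx, _, _ in sums)
    rss_r = sum(syy - 2 * b * sxy + b * b * sxx for sxx, sxy, syy in sums)
    q = n_countries - 1               # number of restrictions
    dof = n_obs - 2 * n_countries     # residual d.o.f. of the unrestricted model
    return ((rss_r - rss_u) / q) / (rss_u / dof)


def synthetic_growth(intercept, slope, years):
    """Made-up series (NOT real GDP data): linear trend plus a +/-0.5 wiggle."""
    return [intercept + slope * (y - years[0]) + (0.5 if i % 2 else -0.5)
            for i, y in enumerate(years)]


years = list(range(1960, 2014))
rising = (years, synthetic_growth(2.0, 0.10, years))
falling = (years, synthetic_growth(4.0, -0.043, years))
print(f"F statistic: {f_stat_equal_trends([rising, falling]):.1f}")
```

A large F statistic (compared against an F distribution with q and dof degrees of freedom) is evidence against a common trend. With real data the p-value would come from statistical software rather than this sketch.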

The safest way to deal with these facts (at least in the climate-economy contexts that we study) is to allow trends in growth rates to be modeled flexibly so that they can be different across countries and they can be curved if the data suggest they should be. There are multiple ways to do that, and using country-specific quadratic trends is one straightforward way to do it. If the trends are actually linear, they will be modeled as basically linear (statistical software may actually drop the quadratic term if trends are linear enough). And if they are the same across countries, they will be estimated to be the same. But if they are in fact different across countries, or curved over time, the model will not erroneously assume they are homogeneous and constant.
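A small Python sketch of that last point, under stated assumptions: it fits a separate quadratic time trend to each country by ordinary least squares and shows that the quadratic coefficient collapses to roughly zero when the underlying trend is linear. The two "countries" and their coefficients are invented for illustration, and this is a stand-alone trend fit, not the full panel model from our paper.

```python
def fit_quadratic_trend(years, growth):
    """Least-squares fit of growth = b0 + b1*t + b2*t^2, with t = year - mean(year)."""
    t_bar = sum(years) / len(years)
    X = [[1.0, y - t_bar, (y - t_bar) ** 2] for y in years]
    # Normal equations (X'X) b = X'y, stored as an augmented 3x4 matrix.
    A = [[sum(row[i] * row[j] for row in X) for j in range(3)]
         + [sum(row[i] * g for row, g in zip(X, growth))] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(3):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    return [A[i][3] / A[i][i] for i in range(3)]


years = list(range(1960, 2014))
t_bar = sum(years) / len(years)

# Hypothetical countries (NOT real data): one linear trend, one curved trend.
panel = {
    "linear_country": [3.0 - 0.04 * (y - t_bar) for y in years],
    "curved_country": [2.0 + 0.05 * (y - t_bar) + 0.004 * (y - t_bar) ** 2 for y in years],
}

fits = {name: fit_quadratic_trend(years, g) for name, g in panel.items()}
for name, (b0, b1, b2) in fits.items():
    print(f"{name}: b0={b0:.3f}  b1={b1:.4f}  b2={b2:.5f}")
```

Centering time at its mean keeps the normal equations well conditioned. The quadratic specification nests the linear one, so nothing is lost by allowing curvature when the data turn out not to need it.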

I think the meta-moral of the story is to simply look at your data and listen to what it is telling you.  F-tests and the "plot" command are both compelling and often underutilized.

Tuesday, June 18, 2019

Congressional testimony on economic consequences of climate change

Since it's basically a blog post, here's the oral testimony I gave last week to the House Budget Committee during the Hearing on "The Costs of Climate Change: Risks to the U.S. Economy and the Federal Budget." If you have hours to spare, you can watch the whole hearing or read the full (referenced) written testimony. It's not every day that someone asks you to summarize a decade of research progress from your field in 800 words, so it seems worth documenting it somewhere.
Thank you Chairman Yarmuth, Ranking Member Womack, and members of the Committee for inviting me to speak today. 
My name is Solomon Hsiang, and I am the Chancellor’s Professor of Public Policy at the University of California, Berkeley and currently a Visiting Scholar at Stanford. I was trained in both economics and climate physics at Columbia, MIT, and Princeton. My research focuses on the use of econometrics to measure the effect of the climate on the economy. 
The last decade has seen dramatic advances in our understanding of the economic value of the climate.  Crucially, we now are able to use real-world data to quantify how changes in the climate cause changes in the economy. This means that in addition to being able to project how unmitigated emission of greenhouse gasses will cause the physical climate to change, we can now also estimate the subsequent effect that these changes are likely to have on the livelihoods of Americans. 
As with any emerging research field, there are large uncertainties and much work remains to be done. Nonetheless, I’d like to describe to you some key insights from this field regarding future risks if past emissions trends continue unabated. 
First, climate change is likely to have substantial negative impact on the US economy.  Expected damages are on the scale of several trillions of dollars, although there remains uncertainty in these numbers. For example, in a detailed analysis of county-level productivity, a colleague at University of Illinois and I estimated that the direct thermal effects alone would likely reduce incomes nation-wide over the next 80 years, a loss valued at roughly 5-10 trillion dollars in net present value. In another analysis, a colleague from University of Chicago and I computed that losses from intensified hurricanes were valued at around 900 billion dollars. Importantly, these numbers are not a complete accounting of impacts and other notable studies report larger losses. 
Second, extreme weather events are short-lived, but their economic impact is long-lasting.  Hurricanes, floods, droughts, and fires destroy assets that took communities years to build.  Rebuilding then diverts resources away from new productive investments that would have otherwise supported future growth. For example, a colleague at Rhodium Group and I estimated that Hurricane Maria set Puerto Rico back over two decades of progress; and research from MIT indicates that communities in the Great Plains have still not fully recovered from the Dustbowl of the 1930s. As climate change makes extreme events more intense and frequent, we will spend more attention and more money replacing depreciated assets and repairing communities. 
Third, the nature and magnitude of projected costs differs between locations and industries.  For example, extreme heat will impose large health, energy, and labor costs on the South; sea level rise and hurricanes will damage the Gulf Coast; and declining crop yields will transform the Plains and Midwest. 
Fourth, because low income regions and individuals tend to be hurt more, climate change will widen existing economic inequality.  For example, in a national analysis of many sectors, the poorest counties suffered median losses that were 9 times larger than the richest.  
Fifth, many impacts of climate change will not be felt in the marketplace, but rather in homes where health, happiness, and freedom from violence will be affected.  There are many examples of this. Mortality due to extreme heat is projected to rise dramatically.  Increasingly humid summers are projected to degrade happiness and sleep quality. Research from Harvard indicates that warming will likely elevate violent crime nationwide, producing over 180,000 sexual assaults and over 22,000 murders across eight decades. Colleagues at Stanford and I estimate that warming will generate roughly 14,000 additional suicides in the next thirty years. Increasing exposure of pregnant mothers to extreme heat and cyclones will harm fetuses for their lifetime. These impacts do not easily convert to dollars and cents, but they still merit attention. 
Sixth, populations across the country will try to adapt to climate change at substantial cost.  Some adaptations will transform jobs and lifestyles, some will require constructing new defensive infrastructure, and some will involve abandoning communities and industries where opportunities have deteriorated. In all cases, these adaptations will come at real cost, since resources expended on coping cannot be invested elsewhere. 
Lastly, outside of the US, the global consequences of climate change are projected to be large and destabilizing.  Unmitigated warming will likely slow global growth roughly 0.3 percentage points and reduce political stability throughout the tropics and subtropics. 
Together, these findings indicate that our climate is one of the nation’s most important economic assets. We should manage it with the seriousness and clarity of thought that we would apply to managing any other asset that also generates trillions of dollars in value for the American people. 
Thank you.

Monday, November 5, 2018

The SHCIT List

Just like George Lucas, I write my literature reviews out of order. But I'm happy to say that after several years of messing around in this field, in collaboration with great coauthors, I've finally finished the tetralogy that I've always wanted to complete.  The latest installment (Episode I) just came out in the JEP (it's designed to be a soft on-ramp for economists who are unfamiliar with climate change science to get acquainted with the problem).

Sol Hsiang's Climate Impacts Tutorial reading list:

  1. An Economist’s Guide to Climate Change Science  (what is the physical problem?)
  2. Using Weather Data and Climate Model Output in Economic Analyses of Climate Change (how do we look at the data for that problem?)
  3. Climate Econometrics (how does one analyze that data to learn about the problem?)
  4. Social and Economic Impacts of Climate (what did we learn when we did that?)

This addition completes the box set that can get any grad student up to speed on the broader climate impacts literature.

I hope this is helpful. I think I'm going to go and do more research on elephant poaching now...

Thursday, August 23, 2018

Let there be light? Estimating the impact of geoengineering on crop productivity using volcanic eruptions as natural experiments (Guest post by Jonathan Proctor)

[This is a guest post by Jonathan Proctor, a Doctoral Fellow at the Global Policy Lab and PhD candidate in the Ag and Resource Econ department here at Berkeley]

On Wednesday I, and some notorious G-FEEDers, published a paper in Nature exploring whether solar geoengineering -- a proposed technology to cool the Earth by reflecting sunlight back into space -- might be able to mitigate climate-change damages to agricultural production. We find that, as intended and previously described, the cooling from geoengineering benefits crop yields. We also find, however, that the shading from solar geoengineering makes crops less productive. On net, the damages from reduced sunlight wash out the benefits from cooling, meaning that solar geoengineering is unlikely to be an effective tool to mitigate the damages that climate change poses to global agricultural production and food security. Put another way, if we imagine SRM as an experimental surgery, our findings suggest that the side effects are as bad as the cure.

Zooming out, solar geoengineering is an idea to cool the earth by injecting reflective particles --usually precursors to sulfate aerosols -- into the high atmosphere. The idea is that these particles would bounce sunlight back into space and thus cool the Earth, similarly to how you might cool yourself down by standing in the shade of a tree during a hot day. The idea of such sulfate-based climate engineering was, in part, inspired by the observation that the Earth tends to cool following massive volcanic eruptions such as that of Pinatubo in 1991, which cooled the earth by about half a degree C in the years following the eruption.

Our visualization of the stratospheric aerosols (blue) that scattered light and shaded the planet after the eruption of Mount Pinatubo in 1991. Each frame is one month of data. Green on the surface indicates global crop lands. (The distance between the aerosol cloud and the surface is much larger than in real life.) 

A major challenge in learning the consequences of solar geoengineering is that we can’t do a planetary-scale experiment without actually deploying the technology. (Sol’s questionably-appropriate analogy is that you can’t figure out if you want to be a parent through experimentation.) An innovation here was realizing that we could learn about the impacts of solar geoengineering without incurring the risks of an outdoor experiment by using giant volcanic eruptions as natural experiments. While these eruptions are not perfect proxies for solar geoengineering in every way, they give us the necessary variation we need in high atmosphere aerosol concentrations to study some of the key effects on agriculture. (We expand on how we account for the important differences between the impacts of volcanic eruptions and solar geoengineering on agricultural production later in this post). This approach builds on previous work in the earth science community which has used the eruptions to study solar geoengineering’s impact on climate.  Here’s what we found:

Result 1: Pinatubo dims the lights

First, we find that the aerosols from Pinatubo had a profound impact on the global optical environment. By combining remotely sensed data on the eruption’s aerosol cloud with globally-dispersed ground sensors of solar radiation (scraped from a Russian website that recommends visitors use Netscape Navigator) we estimate that the Pinatubo eruption temporarily decreased global surface solar radiation (orange) by 2.5%, reduced direct (i.e. unscattered, shown in yellow) insolation by 20% and increased diffuse (i.e. scattered, shown in red) sunlight by 20%.

Effect of El Chichon (1982) and Mt Pinatubo (1991) on direct (yellow), diffuse (red) and total (orange) insolation for global all-sky conditions.

These global all-sky results (i.e. the average effect on a given day) generalize previous clear-sky estimates (the effect on a clear day) that have been done at individual stations. Like a softbox or diffusing sheet in photography, this increase in diffuse light reduced shadows on a global scale. The aerosol-scattering also made redder sunsets (sulfate aerosols cause a spectral shift in addition to a diffusion of light), similar to the volcanic sunsets that inspired Edvard Munch’s “The Scream.” Portraits and paintings aside, we wanted to know: how did these changes in sunlight impact global agricultural production?

Isolating the effect of changes in sunlight, however, was a challenge. First, the aerosols emitted by the eruption alter not only sunlight, but also other climatic variables such as temperature and precipitation, which impact yield. Second, there just so happened to be El Nino events that coincided with the eruptions of both El Chichon and Pinatubo. This unfortunate coincidence has frustrated the atmospheric science community for decades, leading some to suggest that volcanic eruptions might even cause El Niños, as well as the reverse (the former theory seems to have more evidence behind it).

To address the concern that volcanoes affect both light and other climatic conditions, we used a simple “condition on observables” design – by measuring and including potential confounds (such as temperature, precipitation and cloud cover) in the regression we can account for their effects. To address the concurrent El Nino, we do two things. First, we directly condition on the variables through which an El Nino could impact yields – again temperature, precipitation and cloud cover. Second, we condition on the El Nino index itself, which captures any effects that operate outside of these directly modeled channels. Essentially, we isolate the insolation effect by partitioning out the variation due to everything else – like looking for your keys by pulling everything else out of your purse.
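In stylized form, and only as a sketch of the idea rather than the specification estimated in the paper, the design described above amounts to a regression like the one below. All symbols are my own shorthand: Y is the yield outcome for country i in year t, A is overhead stratospheric aerosol exposure, T, P and C are temperature, precipitation and cloud cover, and ENSO is the El Nino index.

```latex
Y_{it} = \beta\, A_{it}
       + \gamma_1 T_{it} + \gamma_2 P_{it} + \gamma_3 C_{it}
       + \delta\, \mathrm{ENSO}_{t}
       + \varepsilon_{it}
```

Here beta is the coefficient of interest: the effect of aerosol-induced changes in sunlight on yields, holding the other climate variables and El Nino conditions fixed.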

The above figure schematically illustrates our strategy. The total effect (blue) is the sum of optical (red) and climatic components (green). By accounting for the change in yields due to the non-optical factors, we isolate the variation in yields due to stratospheric aerosol-induced changes in sunlight.

Result 2: Dimming the lights decreases yields

Our second result, and the main scientific contribution, is the finding that radiative scattering from stratospheric sulfate aerosols decreases yields on net, holding other variables like temperature constant. The magnitude of this impact is substantial – the global average scattering from Pinatubo reduced C4 (maize) yields by 9.3% and C3 (soy, rice and wheat) yields by 4.8%, which is two to three times larger than the change in total sunlight. We reconstruct this effect for each country in the figure below:

Each line represents one crop for one country. These are the reconstructed yield losses due to the estimated direct optical effects of overhead aerosols.

My young cousins dismissed the sign of this effect as obvious – after all plants need sunlight to grow. But, surprising to my young cousins, the prevailing wisdom in the literature tended to be that scattering light should increase crop growth (Sol incessantly points out that David even said this once). The argument is that the reduction in yields from loss of total light would be more than offset by gains in yield through an increase in diffuse light. The belief that diffuse light is more useful to plants than direct light stems from both the observation that the biosphere breathed in carbon dioxide following the Pinatubo eruption and the accompanying theory that diffusing light increases plant growth by redistributing light from the sun-saturated leaves at the top of the canopy to the light-hungry leaves below. Since each leaf has diminishing photosynthetic productivity for each incremental increase in sunlight, the theory argues, a more equal distribution of light should promote growth.

Aerosols scatter incoming sunlight, which more evenly distributes sunlight across the leaves of the plant. We test whether the loss of total sunlight or the increase in diffuse light from aerosol scattering has a stronger effect on yield.
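The diminishing-returns argument can be made concrete with a toy two-leaf canopy. This is only a sketch under stated assumptions: the saturating response curve, the half-saturation constant, and the light shares are hypothetical and are not taken from the paper or from any crop model.

```python
def leaf_photosynthesis(light, half_saturation=1.0):
    """Saturating (concave) light response: each extra unit of light adds less."""
    return light / (light + half_saturation)

def canopy_photosynthesis(total_light, top_share):
    """Two-leaf canopy: the top leaf receives `top_share` of the light,
    the shaded bottom leaf receives the rest."""
    top = total_light * top_share
    bottom = total_light * (1.0 - top_share)
    return leaf_photosynthesis(top) + leaf_photosynthesis(bottom)

# Hypothetical scenarios (arbitrary units):
clear_sky = canopy_photosynthesis(total_light=4.0, top_share=0.9)
# Scattering spreads light more evenly while total light is unchanged.
scattered_same_total = canopy_photosynthesis(total_light=4.0, top_share=0.6)
# Scattering spreads light more evenly AND removes half of the total light.
scattered_much_dimmer = canopy_photosynthesis(total_light=2.0, top_share=0.6)

print(clear_sky, scattered_same_total, scattered_much_dimmer)
```

With the same total light, the more even distribution raises canopy photosynthesis, which is the diffuse fertilization story. Once enough total light is lost, the net effect turns negative. Which force dominates is an empirical question, and that is what we test.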

While this “diffuse fertilization” appears to be strong in unmanaged environments, such as the Harvard Forest where the uptake of carbon increased following the Pinatubo eruption, our results find that, for agricultural yields, the damages from reduced total sunlight outweigh the benefits from a greater portion of the light being diffuse.

We cannot tell for sure, but we think that this difference between forest and crop responses could be due either to differences in their geometric structure (which could affect how deeply scattered light might penetrate the canopy):

Or to a re-optimization towards vegetative growth at the cost of fruit growth in response to the changes in light:

Two radishes grown in normal (left) and low light (right) conditions.  Credit: Nagy Lab, University of Edinburgh

This latter re-optimization may also explain the relatively large magnitude of the estimated effect on crop yield.

Result 3: Dimming damages neutralize cooling benefits from geoengineering

Our final calculation, and the main policy-relevant finding of the paper, is that in a solar geoengineering scenario the damages from reduced sunlight cancel out the benefits from cooling. The main challenge here was figuring out how to apply what we learned from volcanic eruptions to solar geoengineering, since the climate impacts (e.g. changes in temperature, precipitation or cloud cover) of a short-term injection of stratospheric aerosols differ from those of more sustained injections (e.g. see here and here). To address this, we first used an earth system model to calculate the impact of a long-term injection of aerosols on temperature, precipitation, cloud cover and insolation (measured in terms of aerosol optical depth). We then applied the crop model that we trained on the Pinatubo eruption (which accounts for changes in temperature, rainfall, cloud cover, and insolation independently) to calculate how these geoengineering-induced changes in climate impact crop yields. This two-step process allows us to disentangle the effects of solar geoengineering on climate (which we got from the earth system model) and of climate on crops (which we got from Pinatubo). Thus, we can calculate the change in yields due to a solar geoengineering scenario even though volcanic eruptions and solar geoengineering have different climatic fingerprints. Still, as with any projection to 2050, caveats abound, such as the role of adaptation, the possibility of optimized particle design, or the possibility that variables other than sunlight, temperature, rainfall and cloud cover could play a substantial role.

Estimated global net effect of a geoengineering scenario on crop yields through four channels (temperature, insolation, cloud cover, precipitation) for four crops. The total effect is the sum of these four partial effects.
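The two-step logic, and the way the four partial effects add up to the total, can be sketched in a few lines. Everything here is a placeholder: the function names are hypothetical, and the numbers are round values invented for illustration, not output from the earth system model or coefficients from the crop model. They were chosen so that the temperature and insolation channels offset, mirroring the qualitative finding only.

```python
CHANNELS = ("temperature", "insolation", "cloud_cover", "precipitation")

def partial_effects(climate_change, yield_sensitivity):
    """Step 2: convert each climate change into a yield change.

    climate_change    : change in each channel under geoengineering
                        (step 1, from an earth system model)
    yield_sensitivity : yield response per unit change in each channel
                        (estimated from a historical event such as Pinatubo)
    """
    return {ch: yield_sensitivity[ch] * climate_change[ch] for ch in CHANNELS}

def total_effect(partials):
    """The net yield effect is the sum of the four partial effects."""
    return sum(partials.values())

# Placeholder inputs, NOT results from the paper.
climate_change = {
    "temperature": -1.0,    # cooling, in degrees C
    "insolation": -0.1,     # dimming, in units of aerosol optical depth
    "cloud_cover": 0.0,
    "precipitation": 0.0,
}
yield_sensitivity = {
    "temperature": -0.05,   # warming lowers yield, so cooling raises it
    "insolation": 0.5,      # less sunlight lowers yield
    "cloud_cover": 0.0,
    "precipitation": 0.0,
}

partials = partial_effects(climate_change, yield_sensitivity)
net = total_effect(partials)
print(partials, net)
```

Keeping the two steps separate is what lets the climate inputs come from one source and the crop sensitivities from another.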

So, what should we do? For agriculture, our findings suggest that sulfate-based solar geoengineering might not work as well as previously thought to limit the damaging effects of climate change. However, there are other sectors of the economy that could potentially benefit substantially from geoengineering (or be substantially damaged, we just don’t know). To continue the metaphor from earlier, just because the first test of an experimental surgery had side effects for a specific part of the human body does not mean that the procedure is always immediately abandoned. There are many illnesses that are so harmful that procedures known to cause side effects are sometimes still worth the risk. Similarly, research into geoengineering should not be entirely abandoned because our analysis demonstrated one adverse side effect; there may remain good reasons to eventually pursue such a strategy despite some known costs. With careful study, humanity will eventually gain a better understanding of this powerful technology. We hope that the methodology developed in this paper might be extended to study the effects of sulfate aerosol injection on ecosystems or human health, and we would be open to collaborating on future studies. Thanks for reading, and I’m excited to hear any thoughts the community may have.