Thursday, June 9, 2016

The Planetary Geometry of Extreme Climate Migration

Adam Sobel and I have a new paper on Potentially Extreme Population Displacement and Concentration in the Tropics Under Non-Extreme Warming.

The paper is a theoretical exercise making a simple point: tropical populations that migrate to maintain their temperature will generally have to move much further than their counterparts in middle latitudes. This results from well-established atmospheric physics that isn't a focus for the researchers who usually think about climate change and migration. One reason this result is interesting is that, in a sense, the physics of the tropical atmosphere "amplifies" the impact of warming (at least in this dimension) so that even small climate changes can produce really large population displacements in the tropics (theoretically). Here's the story...

Monday, May 16, 2016

Do guns kill people or do people kill people? An economist's perspective

Marshall and I think a lot about the economics of violence, so when the debate about gun control started on House of Cards, I started considering the classic,
"Guns don't kill people, people kill people." 
I mulled over this a bit and got in a debate with my wife since it didn't seem immediately obvious to me what this statement meant and whether it was testable.

I think the economist's take on the statement is that guns are a "technology" used to "produce" murder. This framing made it easier for me to think about what people meant in a way that had some testable predictions.

If "people kill people," that means there are other technologies out there that are pretty similar to guns and can easily be used to produce murder. In econospeak: guns have substitutes. This could happen for two reasons. First, there might be other technologies that produce murder at similar cost, where costs include both the psychological burden of committing murder and the convenience of gun technology, in addition to actual dollar costs. Alternatively, there might be technologies out there that are much more costly than guns (think: committing murder with a knife probably has different psychological costs), but the "demand" for murder is so high that people are willing to use those much more costly technologies to get the job done, i.e. demand for murder is inelastic. If this is true, then raising the cost of gun technology (e.g. strengthening gun control measures) won't save lives since people will be so motivated to produce murder they'll just use the alternative technologies, regardless of whether they have to pay a higher cost of doing so.

If "guns kill people," I think that means gun technology is so much better than the next closest substitute that simply the presence of guns affects the likelihood that murder is produced. In order for this to happen, it seems you need two conditions to hold. (1) People would have to be willing to commit murder if they can use a gun, but not if they can't (demand for murder would have to be elastic) and (2) gun technology must substantially lower the cost of committing murder relative to the closest substitute (seems likely to me). If these conditions are true, then one could effectively reduce the total production of murder by raising the cost of using gun technology, forcing people to use costly alternative technologies. If demand for murder is elastic, then some marginal would-be-murderers will find it's not worth the trouble and lives will be saved.

This led me to an alternative framing of the original statement, which seemed much more tractable to me:
"Are guns and knives substitutes?"
My curiosity piqued, I stayed up late hunting down some data to see if there were any obvious patterns.

I found the FBI Uniform Crime Reports, which tabulate homicides by weapon used for each state in Table 20. I created a state-by-year panel for 2005-2014, converting homicide counts into rates with census data, which goes up until 2010. I needed a measure of the "cost" of gun technology, and after a little hunting around (there is amazingly little data on gun-related anything) found Okoro et al (Pediatrics, 2005) which estimated the fraction of homes with a loaded firearm present using a randomized survey.  Obviously, there is a lot of other stuff that goes into making gun use costly, but it seemed to me that having a loaded gun in the house would dramatically lower the cost of using the gun for murder compared to the alternative situation where a gun had to be located and obtained prior to the act. The data set I put together is posted here in case you want to mess around with it.  (I imagine an enterprising graduate student can analyze the time series variation in this data, but I ended up collapsing this to a cross-section for simplicity.)

The first thing I did was to plot the homicide rate (where a gun was the weapon used) for each state against the fraction of homes with a loaded weapon present. Maybe someone has done this before, but I was struck by the plot:

The slope of the fitted line is 0.18, implying that a 1% increase in homes with a loaded firearm is associated with an average of 0.18 additional annual gun murders per 100,000 state residents. This might not be a causal relationship; it's just a correlation. But the idea that "guns don't kill people, people kill people" has the testable prediction that "guns and knives are substitutes." If this is true, then we would expect that in the states where guns are less accessible, people simply commit murder with other weapons. If we assume for the moment that people in all different states have similar demand for murder, then the substitutability of guns and other weapons would lead us to expect to see this:

Monday, April 25, 2016

A decade of advances in climate econometrics

I have a new working paper out reviewing various methods, models, and assumptions used in the econometrics literature to quantify the impact of climatic conditions on society.  My colleagues here on G-FEED have all made huge contributions, in terms of methodological innovations, and it was really interesting and satisfying to try and tie them all together.  The process of writing this was much more challenging than I expected, but rereading it makes me feel like we as a community really learned a lot during the last decade of research. Here's the abstract:
Climate Econometrics (forthcoming in the Annual Reviews)
Abstract: Identifying the effect of climate on societies is central to understanding historical economic development, designing modern policies that react to climatic events, and managing future global climate change. Here, I review, synthesize, and interpret recent advances in methods used to measure effects of climate on social and economic outcomes. Because weather variation plays a large role in recent progress, I formalize the relationship between climate and weather from an econometric perspective and discuss their use as identifying variation, highlighting tradeoffs between key assumptions in different research designs and deriving conditions when weather variation exactly identifies the effects of climate. I then describe advances in recent years, such as parameterization of climate variables from a social perspective, nonlinear models with spatial and temporal displacement, characterizing uncertainty, measurement of adaptation, cross-study comparison, and use of empirical estimates to project the impact of future climate change. I conclude by discussing remaining methodological challenges.
And here are some highlights:

1. There is a fairly precise tradeoff in assumptions that are implied when using cross-sectional vs. panel data methods. When using cross-sectional techniques, one is making strong assumptions about the comparability of different units (the unit homogeneity assumption). When using panel based within-estimators to understand the effects of climate, this unit homogeneity assumption can be substantially relaxed but we require a new assumption (that I term the marginal treatment comparability assumption) in order to say anything substantive about the climate. This is the assumption that small changes in the distribution of weather events can be used to approximate the effects of small changes in the distribution of expected events (i.e. the climate).  In the paper, I discuss this tradeoff in formal terms and talk through techniques for considering when each assumption might be violated, as well as certain cases where one of the assumptions can be weakened.

2. The marginal treatment comparability assumption can be directly tested by filtering data at different frequencies and re-estimating panel models (this is the intuition behind the long-differences approach that David and Marshall have worked on). Basically, we can think of all panel data as superpositions of high and low-frequency data and we can see if relationships between climate variables and outcomes over time hold at these different frequencies.  If they do, this suggests that slow (e.g. climatic) and fast (e.g. weather) variations have similar marginal effects on outcomes, implying that marginal treatments are comparable.  Here's a figure where I demonstrate this idea by applying it to US maize (since we already know what's going on here).  On the left are time series from an example county, Grand Traverse, when the data is filtered at various frequencies. On the right is the effect of temperature on maize using these different data sets.  The basic relationship recovered with annual data holds at all frequencies, changing substantially only in the true cross-section, which is the equivalent of frequency = 0 (this could be because something happens in terms of adaptation at time scales longer than 33 yrs, or because the unit-homogeneity assumption fails for the cross-section).
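The filtering check in highlight 2 can be illustrated with a toy version. This is only a sketch on a single simulated time series, not the panel estimator or the maize data from the paper: the series, the trailing moving-average filter, the window lengths, and `TRUE_EFFECT` are all assumptions made up for illustration. Because the simulated outcome responds identically to fast and slow temperature variation by construction, the estimated slope should stay roughly the same at every frequency; in real data, a slope that drifts as the window grows would be the warning sign.

```python
import random

def moving_average(series, window):
    """Low-pass filter: mean over each run of `window` consecutive points."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

def ols_slope(x, y):
    """Slope from a bivariate least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)
TRUE_EFFECT = -0.5   # hypothetical marginal effect of temperature
n_years = 400

# Temperature = slow trend ("climate") + fast noise ("weather").
temp = [0.01 * t + rng.gauss(0, 1) for t in range(n_years)]
# The outcome responds the same way to both components, plus noise.
outcome = [TRUE_EFFECT * T + rng.gauss(0, 0.5) for T in temp]

# Re-estimate the same relationship after filtering at several frequencies.
slopes = {}
for window in (1, 5, 15, 33):
    slopes[window] = ols_slope(moving_average(temp, window),
                               moving_average(outcome, window))

for window, slope in slopes.items():
    print(f"window = {window:2d} years, estimated slope = {slope:.3f}")
```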

3. I derive a set of conditions when weather can exactly identify the effect of climate (these are sufficient, but not necessary).  I'm pretty sure this is the first time this has been done (and I'm also pretty sure I'll be taking heat for this over the next several months).  If it is true that
  • the outcome of interest (Y) is the result of some maximization, where we think that Y is the maximized value (e.g. profits or GDP),
  • all adaptations to climate can take on continuous values (e.g. I can choose to change how much irrigation to use by continuous quantities),
  • the outcome of interest is differentiable in these adaptations, 
then by the Envelope Theorem the marginal effect of weather variation on Y is identical to the marginal effect of identically structured changes in climate (that "identically structured" idea is made precise in the paper, Mike discussed this general idea earlier). This means that even though "weather is not equal to climate," we still can have that "the effect of weather is equal to the effect of climate" (exactly satisfying the marginal treatment comparability assumption above). If we're careful about how we integrate these marginal effects, then we can use this result to construct estimated effects of non-marginal changes in climate (invoking the Gradient Theorem), which is often what we are after when we are thinking about future climate changes.
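The argument in highlight 3 can be sketched in symbols. The notation here is an assumption made for illustration and is not taken from the paper: $C$ is the climate, $\mathbf{b}$ is the vector of adaptations, $\mathbf{c}(C)$ is the weather realized under climate $C$, and the setting is treated as deterministic for simplicity.

```latex
% Outcome as a maximized value over adaptations b
\[
Y(C) \;=\; \max_{\mathbf{b}} \, f\big(\mathbf{b}, \mathbf{c}(C)\big)
     \;=\; f\big(\mathbf{b}^*(C), \mathbf{c}(C)\big)
\]

% Total derivative with respect to climate: the adaptation terms vanish
% because the first-order conditions hold at the optimum b*
\[
\frac{dY}{dC}
  \;=\; \sum_k \underbrace{\frac{\partial f}{\partial b_k}\bigg|_{\mathbf{b}^*}}_{=\,0}
        \frac{\partial b_k^*}{\partial C}
  \;+\; \sum_j \frac{\partial f}{\partial c_j}\,\frac{\partial c_j}{\partial C}
  \;=\; \sum_j \frac{\partial f}{\partial c_j}\,\frac{\partial c_j}{\partial C}
\]

% Non-marginal climate change: integrate the marginal effects
\[
Y(C_2) - Y(C_1) \;=\; \int_{C_1}^{C_2} \frac{dY}{dC}\, dC
\]
```

The surviving term on the right of the second line is the response to weather with adaptations held fixed, which is what weather variation identifies. The third line is the one-dimensional case of the integration step; with a multi-dimensional climate it becomes a line integral of the gradient.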

Wednesday, April 13, 2016

Very G-FEED-y jobs

We're hiring!

The Global Policy Lab at UC Berkeley is now hiring multiple positions for a major research project at the nexus of environmental resource management, economic development, and econometric modeling. 

All job postings are open and applications will be under review immediately.

Positions available:

1. Post doc - for applicants with a PhD
2. Project manager - for applicants with a Masters or PhD
3. Research analyst - for applicants with a Masters or Bachelor's degree 

All positions are full time, start date approx.: 5 or 6/2016

See job descriptions and application instructions at:

Tuesday, March 1, 2016

Testable predictions

As our blog wakes up from an apparent winter hibernation, I’ve been thinking about some predictions for the spring that came out last fall. One on my mind, of course, is that the Warriors will dominate the NBA. That one seems a pretty safe bet at this point.

A more daring bet was one put out last November on the corn season in 2016 by Weather Trends 360. It’s a pretty entertaining video, which I link to below. I'm not sure it's convincing, but I have to give them credit for making a very testable series of predictions. Specifically:

1) An unusually wet March and April, leading to delayed planting
2) A widespread freeze in early May
3) Very high temperatures and low rainfall in June and July, leading to low yields
4) Corn prices of $7 per bushel by July 2016 (or roughly double what they are right now)

So let’s see what happens. These are very specific and, in some cases, pretty bold predictions. If they turn out to be right, I’ll definitely be taking notice next year. Meanwhile, back to enjoying the Warriors.

Monday, December 21, 2015

From the archives: Friends don't let friends add constants before squaring

I was rooting around in my hard-drive for a review article when I tripped over this old comment that Marshall, Ted and I drafted a while back.

While working on our 2013 climate meta-analysis, we ran across an interesting article by Ole Theisen at PRIO where he coded up all sorts of violence at a highly local level in Kenya to investigate whether local climatic events, like rainfall and temperature anomalies, appeared to be affecting conflict. Theisen was estimating a model analogous to:
and reported finding no effect of either temperature or rainfall. I was looking through the replication code of the paper to check the structure of the fixed effects being used when I noticed something: the squared terms for temperature and rainfall were offset by a constant so that the minimum of the squared terms did not occur at zero:

(Theisen was using standardized temperature and rainfall measures, so they were both centered at zero). This offset was not apparent in the linear terms of these variables, which got us thinking about whether this matters. Often, when working with linear models, we get used to shifting variables around by a constant, usually out of convenience, and it doesn't matter much. But in non-linear models, adding a constant incorrectly can be dangerous.

After some scratching of pen on paper, we realized that

for the squared term in temperature (C is a constant), which when squared gives:

because this constant was not added to the linear terms in the model, the actual regression Theisen was running is:

which can be converted to the earlier intended equation by computing linear combinations of the regression coefficients (as indicated by the underbraces), but directly interpreting the beta-tilde coefficients as the linear and squared effects is not right, except for beta-tilde_2, which is unchanged. Weird, huh? If you add a constant prior to squaring for only the measure that is squared, then the coefficient for that term is fine, but it messes up all the other coefficients in the model.  This didn't seem intuitive to us, which is part of why we drafted up the note.
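The algebra described above can be written out as follows. This is a sketch under simplifying assumptions: it uses a bare quadratic in temperature $T$ with outcome $Y$, and drops the rainfall terms, fixed effects, and any other controls, so it is not the exact specification from the original paper or our note.

```latex
% Intended model
\[
Y \;=\; \beta_0 + \beta_1 T + \beta_2 T^2 + \varepsilon
\]

% Squared term actually used, offset by a constant C before squaring
\[
\tilde{T}^2 \;=\; (T + C)^2 \;=\; T^2 + 2CT + C^2
\]

% Regression actually estimated (C not added to the linear term),
% regrouped to match the intended model
\[
Y \;=\; \tilde{\beta}_0 + \tilde{\beta}_1 T + \tilde{\beta}_2 (T + C)^2 + \varepsilon
  \;=\; \underbrace{\big(\tilde{\beta}_0 + \tilde{\beta}_2 C^2\big)}_{\beta_0}
  \;+\; \underbrace{\big(\tilde{\beta}_1 + 2C\tilde{\beta}_2\big)}_{\beta_1}\, T
  \;+\; \underbrace{\tilde{\beta}_2}_{\beta_2}\, T^2 + \varepsilon
\]
```

Reading off the underbraces: $\tilde{\beta}_2 = \beta_2$, but $\tilde{\beta}_1 = \beta_1 - 2C\beta_2$, so the reported linear coefficient mixes the linear and squared effects.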

To check this theory, we swapped out the T-tilde-squared measures for the correct T-squared measures and re-estimated the model in Theisen's original analysis. As predicted, the squared coefficients don't change, but the linear effects do:
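The same check is easy to reproduce on simulated data. The sketch below is not the replication code or data from the original analysis: the sample size, the true coefficients, the offset `C`, and the helper names `solve` and `ols` are all invented for illustration, and the model is a bare quadratic with no fixed effects.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(columns, y):
    """OLS coefficients from the normal equations (X'X) beta = X'y."""
    k = len(columns)
    XtX = [[sum(a * b for a, b in zip(columns[i], columns[j]))
            for j in range(k)] for i in range(k)]
    Xty = [sum(a * b for a, b in zip(columns[i], y)) for i in range(k)]
    return solve(XtX, Xty)

rng = random.Random(1)
n = 500
C = 3.0                      # hypothetical offset added before squaring
b0, b1, b2 = 1.0, 0.8, -0.2  # hypothetical true coefficients

T = [rng.gauss(0, 1) for _ in range(n)]  # standardized temperature
y = [b0 + b1 * t + b2 * t * t + rng.gauss(0, 0.1) for t in T]
ones = [1.0] * n

# Intended regression: constant, T, T^2
correct = ols([ones, T, [t * t for t in T]], y)
# Regression with the offset: constant, T, (T + C)^2
offset = ols([ones, T, [(t + C) ** 2 for t in T]], y)

# Linear combination that recovers the intended linear effect
recovered_b1 = offset[1] + 2 * C * offset[2]

print("squared term:  correct %.3f, offset %.3f" % (correct[2], offset[2]))
print("linear term:   correct %.3f, offset %.3f" % (correct[1], offset[1]))
print("recovered linear term from offset fit: %.3f" % recovered_b1)
```

The two design matrices span the same column space, so the fitted values are identical; only the labeling of the coefficients changes, which is why the linear combination recovers the intended effect exactly.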

This matters substantively, since the linear effect of temperature had appeared to be insignificant in the original analysis, leading Theisen to conclude that Marshall and Ted might have drawn incorrect conclusions in their 2009 paper finding temperature affected conflict in Africa. But just removing the offending constant term revealed a large positive and significant linear effect of temperature in this new high resolution data set, agreeing with the earlier work. It turns out that if you compute the correct linear combination of coefficients from Theisen's original regression (stuff above the brace for beta_1 above), you actually see the correct marginal effect of temperature (and it is significant).

The error was not at all obvious to us originally, and we guess that lots of folks make similar errors without realizing it. In particular, it's easy to show that a similar effect shows up if you estimate interaction effects incorrectly (after all, temperature-squared is just an interaction with itself).

Theisen's construction of this new data set is an important contribution, and when we emailed this point to him he was very gracious in acknowledging the mistake. This comment didn't get seen widely because when we submitted it to the original journal that published the article, we received an email back from the editor stating that the "Journal of Peace Research does not publish research notes or commentaries."

This holiday season, don't let your friends drink and drive or add constants the wrong way in nonlinear models.

Monday, December 14, 2015

The right way to overfit

As the Heisman Trophy voters showed again, it is super easy to overfit a model. Sure, the SEC is good at playing football. But that doesn’t mean that the best player in their league is *always* the best player in the country. This year I don’t think it was even close.

At the same time, there are still plenty of examples of overfitting in the scientific literature. Even as datasets become larger, this is still easy to do, since models often have more parameters than they used to. Most responsible modelers are pretty careful about presenting out-of-sample errors, but even that can get misleading when cross-validation techniques are used to select models, as opposed to just estimating errors.

Recently I saw a talk here by Trevor Hastie, a colleague at Stanford in statistics, which presented a technique he and Brad Efron have recently started using that seems more immune to overfitting. They call it spraygun, which doesn’t seem too intuitive a description to me. But who am I to question two of the giants of statistics?

Anyhow, a summary figure he presented is below. The x-axis shows the degree of model variance or overfitting, with high values on the left-hand side, and the y-axis shows the error on a test dataset. In this case they’re trying to predict beer ratings from over 1M samples. (Statistics students will know beer has always played an important role in statistics, since the origin of the “t-test”.) The light red dots show the out-of-sample error for a traditional lasso model fit to the full training data. The dark red dots show models fit to subsets of the data, which unsurprisingly tend to overfit sooner and have worse overall performance. But what’s interesting is that the average of the predictions from these overfit models does nearly as well as the model fit to the full data, until the tuning parameter is turned up enough that the full model overfits. At that point, the average of the models fit to subsets of the data continues to perform well, with no notable increase in out-of-sample error. This means one doesn’t have to be too careful about optimizing the calibration stage. Instead just (over)fit a bunch of models and take the average.
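The averaging idea is easy to play with on a toy problem. To be clear, the sketch below is not spraygun and does not use the lasso or the beer data: it swaps in a 1-nearest-neighbor predictor as a deliberately overfit model, and the simulated data, subset size, and number of models are all assumptions chosen for illustration. It fits one overfit model per random subset and compares the test error of the averaged prediction with the errors of the individual models.

```python
import math
import random

rng = random.Random(42)

def truth(x):
    """Hypothetical smooth signal that the models are trying to learn."""
    return math.sin(2 * math.pi * x)

def sample(n):
    """Draw n noisy (x, y) observations of the signal."""
    return [(x, truth(x) + rng.gauss(0, 0.3))
            for x in (rng.random() for _ in range(n))]

def nn_predict(train, x):
    """1-nearest-neighbor prediction: memorizes the training noise."""
    return min(train, key=lambda point: abs(point[0] - x))[1]

def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

train = sample(600)
test = sample(200)
test_y = [y for _, y in test]

n_models, subset_size = 20, 30
subsets = [rng.sample(train, subset_size) for _ in range(n_models)]

# One overfit model per subset, each predicting every test point.
per_model_preds = [[nn_predict(s, x) for x, _ in test] for s in subsets]
individual_mse = [mse(p, test_y) for p in per_model_preds]

# Ensemble prediction = simple average across the overfit models.
ensemble_preds = [sum(col) / n_models for col in zip(*per_model_preds)]
ensemble_mse = mse(ensemble_preds, test_y)
mean_individual_mse = sum(individual_mse) / n_models

print("mean test MSE of individual models: %.3f" % mean_individual_mse)
print("test MSE of averaged prediction:    %.3f" % ensemble_mse)
```

With squared error, the averaged prediction can never do worse than the average of the individual models' errors (this follows from convexity), so the interesting empirical question is how much better it does, which depends on how correlated the individual models' mistakes are.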

This obviously relates to the superior performance of ensembles of process-based models, such as I discussed in a previous post about crop models. Even if individual models aren't very good, because they are overfit to their training data or for other reasons, the average model tends to be quite good. But in the world of empirical models, maybe we have also been too guilty of trying to find the ‘best’ model for a given application. This maybe makes sense if one is really interested in the coefficients of the model, for instance if you are obsessed with the question of causality. But often our interest in models, and even in identifying causality, is that we just want good out-of-sample prediction. And for causality, it is still possible to look at the distribution of parameter estimates across the individual models.

Hopefully for some future posts one of us can test this kind of approach on models we’ve discussed here in the past. For now, I just thought it was worth calling attention to. Chances are that when Trevor or Brad have a new technique, it’s worth paying attention to. Just like it’s worth paying attention to states west of Alabama if you want to see the best college football player in the country.

Monday, December 7, 2015

Warming makes people unhappy: evidence from a billion tweets (guest post by Patrick Baylis)

Everyone likes fresh air, sunshine, and pleasant temperatures. But how much do we like these things? And how much would we be willing to pay to gain more of them, or to prevent a decrease in the current amount that we get?

Clean air, sunny days, and moderate temperatures can all be thought of as environmental goods. If you're not an environmental economist, it may seem strange to think about different environmental conditions as "goods". But, if you believe that someone prefers more sunshine to less and would be willing to pay some cost for it, then a unit of sunshine really isn't conceptually much different from, say, a loaf of bread or a Playstation 4.

The tricky thing about environmental goods is that they're usually difficult to value. Most of them are what economists call nonmarket goods, meaning that we don't have an explicit market for them. So unlike a Playstation 4, I can't just go to the store and buy more sunshine or a nicer outdoor temperature (or maybe I can, but it's very, very expensive). This also makes it more challenging to study how much people value these goods. Still, there is a long tradition in economics of using various nonmarket valuation methods to study this kind of problem.

New data set: a billion tweets

Wednesday, December 2, 2015

Renewable energy is not as costly as some think

The other day Marshall and Sol took on Bjorn Lomborg for ignoring the benefits of curbing greenhouse gas emissions.  Indeed.  But Bjorn, among others, is also notorious for exaggerating costs.  The fact is that most serious estimates of the cost of reducing emissions are fairly low, and there is good reason to believe cost estimates are too high for the simple fact that analysts cannot measure or imagine all ways we might curb emissions.  Anything analysts cannot model translates into cost exaggeration.

Hawai`i is a good case in point.  Since moving to Hawai`i I've started digging into energy, in large part because the situation in Hawai`i is so interesting.  Here we make electricity mainly from oil, which is super expensive.  We are also rich in sun and wind.  Add these facts to Federal and state subsidies and it spells a remarkable energy revolution.  Actually, renewables are now cost effective even without subsidies.

In the video below Matthias Fripp, who I'm lucky to be working with now, explains how we can achieve 100% renewable energy by 2030 using current technology at a cost that is roughly comparable to our conventional coal and oil system. In all likelihood, with solar and battery costs continuing to fall, this goal could be achieved for a cost that's far less.  And all of this assumes zero subsidies.

One key ingredient:  We need to shift electric loads toward the supply of renewables, and we could probably do this with a combination of smart variable pricing and smart machines that could help us shift loads.  More electric cars could help, too.  I'm sure some could argue with some of the assumptions, but it's hard to see how this could be wildly unreasonable.

Monday, November 23, 2015

In cost-benefit calculation of climate action, Bjorn Lomborg forgets benefits

Sol and I had a letter to the editor in the Wall Street Journal today, responding to an earlier editorial by Bjorn Lomborg.  Below is what we wrote.  The best part is that I am "Prof. Marshall Burke" and Sol is "Solomon Hsiang, PhD" and we are both at Berkeley.  Apparently bylines are not the purview of the WSJ fact-checker (although Sol does have a PhD and is at Berkeley).


Bjorn Lomborg's "Gambling the World Economy on Climate" (op-ed, Nov. 17) argues that emissions reductions are bad investments because of cost. But he never considers the value of the asset we are buying. Smart policy should carefully weigh the costs and benefits of possible actions and pursue those that yield the strongest return for society. Mr. Lomborg became famous advocating for this approach, but now he seems to forget his own lesson.
Our research shows that the climate is a valuable asset and paying billions to prevent it depreciating is a bargain. Our recent study published in Nature shows rising temperatures could cost 23% of global GDP by 2100 -- and that there is a 50-50 chance it could be worse. Mr. Lomborg rightly advocates for lifting up the world's poor, but we calculate that failing to address climate change will cost the poorest 40% of countries three-quarters of their income. By 2030 alone, we show that climate change could reduce annual global GDP by $5 trillion.
These are only the effects of temperature on productivity. Other impacts will add to the price tag. For example, we estimate avoiding intensification of tropical cyclones from climate change is worth about $10 trillion. And warming could increase conflict roughly 30% in 2050; what is that worth?
Mr. Lomborg says that $730 billion a year in 2030 is too much to pay to avoid many trillions in losses. This math is easy.
Prof. Marshall Burke
Solomon Hsiang, Ph.D.
University of California, Berkeley
Berkeley, Calif.