G-FEED

Friday, January 18, 2013

Blame it on the rain?

One of the things I’ve been puzzling over lately is why so many agronomists and others in agriculture seem to have the mantra that July rainfall makes the corn crop. These are generally smart people whose opinion I respect. For example, I’m a fan of Scott Irwin and Darrel Good’s blog over at Farmdoc daily, where they recently discussed the prospects for next July’s rainfall. They are definitely not the only ones to focus on rainfall. Whenever I present empirical work on weather and yield to a group of agronomists, at least one will invariably argue that the strong temperature effects are just an artifact of temperature tending to be high when rainfall is low. One problem with that argument is that a lot of the recent empirical work shows temperature as being important, and not all of it uses datasets with high correlations between temperature and rainfall.

Usually I try to explain the various mechanisms that link temperature to yields. And these discussions can often lead to interesting studies about which mechanisms matter more, such as one paper we have coming out soon. But I never really delved into the reason that rainfall is given so much credit for good or bad years. One likely reason is it’s just a lot easier to see whether or not it rains than to detect a shift in temperatures. So people will tend to remember dry years more than hot years. But there’s also some analysis that appears to support the rainfall hypothesis.

A lot of the work on US corn and weather traces back to Louis M. Thompson’s work 30 years ago. These were basically time series models at the state level, looking for instance at how Illinois yields changed over time in relation to weather. More recent work has updated this type of analysis, and emphasizes the role of July rainfall and temperature, but with a bigger role for rainfall. To recreate that type of analysis, I plot below detrended corn yields for Illinois (detrended by fitting a linear slope to represent gradual technology change, and then adjusting all years to 2006 technology) versus July rainfall (prec) and average maximum temperature (Tmax). I also plot prec and Tmax against each other. (Correlation coefficients are given in the bottom panels). Thanks to Wolfram for providing updated data.
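The detrending step described above can be sketched in a few lines. This is a minimal illustration, not the code behind the plots: it assumes an additive adjustment (each year's departure from the fitted linear trend is added to the trend value in the reference year), and the function name and the yield numbers are made up for the example.

```python
def detrend_to_reference_year(years, yields, ref_year):
    """Fit a linear trend by least squares, then express each year's yield
    as if it had been grown with reference-year technology."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(yields) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, yields))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    trend_at_ref = intercept + slope * ref_year
    # Departure from the trend in each year, shifted to the reference-year level
    # (additive adjustment is an assumption of this sketch).
    return [y - (intercept + slope * x) + trend_at_ref
            for x, y in zip(years, yields)]


# Hypothetical yields (bushels/acre), for illustration only.
years = [2000, 2001, 2002, 2003, 2004]
yields = [140.0, 150.0, 130.0, 160.0, 155.0]
adjusted = detrend_to_reference_year(years, yields, ref_year=2006)
```

A multiplicative adjustment (scaling by the ratio of trend values) would be an equally plausible choice; the two differ mostly when the trend is steep relative to the yield level.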

You can see that yields are clearly low at low levels of rainfall, and that yields are also low at high Tmax. But you can also see that Tmax and rainfall have a strong negative correlation, which makes it hard to say whether Tmax, rainfall, or some combination (or neither) is the actual cause of yield loss. I also show three recent years in color (red = 2009, green = 2010, blue = 2011). What’s mildly interesting about these years is they don’t follow the normal correlation between Tmax and rainfall. 2009 was especially cool but with medium rainfall, and 2010 and 2011 were both unusually warm for the given amount of rainfall.

The main point, though, is that the collinearity issue cuts both ways. You can’t just decide it’s rainfall and say that temperatures are less important, any more than you can decide it’s all temperature. Many empirical studies try to move beyond these simple time series specifically because of the collinearity problem. One way to reduce collinearity between Tmax and rainfall is to restrict yourself to looking over a narrow range of one of the variables, so that it is essentially being held fixed while the other one is varying. This is hard to do with a time series that is about 50 years long in total, because you quickly run into problems with small sample sizes.

But it’s easier to do this if we look at time series from lots of counties at the same time, or a so-called panel analysis. As a simple illustration, the plot below shows all points for 1950-2011 for counties in the “three I” states: Illinois, Iowa, and Indiana. The left-hand plot just shows the same scatter as before between Tmax and rainfall. Notice there is still a strong negative correlation. But we can now select only points that are within a narrow range of rainfall (shown as green points) or a narrow range of temperature (shown as red). Then we can take the red points and see how rainfall matters when holding temperature constant (middle panel). Or we can take the green points and see how temperature matters when holding precipitation constant (right panel). The black lines in the right two panels show local polynomial fits to the data (using loess in R).
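The selection step can be sketched as follows. This is an illustration under stated assumptions, not the code used for the plots: the data are synthetic, the band half-widths (0.5C and 10mm) are guesses, and simple binned means stand in for the loess fit, which the post did in R.

```python
import random


def within_band(rows, key, center, half_width):
    """Keep rows whose value of `key` lies within center +/- half_width."""
    return [r for r in rows if abs(r[key] - center) <= half_width]


def binned_means(rows, x_key, y_key, bin_width):
    """Average y within bins of x; a crude stand-in for a loess curve."""
    bins = {}
    for r in rows:
        b = int(r[x_key] // bin_width)
        bins.setdefault(b, []).append(r[y_key])
    return {b * bin_width: sum(v) / len(v) for b, v in sorted(bins.items())}


# Synthetic county-years with negatively correlated Tmax and rainfall.
# All coefficients here are invented purely to have something to plot.
rng = random.Random(0)
rows = []
for _ in range(2000):
    prec = rng.uniform(20.0, 200.0)
    tmax = 32.5 - 0.025 * prec + rng.gauss(0.0, 1.2)
    yld = 150.0 - 5.0 * max(tmax - 29.0, 0.0) + 0.02 * prec + rng.gauss(0.0, 8.0)
    rows.append({"prec": prec, "tmax": tmax, "yield": yld})

red = within_band(rows, "tmax", center=30.0, half_width=0.5)     # temperature held fixed
green = within_band(rows, "prec", center=100.0, half_width=10.0)  # rainfall held fixed

rain_effect = binned_means(red, "prec", "yield", bin_width=25.0)
temp_effect = binned_means(green, "tmax", "yield", bin_width=1.0)
```

The narrower the band, the closer the comparison is to truly holding the other variable fixed, but the fewer points are left, which is why pooling many counties helps.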

What do we see? Well, at least for these particular places and values of Tmax and rainfall, there does not appear to be much effect of changing rainfall when July Tmax is constant (at around 30C). But temperature changes do appear to be important when rainfall is held constant (at around 100mm).

Now, there are obviously lots of other combinations we could try, and the point isn’t that rainfall is always and completely unimportant. For example, if we hold Tmax constant at a higher level, where I’d expect rainfall to be more critical, we do in fact see a bigger effect of rainfall at low levels (see below). But if nothing else, it should be clear that a lot of the credit given to July rainfall for US corn is not necessarily well deserved. Coincidentally, the singers of “blame it on the rain” also got a lot of undeserved credit!

In future posts, I’ll try to get more into the reasons that temperature can dominate the effects of rainfall, even in a rainfed system.

Wednesday, January 2, 2013

Some new - and not that hopeful - evidence on adaptation

Quantitative estimates of the potential impacts of future climate change are an important input to policy discussions, and researchers across a wide range of disciplines are starting to get in on the action. Most impact studies proceed something like this:

1. Choose an outcome and location of interest (say, corn yields in the US)
2. Assemble some historical data and investigate how this outcome has responded to past changes in climate in that location
3. Use these historical estimates to say something about what might happen in the future, given estimates of future changes in the climate variables you examined in (2).

A key assumption in this approach is that past responses to climate variables (step 2) tell you something about how future populations might respond to similar changes (step 3). But one problem is that there is often a mismatch between the types of climatic changes that are used to estimate responses in step 2, and the types of future changes in climate that we are particularly worried about in step 3.

For instance, the historical response of (say) corn yields to climate is often estimated using year-to-year variation in temperature (what we typically call "weather") -- e.g. these estimates ask, if temperatures at a given location were 1C hotter than average in a given year, how much lower were corn yields in that year? But the future climate changes that we're mainly worried about are gradual changes in temperature and precipitation that will play out over many decades. The worry is then that if farmers can recognize and adapt to these gradual changes, for instance by changing when they plant or switching the cultivar or crop they grow, and these adaptations offset some or all of the losses they otherwise would have experienced, then estimates of "short-run" responses to climate fluctuations might be a poor guide to the impacts of longer-run more gradual changes.

How can we know whether longer run responses to gradual changes in climate differ from shorter-run responses to climate fluctuations? In a new working paper, Kyle Emerick and I try to shed some light on this by exploiting the surprisingly large variation in recent longer-run trends in temperature and precipitation across US agricultural areas. It turns out that some US counties have warmed up a bunch over the last 30 years, and others have actually cooled slightly. Why this has happened is an active area of climate research, but appears tied to some combination of aerosols and natural climate variability (e.g. see new paper by Meehl et al, or google "U.S. warming hole" for more on this - probably better to have safe search on…).

Below is a plot of changes in average growing season temperature and precipitation between 1980 and 2000 for US counties east of the 100th meridian. The color scale for each map is given by the histogram beneath, and as you can see, some places have cooled almost half a degree C, others warmed 1.5C, and some counties have seen precip fall by 40% and others rise by 40%.

Change in average growing season temperature, precipitation, and corn yield between 1980 and 2000.

What we do in the paper is estimate how yields of the main US crops (corn and soy) have responded to these gradual, multi-decade changes in climate. We can then compare these longer run responses to estimates of shorter-run response to inter-annual fluctuations in temperature and precipitation. The difference between these two responses is our quantitative estimate of "adaptation". If farmers indeed have a lot of options available to them in the long run that are not available in the short run, then we would expect exposure to hot temperatures to be much less damaging in the long run than in the short run.
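The comparison can be illustrated with a toy calculation. This is only a sketch of the general idea and not the specification in the paper: the short-run response is taken here as the slope of county-demeaned yields on county-demeaned temperature, the long-run response as the slope of the change in period-average yield on the change in period-average temperature across counties, and "adaptation" as the difference between the two. The function names and the three-county panel are hypothetical, and the panel is constructed so that both slopes are identical.

```python
def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def short_run_slope(panel):
    """Response to year-to-year fluctuations: demean within each county."""
    xs, ys = [], []
    for obs in panel.values():
        mean_t = sum(t for t, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        xs.extend(t - mean_t for t, _ in obs)
        ys.extend(y - mean_y for _, y in obs)
    return ols_slope(xs, ys)


def long_run_slope(panel, window):
    """Response to gradual change: difference the averages of the last and
    first `window` years in each county, then compare across counties."""
    d_temp, d_yield = [], []
    for obs in panel.values():
        start, end = obs[:window], obs[-window:]
        d_temp.append(sum(t for t, _ in end) / window - sum(t for t, _ in start) / window)
        d_yield.append(sum(y for _, y in end) / window - sum(y for _, y in start) / window)
    return ols_slope(d_temp, d_yield)


# Hypothetical panel: county -> list of (temperature, yield) by year.
panel = {
    "a": [(20.0, 60.0), (21.0, 58.0), (22.0, 56.0), (23.0, 54.0)],
    "b": [(25.0, 70.0), (25.5, 69.0), (27.0, 66.0), (28.0, 64.0)],
    "c": [(22.0, 66.0), (22.0, 66.0), (22.5, 65.0), (22.5, 65.0)],
}
beta_short = short_run_slope(panel)
beta_long = long_run_slope(panel, window=2)
adaptation = beta_long - beta_short  # near zero means little offsetting in the long run
```

If farmers could offset heat damage over longer horizons, the long-run slope would be less negative than the short-run slope and the difference would be positive.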

We find very little evidence for longer-run adaptation to climate: yield responses to longer-run increases in temperature are large, negative, and statistically indistinguishable from responses to shorter-run (annual) fluctuations in temperature. This appears true for both corn and soy, and doesn't seem to depend much on the period over which we measure the changes or how we do the econometrics.

From a future impact perspective this is bad news, because it would imply that farmers do not appear to currently have a lot of options for mitigating damages from extreme heat exposure. Indeed, we can generate impact estimates by combining our estimates of these recent longer-run responses with projections of future climate change, and our median projection is a roughly 15% decline in corn yields by 2050 -- almost on par with yield declines during the well-publicized 2012 heat wave/drought across the US.

Nevertheless, there are some potential concerns with our approach and with our impact estimates. One is that we are focusing on yield impacts, and it could be the case that farmers are making other adjustments that a narrow focus on yield will not pick up -- for instance they could be switching to other crops that we are not looking at, or they could be getting out of agriculture altogether, planting golf courses instead of corn. Were this the case, though, we would expect to see substantial declines in area planted to corn in the counties that warmed dramatically, and we find little evidence that this is the case.

Another concern is that maybe farmers have simply not realized that temperatures are changing, and that if future changes are particularly large or salient, they will be recognized and quickly responded to. We have a couple responses to this. The first is that recognition is an important part of adaptation: because warming over the past few decades in many counties is roughly on par with warming expected over the next 3-4 decades, it's not immediately clear how near-term future changes in climate will somehow be way more obvious than past changes. Second, we show in the paper that some factors that you might think could be correlated with recognition of or belief in recent climate change do not predict how counties responded to recent warming. In particular, neither counties that are better educated (who might have better access to information on climate change, be loyal readers of G-FEED, etc) nor counties who voted more Democratic in the 2000 Presidential election (who studies show are more likely to believe in climate change) were any less harmed by recent warming than less educated or more Republican counties. These are obviously not perfect tests of "recognition", but were the best we could think of (other suggestions welcome!).

A final concern is that we are again relying on historical relationships to make claims about the future impacts. That is, although we now use historical longer-run responses (instead of short-run responses) to predict impacts under future climate changes, we are still assuming that past responses to climate variables are a guide to how future populations will respond. While in principle you could arbitrarily specify any relationship between past response and future response that you wanted to (e.g. future responses to 1C will be half as large as past responses), our "business-as-usual" assumption seems the most relevant to the policy question at hand: what's a reasonable estimate of what might happen in the absence of new investments in adaptation. To us, our findings of substantial losses in the face of past long-run changes in climate suggest that it would be dangerous to assume that farmers will easily adapt to future changes. More likely, adaptation is going to require investment, and without these investments the summer of 2012 is not a bad picture of what the "new normal" might look like.

Friday, December 21, 2012

The good and bad of fixed effects

If you ever want to scare an economist, the two words "omitted variable" will usually do the trick. I was not trained in an economics department, but I can imagine they drill it into you from the first day. It’s an interesting contrast to statistics, where I have much of my training, where the focus is much more on out-of-sample prediction skill. In economics, showing causality is often the name of the game, and it’s very important to make sure a relationship is not driven by a “latent” variable. Omitted variables can still be important for out-of-sample skill, but only if their relationships with the model variables change over space or time.

A common way to deal with omitted variable bias is to introduce dummy variables for space or time units. These “fixed effects” greatly reduce (but do not completely eliminate) the chance that a relationship is driven by an omitted variable. Fixed effects are very popular, and some economists seem to like to introduce them to the maximum extent possible. But as any economist can tell you (another lesson on day one?), there are no free lunches. In this case, the cost of reducing omitted variable problems is that you throw away a lot of the signal in the data.

Consider a bad analogy (bad analogies happen to be my specialty). Let’s say you wanted to know whether being taller caused you to get paid more. You could simply look at everyone’s height and income, and see if there was a significant correlation. But someone could plausibly argue that omitted variables related to height are actually causing the income variation. Maybe very young and old people tend to get paid less, and happen to be shorter. And women get paid less and tend to be shorter. And certain ethnicities might tend to be discriminated against, and also be shorter. And maybe living in a certain state that has good water makes you both taller and smarter, and being smarter is the real reason you earn more. And on and on and on we could go. A reasonable response would be to introduce dummy variables for all of these factors (gender, age, ethnicity, location). Then you’d be looking at whether people who are taller than average given their age, sex, ethnicity, and location get paid more than an average person of that age, sex, ethnicity, and location.

In other words, you end up comparing much smaller changes than if you were to look at the entire range of data. This helps calm the person grumbling about omitted variables (at least until they think of another one), and would probably be ok in the example, since all of these things can be measured very precisely. But think about what would happen if we only could measure age and income with 10% error. Taking out the fixed effects means removing a lot of the signal but not any of the noise, which means in statistical terms that the power of the analysis goes down.

Now to a more relevant example. (Sorry, this is where things may get a little wonkish, as Krugman would say). I was recently looking at some data that colleagues at Stanford and I are analyzing on weather and nutritional outcomes for district level data in India. As in most developing countries, the weather data in India are far from perfect. And as in most regression studies, we are worried about omitted variables. So what is the right level of fixed effects to include? Inspired by a table in a recent paper by some eminent economists (including a couple who have been rumored to blog on G-FEED once in a while), I calculated the standard deviation of residuals from regressions on different levels of fixed effects. The 2nd and 3rd columns in the table below show the results for summer (June-September) average temperatures (T) and rainfall (P). Units are not important for the point, so I’ve left them out:

Fixed effects        sd(T)   sd(P)   Cor(T1,T2)   Cor(P1,P2)
No FE                3.89    8.50    0.92         0.28
Year FE              3.89    4.66    0.93         0.45
Year + State FE      2.20    2.18    0.84         0.26
Year + District FE   0.30    1.63    0.33         0.22

The different rows here correspond to the raw data (no fixed effect), after removing year fixed effects (FE), year + state FE, and year + district FE. Note how including year FE reduces P variation but not T, which indicates that most of the T variation comes from spatial differences, whereas a lot of the P variation comes from year-to-year swings that are common to all areas. Both get further reduced when introducing state FE, but there’s still a good amount of variation left. But when going to district FE, the variation in T gets cut by more than a factor of 7, from 2.20 to 0.30! That means the typical temperature deviation a regression model would be working with is less than a third of a degree Celsius.
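For readers who want to build a table like this for their own data, here is a minimal sketch of the residual calculation. It assumes a balanced panel (every unit observed in every year), in which case regressing on year and unit dummies is equivalent to subtracting year means and unit means and adding back the grand mean. The function names and the toy numbers are hypothetical, and the population standard deviation is used for simplicity.

```python
from statistics import mean, pstdev


def group_means(data, index):
    """Mean of the values sharing the same key component (0 = unit, 1 = year)."""
    groups = {}
    for key, value in data.items():
        groups.setdefault(key[index], []).append(value)
    return {g: mean(v) for g, v in groups.items()}


def residual_sd(data, year_fe=False, unit_fe=False):
    """Standard deviation of what is left after removing the chosen fixed
    effects. Assumes a balanced panel keyed by (unit, year)."""
    grand = mean(list(data.values()))
    unit_mean = group_means(data, 0)
    year_mean = group_means(data, 1)
    resid = []
    for (unit, year), value in data.items():
        r = value - grand
        if year_fe:
            r -= year_mean[year] - grand
        if unit_fe:
            r -= unit_mean[unit] - grand
        resid.append(r)
    return pstdev(resid)


# Toy temperatures for two districts over three years (made-up numbers).
temps = {
    ("d1", 2001): 24.1, ("d1", 2002): 24.6, ("d1", 2003): 23.9,
    ("d2", 2001): 29.8, ("d2", 2002): 30.5, ("d2", 2003): 29.9,
}
sd_none = residual_sd(temps)
sd_year = residual_sd(temps, year_fe=True)
sd_both = residual_sd(temps, year_fe=True, unit_fe=True)
```

With unbalanced data the demeaning shortcut no longer matches the dummy-variable regression exactly, so a proper regression routine would be the safer choice there.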

None of this is too interesting, but the 4th and 5th columns are where things get more related to the point about signal to noise. There I’m computing the correlation between two different datasets of T or P (details of which ones are not important). When there is a low correlation between two datasets that are supposed to be measuring the same thing, that’s a good indication that measurement error is a problem. So I’m using this correlation here as an indication of where fixed effects may really cause a problem with signal to noise.

Two things to note. First is that precipitation data seems to have a lot of measurement issues even before taking any fixed effects. Second is that temperature seems ok, at least until state fixed-effects are introduced (a correlation of 0.842 indicates some measurement error, but still more signal than noise). But when district effects are introduced, the correlation plummets by more than half.
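A small simulation shows why the cross-dataset correlation drops once fine-grained fixed effects are removed. Everything here is synthetic and the variances are assumptions chosen for illustration: two "datasets" measure the same true temperature with independent errors, most of the true variation is between units, and the within-unit signal has the same variance as the measurement error.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def demean_by_group(values, groups):
    """Subtract each group's mean, i.e. remove group fixed effects."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


rng = random.Random(0)
groups, truth = [], []
for unit in range(50):
    unit_level = rng.gauss(0.0, 3.0)                     # large spatial differences
    for _ in range(20):
        groups.append(unit)
        truth.append(unit_level + rng.gauss(0.0, 0.3))   # small year-to-year signal

# Two datasets observing the same truth with independent measurement error.
data1 = [t + rng.gauss(0.0, 0.3) for t in truth]
data2 = [t + rng.gauss(0.0, 0.3) for t in truth]

raw_corr = pearson(data1, data2)
within_corr = pearson(demean_by_group(data1, groups),
                      demean_by_group(data2, groups))
```

Under these assumptions the raw correlation should be high, because the large between-unit differences swamp the measurement error, while the within-unit correlation should be around one half in expectation, since demeaning removes most of the signal and none of the noise.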

The take-home here is that fixed effects may be valuable, even indispensable, for empirical research. But like turkey at Thanksgiving, or presents at Christmas, more of a good thing is not always better.

UPDATE: If you made it to the end of this post, you are probably nerdy enough to enjoy this related cartoon in this week's Economist.

Wednesday, December 12, 2012

What poop tells us about the social impacts of climate change

A growing literature in paleoclimate and archeology explores the extent to which past fluctuations in climate have shaped the evolution of human societies. These papers get to tackle pretty sexy topics: did climate help cause the collapse of the Maya? Is climate implicated in dynastic transitions in China? How about in the fall of Angkor Wat?

That a lot of these papers are answering "yes" to the question of whether climate is implicated in large historical social upheavals could tell us something important about the impact of future climatic changes on social outcomes. But there are a few things you might worry about in this literature. One is whether the studies are actually measuring what they say they're measuring -- i.e. that they're picking up meaningful variation in human activity, and that changes in societies and in climate happened when they say they did. Most of the published papers you see on these topics spend most of their time convincing you that this is the case, and given my mere hobbyist's understanding of paleolimnology I have to take them at their word.

The second concern is one that is more familiar to folks that are used to running regressions: can we say with certainty that the variations in climate are causally linked to the socioeconomic variation of interest? The hard part with these papers is that they're often dealing with one-off events -- e.g. the collapse of the Maya -- that don't give you the repeated observations you need to carry out the typical statistical tests. Basically, you'd be worried that even though the collapse event you measured was coincident with a large climate shock, by chance something unobserved might also have happened at the same time that in fact caused the collapse. Given this, you might be worried that these studies are looking under the proverbial lamppost for the proverbial keys: we'd like to observe the universe of all climate events and all collapse events over time, but instead we focus on a few iconic ones.

A new paper in PNAS helps overcome some of these concerns. D'Anjou and coauthors use coprostanol concentrations (Wikipedia: chemical compounds found in fecal matter of higher order mammals - i.e. poop) that they dug up in a Norwegian lake to estimate the variation in local human activity in the nearby area over the last 2000 or so years. They then compare this to existing reconstructions of local summertime temperature, which is the time of year when agriculture would have been possible. The nice thing about their paper is that they have a lot of observations of the same place over time, and so can run some of the basic statistical tests you often want to see in these papers (and can't). The other nice thing is that poop appears to be a much better indicator of human activity than many of the proxies used in the past, which could have been directly affected by climate (e.g. charcoal from fires, which could have been manmade or could have risen naturally as temperatures changed).

Here is the money plot comparing poop and temperature (their Figure 5):

While there are a couple things you can still complain about -- e.g. you probably want to see Panel C as a plot of the time-detrended data -- this to me is one of the more convincing relationships that has shown up in these Paleo papers. As in other studies looking at cold regions, they show that human activity responded strongly and positively to warmer temperatures: drops of ~4C caused total abandonment of (poop-related) human activity in the region.

While both the broader welfare effects and the modern implications of this and related studies are not immediately obvious (did people die or just migrate south? what do Iron Age societies' sensitivities to climate imply for modern societies?), the methodological differences between this and most of the past studies are to me a nice contribution. And hopefully the grad students who had to dig up the poop got a PNAS paper out of the deal...

Tuesday, December 11, 2012

The summer of 2013

Last week at AGU I gave a talk about the lessons of the US corn harvest in 2011 and 2012, both of which were below trend line (see figure). That got me thinking a little more about what to look for in 2013. The obvious point is that it is likely to be better than 2012, because it can’t get much worse. But that’s not too insightful; it’s like saying that Cal’s football team will be better next year, since they were so bad this year. (By the way, welcome to Max Aufhammer, our newest blogger! With Wolfram’s move to Berkeley that brings our Cal contingent up to 3. I sure hope I don’t say anything to offend them.)

As we’ve talked about in other posts, the summer of 2012 might be considered the normal in a few decades, but not now. And some recent work from Justin Sheffield and colleagues in Nature argues that drought trends globally, and in North America, are not significantly positive if calculated properly (which contradicts some earlier work). We can leave aside for now the question of whether soil moisture trends are the best measure of drought exposure if one cares about corn yields (though a good topic for a future post), and simply say that conditions in 2012 were well below trend.

This means we’d expect next year to be closer to the trend, and that seems to be the overriding sentiment of markets. As Darrel Good over at farmdoc daily explains, “In the past five decades, extreme drought conditions in the U.S., like those experienced in 2012, have been followed by generally favorable growing conditions and yields near trend values.”

But two things work against this tendency to revert to the mean. First, the drought still persists throughout much of the country, as seen at UNL’s drought monitor site. As Good goes on to say, “current dry soil moisture conditions in much of the U.S. and some recent forecasts that drought conditions could persist well into next year have raised concerns that such a rebound in yields may not occur in 2013.” In other words, if the Corn Belt does not get a wet winter and/or spring, expect prices to start climbing again.

Second, though, is that good initial moisture does not eliminate the chance of drought during the season. There’s an interesting piece by folks at the National Climatic Data Center (NCDC) in the AGU newsletter I got today (it was actually published Nov. 20, but it takes about 3 weeks for me to get it!). They note that the 2012 drought was not like previous droughts in the 1930s and 1950s, or even 1988, in that it was very much driven by high temperatures rather than low starting moisture. As they say:

“For example, at the end of February in both 2011 and 2012 the national PDSI (calculated using the observed monthly mean temperature and precipitation averaged across the contiguous United States) was 1.2 (mildly wet) and –2.5 (moderate drought), respectively, compared to 1934 and 1954 of –5.7 and –4.6, respectively.”

This is also shown pretty effectively in an animation by Climate Central. So the high temperatures in recent years have made drought come on much more quickly than usual. As the NCDC piece says, “By the end of September, every month since June 2011 had above normal average temperatures, a record that is unprecedented.” That's 16 straight months of above normal temperatures!

So my seat-of-the-pants guess is that next year’s yields are likely to still be below trend line (which would be at around 160 bushels/acre). Obviously lots of things could push it above trend line (including changing the definition of the trend!), and it’s way too early to have much confidence about how 2013 will end up. But following Sol’s lead on the Sandy damage prediction, I’ll go out on a limb (his mean was too low by a factor of two, but the true damage was within the confidence interval!). And to pair a risky bet with a safe one, I’ll also predict Stanford wins big at the Rose Bowl.

Sunday, December 9, 2012

Climate, food prices, social conflict and....Google Hangout?

My coauthor Kyle Meng was asked to participate in this HuffPost Live discussion about climate, food prices and civil conflict. It's an interesting discussion, which gets pretty rowdy at times, with an eclectic group. I am also very impressed by HP's leveraging of Google Hangout to produce a low-cost public, intellectual forum.

David has written about the food-price and conflict linkage before, and we've discussed the association between climate and conflict a few times here. In general, I don't think a linkage has been demonstrated conclusively with data, but that doesn't seem to get in the way of people referencing it.

The debate is interesting and entertaining, highlighting a few of the differences in how some policy-folk, economists and ecologists view these various ideas.

Kyle was asked to participate because he was an author of our 2011 Nature paper on ENSO and conflict. He also happens to be on the job market right now.

Thursday, December 6, 2012

Climate data and projections at your fingertips

Do you ever get jealous of Wolfram's pretty graphs on this blog or just want to know what March rainfall will look like in New Zealand at midcentury -- but you just don't have the time or energy to sort through all the various climate data sets or learn how to use GIS software?

Lucky for you, the Nature Conservancy has teamed up with scientists at the University of Washington and the University of Southern Mississippi to develop Climate Wizard, a graphical user interface available through your browser window that lets you surf real climate model projections and historical data for both the USA and the world. According to the website:

With ClimateWizard you can:

view historic temperature and rainfall maps for anywhere in the world

view state-of-the-art future predictions of temperature and rainfall around the world

view and download climate change maps in a few easy steps

ClimateWizard enables technical and non-technical audiences alike to access leading climate change information and visualize the impacts anywhere on Earth. The first generation of this web-based program allows the user to choose a state or country and both assess how climate has changed over time and to project what future changes are predicted to occur in a given area. ClimateWizard represents the first time ever the full range of climate history and impacts for a landscape have been brought together in a user-friendly format.

The data sets behind the pictures are well documented on the "about us" page, and the data in each map is easily exportable.

If this had come out four years ago, I probably could have shaved six months off of my PhD...

h/t Bob Kopp

Sunday, November 25, 2012

Apps or agriculture?

In one of the most emailed NYTimes articles from last week, it was reported that there are now more software engineers in the US (about a million) than there are farmers. A follow-up article on the "Silicon Prairie" suggests that a growing percentage of these software jobs are, ahem, cropping up in farming strongholds like Iowa and Kansas.

The articles point out that while building iPhone apps nobody wants isn't particularly lucrative, building apps that people do want can be (e.g. the guys who thought to launch birds at pigs pulled in over $100 million in revenue in 2011). And on the whole, getting a job as a software engineer appears to pay pretty well, at least judging by what private shuttle bus access to Silicon Valley has done to rents in the Mission neighborhood in SF where I live.

This transformation from agriculture to apps is what economists call the "structural transformation" of economies, in which they metamorphose from poorer, rural, and agrarian societies to societies that are rich(er), more urbanized, and more devoted to industry and services. You start out with everyone poor and farming, and you end with everyone wealthy and making apps for each other.

Understanding why and how this transition occurs is a central (and old) question in development economics, and one that remains crucial for economic policy in the developing world. A key debate is to what extent improvements in agricultural productivity help contribute to broader economic development. That is, do poor countries undergo the structural transformation because they have improved the productivity of their (majority) rural population, or do they undergo it because they have focused their attention and investments on other, seemingly more modern sectors?

Agricultural development and economic growth are clearly related. Below are some plots for African countries (left) and all countries (right) over the period 1961-2008 using annual data from the World Bank, with overall GDP growth on the y-axis and agricultural growth on the x-axis. On one level this relationship is mechanical: in countries where agriculture makes up a large percentage of total output, it has to be the case that growth in agricultural productivity increases overall output and makes people better off. But because agriculture is a relatively small share of total output in most countries outside of Africa, and the relationship is still relatively strong and highly significant using all countries in the world (right plot), there is probably something more interesting going on. But which way does the causal arrow go? Does an improving economy lead to agricultural growth, or is agriculture itself an important engine of that growth?
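The "mechanical" part of the relationship is just growth accounting: overall growth is the share-weighted average of sectoral growth rates. A quick sketch, with shares and growth rates invented purely for illustration:

```python
def overall_growth(ag_share, ag_growth, nonag_growth):
    """Growth of total output as the share-weighted sum of sector growth
    rates, using each sector's initial share of output."""
    return ag_share * ag_growth + (1.0 - ag_share) * nonag_growth


# Hypothetical economies: 4% agricultural growth, no growth elsewhere.
agrarian = overall_growth(ag_share=0.40, ag_growth=0.04, nonag_growth=0.0)
industrial = overall_growth(ag_share=0.05, ag_growth=0.04, nonag_growth=0.0)
```

When agriculture is a small share of output, its direct contribution to overall growth is correspondingly small, so a strong correlation in such countries has to come from somewhere other than this accounting identity.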

There are a few ways that agricultural growth could spur growth in other sectors. First, a more productive agriculture could provide stuff that other sectors need to grow: an agricultural surplus generates capital that can be invested in non-agricultural enterprises and frees up labor to work in them. Second, increased agricultural productivity means better-off farmers which means increased demand for non-agricultural stuff: rich(er) people spend a smaller and smaller percentage of their income on food, and have extra money to spend on a new widget or that Angry Birds app. Agricultural growth could harm other sectors too, however, by raising their costs for land and labor: increases in ag productivity will likely raise wages and land prices for everyone. Furthermore, if capital and labor are mobile, maybe non-agricultural sectors can get what they need from elsewhere.

A couple of recent papers try to shed some light on the causal relationship between agricultural development and broader economic performance. The basic empirical approach is familiar: find something that shifts around agricultural productivity in a plausibly random way, and see what this implies for economic performance outside agriculture. Finding that magical "exogenous" shifter is where the cleverness comes in.

In a new working paper, Hornbeck and Keskin study what happens when changes in technology (better pumps) suddenly allowed farmers in parts of the US Great Plains to tap the Ogallala Aquifer for irrigation. They show that access to the Ogallala greatly increased agricultural productivity in the counties where it could be accessed, and they then compare how the non-agricultural economy performed in these counties to its performance in nearby counties that were otherwise the same but that could not access the Aquifer. They show that agricultural productivity gains do not seem to have helped the long-run performance of either industry or services in these counties.

Hornbeck and Keskin conclude that "a large windfall gain in the agricultural sector does not appear to have encouraged broader economic development of the local non-agricultural economy", and that "public support of the agricultural sector does not appear to generate positive economic spillovers that might justify its distortionary impacts." Citing other work by Hornbeck and co-authors, they note that the opening of large manufacturing plants in the US appeared to have much larger local spillovers.

Call this support for the "apps not agriculture" view: if you want to improve outcomes, invest in something besides agriculture. You hear this a lot. Here, for instance, is econo-celebrity blogger Chris Blattman a couple weeks ago: "Helping poor women set up a market stall, or small farmers double their profits, is a good and noble goal.... But it is a humanitarian strategy, not a growth strategy. Real economic change will wait for industry."

So, apps not agriculture? A paper by Nunn and Qian published last year in the QJE, a top econ journal, paints a very different picture. They study what happened when European sailors brought potatoes back from the Americas to the "Old World" (i.e. Eastern Hemisphere) beginning in the 1500s and 1600s. They show that relative to what was being grown in much of the Old World at the time, growing potatoes offered a remarkable bang for the buck in terms of both calories and vitamins. Their empirical approach, similar to Hornbeck and Keskin's, compares how outcomes evolved in Old World areas that were able to adopt the potato to how they evolved in nearby areas that were not suitable for potato cultivation.

They find that potato-related increases in agricultural productivity explain a remarkable 25-30% of the (dramatic) increase in population and urbanization that these areas experienced between 1700 and 1900. That is, agricultural productivity improvements played a driving role in the ensuing structural transformation of these economies. This finding suggests a very different policy prescription: if you want to set the structural transformation in motion, you would do well to invest in agricultural productivity improvements. Call this the "from agriculture grows apps" view.

Which view is correct? And what explains the difference between the findings in these two papers? It seems to me that the Hornbeck and Keskin finding describes an economic setting (post WWII United States) that is very different from many rural agricultural settings of today's poor countries: trade and capital markets in the US were/are fairly well integrated, which would limit local demand spillovers (just buy that widget you want off Amazon) and limit the need for capital to come from local sources (borrow from Chase instead of your buddy down the street).

The setting in Nunn and Qian is perhaps more relevant for many of today's developing countries -- and in fact includes a lot of them -- but much of their data and anecdotes are from Europe and so the implications for, say, Africa are not automatic. Some equally rigorous studies on modern developing countries are going to be key for understanding how we get from agriculture to apps -- in particular, in understanding whether the insights from papers like Nunn and Qian hold for an increasingly globalized world. As in the two papers above, this is going to require some serious cleverness in identifying exogenous shifters of longer-term agricultural productivity.

(Concluding sidenote: much like March Madness, apps can have their own decidedly negative economic spillovers...)

Monday, November 12, 2012

Grow, Canada?

A quick post on an interesting article in Bloomberg last week about expansion of corn in Canada. I’ve been keeping an eye out for stories about possible adaptations in the wake of this summer’s poor corn harvest in the U.S. (or as Stephen Colbert called it – our shucking disaster). As we’ve discussed before on this blog, having crops migrate northward is a commonly cited adaptation response. In addition to being encouraged by the warming trends, the northward migration of corn could be helped by new varieties that have shorter cycles and better cold hardiness.

It’s certainly interesting to see the expansion of corn in areas like Alberta and Manitoba, or for that matter in North Dakota. But too often media stories, or even scientific studies, present the changes as de facto proof of adaptation. The question is not really if crops will move around – they always have moved around in the past, and will continue doing so in the future. And I don’t even think the key question is whether climate change will be an important factor in them moving around – the evidence is still slim on this question but the Canada transition is clearly one made easier by the climate trends. Instead, the most important issue is how much is gained by these adaptive moves relative to the overall impacts of climate trends.

Sol has been analyzing this in some detail for different regions around the world, so he will hopefully have some more to say with numbers to back it up. But let me just point to two relevant questions that were each alluded to in the Bloomberg article. First, is the scale of expansion large relative to the main zones of production? In the case of Canadian corn, the article mentions about 120,000 ha of corn area sown in three provinces of Canada (Manitoba, Saskatchewan, and Alberta). That is less than one-half of one percent of the U.S. corn area.

Second, what is being displaced by the crop expansion? In most cases, including Canada’s, corn is being grown by farmers that used to grow wheat or barley. So the gain in corn production is offset to a large degree by the loss in wheat production (although not completely, since corn typically produces more grain per hectare than wheat). Modeling studies of adaptation typically assume there is a net expansion of total cropland in cold areas, not just expansion of individual crops. Without net area expansion, it is hard to offset losses incurred at lower latitudes.

There is certainly an argument to be made that big expansions of net area will only be seen with more substantial shifts in climate, since only then will it pay to make the large capital investments to open up new areas for agriculture. But it’s also possible that the constraints on expanding into new areas (poor soils, lack of infrastructure, property rights, rules on foreign investment) are large enough that only a modest amount of expansion will happen.

The main point is that changes in crop area have to be judged not just by whether or not they happen, but by whether their impact is large enough to matter. For most readers of Bloomberg, “large enough to matter” may simply mean that it’s an opportunity to make a lot of money. But for those of us interested in global food supply and price dynamics, the scale of interest is much larger. A 25% drop in U.S. corn production leaves a big hole to fill – it’s not enough to drop a few shovels full of dirt in and call it a day.

Friday, November 9, 2012

Climate and Conflict in East Africa

Andrew Revkin asked what I thought about this recent PNAS article:

Climate variability and conflict risk in East Africa, 1990–2009

John O’Loughlin, Frank D. W. Witmer, Andrew M. Linke, Arlene Laing, Andrew Gettelman, and Jimy Dudhia

Abstract Recent studies concerning the possible relationship between climate trends and the risks of violent conflict have yielded contradictory results, partly because of choices of conflict measures and modeling design. In this study, we examine climate–conflict relationships using a geographically disaggregated approach. We consider the effects of climate change to be both local and national in character, and we use a conflict database that contains 16,359 individual geolocated violent events for East Africa from 1990 to 2009. Unlike previous studies that relied exclusively on political and economic controls, we analyze the many geographical factors that have been shown to be important in understanding the distribution and causes of violence while also considering yearly and country fixed effects. For our main climate indicators at gridded 1° resolution (∼100 km), wetter deviations from the precipitation norms decrease the risk of violence, whereas drier and normal periods show no effects. The relationship between temperature and conflict shows that much warmer than normal temperatures raise the risk of violence, whereas average and cooler temperatures have no effect. These precipitation and temperature effects are statistically significant but have modest influence in terms of predictive power in a model with political, economic, and physical geographic predictors. Large variations in the climate–conflict relationships are evident between the nine countries of the study region and across time periods.

Here is my full reply:

This is a useful paper, and the results are important, but the framing by the authors and the press coverage (which I suppose the authors guide) is strange.

The authors report that months with hotter and drier conditions have much more violence. And the size of this effect is *large*: rates of violence climb 29.6% when temperatures are 2 degrees higher than normal, and 30.3% when rainfall declines from wet to normal/dry (a 2 standard deviation change). These numbers are really big! To get a sense of scale, shifting from a violent hot/dry anomaly to a more peaceful cool/wet anomaly decreases violence by more than 50%, which is similar to the reduction in crime that NYC felt during Rudy Giuliani's tenure! (see here). Whether or not you think Giuliani was responsible for that decline, it is widely recognized that the reduction in NYC violence on that scale had a large effect on the welfare of New Yorkers.

Now, while these effects are large, they are not new. Our 2011 Nature paper found almost the identical result. And the 2009 PNAS paper by Burke et al. also reported the same sized effects. Burke et al. was at the national scale, and our paper was at the global scale, so I think that the main contribution of O'Loughlin et al. is to demonstrate that the findings of those two earlier papers continue to hold up at the local scale.

What I find surprising about the paper's presentation and the press coverage is that it looks like the authors are trying to bury this finding within the paper by suggesting that these effects are small or unimportant. I don't know why (but it certainly has the feel of Nils Petter Gleditsch's attempt to bury similar findings in his 2012 Special Issue of the Journal of Peace Research here). Had I found these results, I would be putting them front and center in the article rather than reporting them in the text on page 3.

The main way that these authors try to downplay their findings is to argue that temperature and precip anomalies don't have a lot of predictive power compared to "other variables", but this is a red herring. The comparison the authors make is not an apples-to-apples comparison. The statistical model the team uses has two dimensions, time and space. In modeling violence, the team first tries to model *where* violence will occur to get a location-specific baseline. Then, conditional on a location's baseline level of violence, they model *when* violence in a specific location is higher or lower than this baseline. Their main finding has to do with the "when" part of the question, showing that violence within a specific location fluctuates in time, reflecting temperature and precip anomalies. But then they go on to compare whether they can predict the timing or location of violence better, which is not a useful exercise. They conclude that location variables like "population density" or "capital city" are much stronger predictors of violence than timing variables like "temperature" or "presidential election", but spatial variation and temporal variation in violence are completely different in magnitude and dynamics, so it is unclear what this comparison tells us. The authors argue that this tells them that doublings of violence brought on by hot/dry conditions are only a "modest" effect, but this claim doesn't have any statistical foundation and doesn't jibe with common sense.

To see why, consider the NYC/Giuliani example. If we ran the O'Loughlin et al. model for New York State during 1980-2010 with variables describing "population density" and a trend (to capture the decline in the 1990s), we'd see that on average higher population densities would have higher crime (i.e. NYC has more crime than small towns in Upstate New York) and there is a fall in violence by 50% in the 1990s. But if one used the same measures as O'Loughlin to ask whether location or trend was "more important", we'd find that location is a much more "important" predictor of violence because the difference between violence in NYC and the rest of the state is much more dramatic than the 50% decline within NYC during the 1990s. Using this logic, we would conclude that the 50% decline in the 1990s was "modest" or "not important," but anyone who's been living in NYC for the last couple of decades will say that conclusion is nuts. Halving violence is a big deal. One reason this statement doesn't make a lot of sense is that many rural locations have very low crime rates (in part because they don't have many people) so violence could double, triple or increase by a factor of ten and those changes would be trivial (in terms of total violent events) compared to a 50% change in NYC. The fallacy of these kinds of apples-to-oranges comparisons (mixing "where" with "when" variables) is why using "goodness of fit" statistics to assess "importance" doesn't make sense when working with data that covers both time and space. An aside: this mistake also shows up as one of the critical errors made by Buhaug in his 2010 PNAS article that got a lot of press and is widely misunderstood and misinterpreted.

So in short: The paper is important and useful because the authors confirm at the local-scale previously discovered large-scale results linking climatic events and violence. But their conclusion that these effects are "modest" is based on an incorrect interpretation of goodness-of-fit statistics.
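
The "where" versus "when" point in the reply can be illustrated with a toy simulation. This is only a sketch on made-up numbers: the city and town baselines, the 1995 break year, and the use of a simple R-squared on group means are all assumptions chosen for illustration, not the data or the measures used by O'Loughlin et al.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1980, 2011)

# Hypothetical baselines (annual violent events): one big city, 50 small towns.
baseline = np.concatenate(([10000.0], np.full(50, 20.0)))
# Hypothetical "when" effect: violence halves everywhere from 1995 onward.
time_factor = np.where(years >= 1995, 0.5, 1.0)

# Location-by-year panel with a little multiplicative noise.
events = baseline[:, None] * time_factor[None, :]
events = events * rng.lognormal(0.0, 0.05, size=events.shape)

def r_squared(y, fitted):
    """Share of total variance in y accounted for by the fitted values."""
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# "Where" model: predict every observation with its location's mean.
where_fit = np.broadcast_to(events.mean(axis=1, keepdims=True), events.shape)
# "When" model: predict every observation with its year's mean.
when_fit = np.broadcast_to(events.mean(axis=0, keepdims=True), events.shape)

r2_where = r_squared(events, where_fit)
r2_when = r_squared(events, when_fit)

# Ratio of average statewide violence after the break to before it.
totals = events.sum(axis=0)
drop_ratio = totals[years >= 1995].mean() / totals[years < 1995].mean()

print(f"R2 where: {r2_where:.3f}  R2 when: {r2_when:.3f}  post/pre: {drop_ratio:.2f}")
```

In this toy panel the location means account for most of the variance and the year means for almost none, even though total violence is cut roughly in half. The goodness-of-fit ranking says nothing about whether a halving matters.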

In a follow-up email, I sent him a graph that he ended up posting. This is the full explanation:

Marshall Burke and I were looking at the data from this paper and put together this plot (attached). It displays the main result from the paper, but more clearly than any of the figures or language in the paper does (we think). Basically, relatively small changes in temperature lead to really large changes in violence, illustrating my earlier point that this study finds a very large effect of climate on violence in East Africa.


About the plot:
The thin white line is the average response of violence to temperature fluctuations after rainfall, location, season and trend effects are all removed. The red "ink" indicates the probability that the true regression line is at a given location, with more ink reflecting a higher probability (more certainty). There's a fixed amount of "ink" at each temperature value; it just appears lighter if the range of possible values is more spread out (less certain). We estimate the amount of uncertainty in the regression by randomly resampling the data 500 times, re-estimating the regression each time and looking at how much the results change (so-called "bootstrapping"). This "watercolor regression" is a new way of displaying uncertainty in these kinds of results; I describe it in more detail here. A similar and related plot showing the relationship between rape and temperature is here.
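
The resampling step described above can be sketched in a few lines. This is a minimal illustration, not the code behind the plot: the data are synthetic, the quadratic fit is an assumed stand-in for the actual regression (which also removes rainfall, location, season and trend effects), and the function name `bootstrap_fits` is made up for this example.

```python
import numpy as np

def bootstrap_fits(x, y, grid, n_boot=500, degree=2, seed=0):
    """Refit a polynomial on resampled data; return one fitted curve per resample."""
    rng = np.random.default_rng(seed)
    n = len(x)
    curves = np.empty((n_boot, len(grid)))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)           # draw n rows with replacement
        coefs = np.polyfit(x[idx], y[idx], degree)  # re-estimate the regression
        curves[b] = np.polyval(coefs, grid)         # evaluate it on a common grid
    return curves

# Synthetic, made-up data: a temperature anomaly and a noisy violence index.
rng = np.random.default_rng(42)
temp_anom = rng.normal(0.0, 1.0, size=400)
violence = 1.0 + 0.15 * temp_anom + rng.normal(0.0, 0.5, size=400)

grid = np.linspace(-2.0, 2.0, 21)
curves = bootstrap_fits(temp_anom, violence, grid)

# Spread of the 500 curves at each grid point: the wider the spread,
# the lighter the "ink" would appear at that temperature value.
center = curves.mean(axis=0)
lower, upper = np.percentile(curves, [2.5, 97.5], axis=0)
print(f"band width at 0: {upper[10] - lower[10]:.3f}, at -2: {upper[0] - lower[0]:.3f}")
```

The band is narrowest near the middle of the data and widens toward the extremes, where fewer observations pin down the curve. That is the pattern the ink density is meant to convey.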
