Thursday, May 22, 2014

Regressing to the mean

As a general rule, people like to take credit when good things happen to them, but blame bad luck when things go wrong. This probably helps us all get through the day, feeling better about ourselves and other people we like. For example, I’d like to think that the paper rejection I got last month was bad luck, while the award I got last year was totally deserved. But anyone who pays attention to professional sports, or the stock market, knows that success in any single day or even year has a lot to do with luck. It’s not that luck is all you need, but it often makes the difference between very evenly-matched competitors. That’s why people or teams who perform particularly well in one year tend to drop back the next, a.k.a. regressing to the mean.

Take tennis. There’s clearly a skill separation between professionals and amateurs, and between the top four or five professionals and everyone else. But among the top, it’s hard to know who will win on any day, and it’s very hard to sustain a streak of victories against other top players. So even when someone like Rafael Nadal, who has some remarkable winning streaks, talks about how he got a few key bounces, he’s as much being an astute observer as a gracious winner.

Or take the stock market. It’s well known that even the best fund managers have a hard time outperforming the market for a long time. Even if someone has beaten it five years in a row, there’s a good chance it was just luck given how many fund managers are out there. I don’t tend to watch interviews with fund managers as much as athletes, but something tells me they might be a little less inclined to turn down the credit.

So what about agriculture? It’s not a source of entertainment or income for most of us, so we don’t spend much time thinking about it, and you won’t find any posts on fivethirtyeight about it. But one thing that struck me early on working in agriculture is how farmers are just as prone to thinking they are above average as the rest of us. More specifically, if you ask why some fields around them don’t look as good, they will talk about how that farmer is lazy, has another job, doesn’t take care of his soil, etc.

As far as I can tell, this isn’t a purely academic question. Understanding how much farmers vary in their ability to consistently produce a good crop is important if you want to know the best ways of improving agricultural yields. If there really are a bunch of star performers out there, then letting them take over from the laggards by buying up their land, or training other farmers to use best-practices, could be a good source of growth in the next decade. For example, here’s a cool presentation about a new effort in India to have farmers spread videos of best practices through social networks.

There’s a fairly obvious but not perfect link to the idea of yield gaps. People who say that closing yield gaps is a big “low-hanging fruit” for yield improvement often have a vision of better agronomy being able to drastically raise yields. The link isn’t perfect, because it could be that even the best farmers in a region are underperforming, agronomically speaking, for instance if fertilizers are very expensive. And it could be that some farmers consistently out-perform not because of better management, but because they are endowed with better soils (though this can be sorted out partly by using data on soil properties). Even with these caveats, understanding how much truly better the “best” farmers are could help give a more realistic view of what could be achieved with improved agronomy.

The key here is the “how much” part of it. Nobody can argue that some farmers aren’t better than others, just as nobody can say that Warren Buffett isn’t better than an average fund manager. The question is whether this is a big opportunity, or if it’s best to focus efforts elsewhere. I’ve been trying to get a handle on the “how much” over the years by using satellites to track yields over time. I’ve used various ways of trying to display this in a simple way, but nothing was too satisfying. So let me give it another shot, based on some suggestions from Tony Fischer during a visit to Canberra.

The figures below show wheat yield anomalies (relative to average) for fields ranked in the top 10% for the first year we have satellite data (green line). Each panel is for a different area, and soils don’t vary too much within each area. Then we track those fields over the following years to see if those fields are able to consistently outperform their neighbors. Similarly we can follow fields in any other group, and I show both the bottom decile (0-10% in blue) and the fifth (40-50% in orange). The two horizontal lines show the mean yield anomaly in the first group for the year they ranked in the top, and their mean yield anomaly in all the other years. If it was all skill (or something else related to a place like the soil quality) the second line would be on top of the first. If it was all luck, the second line would be at zero.
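The decile-tracking logic above is easy to sketch with a toy model. The simulation below is purely illustrative (the skill and luck variances are made-up numbers, not estimates from our satellite data): each field gets a persistent component ("skill", or anything place-specific like soil) plus independent year-to-year luck, we rank fields on their first-year anomaly, and then check where the top decile's mean sits in all other years.

```python
import numpy as np

rng = np.random.default_rng(0)
n_fields, n_years = 2000, 8

# Hypothetical data-generating process: a persistent component
# (management skill, soil quality) plus independent yearly luck.
skill = rng.normal(0, 0.5, size=(n_fields, 1))
luck = rng.normal(0, 1.0, size=(n_fields, n_years))
anom = skill + luck  # yield anomalies relative to the regional mean

# Rank fields by their first-year anomaly and keep the top decile
top = np.argsort(anom[:, 0])[-n_fields // 10:]

first_year = anom[top, 0].mean()    # mean anomaly in the ranking year
later_years = anom[top, 1:].mean()  # mean anomaly in all other years

# If it were all luck, later_years would sit near zero; if all skill,
# it would sit on top of first_year. The ratio is the persistence.
print(round(later_years / first_year, 2))
```

With these made-up variances the top decile keeps only about a fifth of its initial anomaly, which shows how easily a one-year snapshot overstates persistent differences.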


So what’s the verdict? The top performers in the first year definitely show signs of regressing to the mean, as their mean yield drops much closer to the overall average in other years. Similarly, the worst performers “regress” back up toward the mean. But neither jumps all the way back to zero, which says that some of the yield differences are persistent. In the two left panels, the anomalies are a little more than one-third the size they were in the initial year. In the right panel, the anomalies are about half their original value, indicating relatively more persistence. That makes sense since we know the right panel is a region where some farmers consistently sow too late (see this paper for more detail).

So a naive look at any snapshot of performance over a year or two would be way too optimistic about how big the exploitable yield gap might be. It’s important to remember that performance tends to regress to the mean. At the same time, there are some consistent differences that amount to roughly 10% of average yields in these regions (where mean yield is around 5-6 tons/ha). And with satellites we can pinpoint where the laggards are and target studies on what might be constraining them. At least that's the idea behind some of our current work. 


Whether other regions would look much different from the three examples above, I really don’t know. But it shouldn’t be too hard to find out. With the current and upcoming group of satellites, we now have the ability to track performance on individual fields in an objective way, which should serve as a useful reality check on discussions of how to improve yields in the next decade. 

Tuesday, May 13, 2014

Miraculous growth in Africa?


There has been a lot of chatter over the last few years about the purported African growth "miracle" -- the stylized fact that African economies seem to have really picked up speed over the last decade.  I am not really certified to comment on whether something is a miracle (maybe that certification comes with the genius award?), but reviewing the broader patterns and determinants of recent African growth seemed like a useful exercise to this blogger, and hopefully will be as well to the vaunted G-FEED readership. 

Seems like there are two basic questions:

  1. How has Africa (and here we really mean Sub-Saharan Africa) performed relative to how it has in the past, or to how other countries have recently?
  2. What explains these patterns of performance?
To answer question #1, I pulled the most recent national accounts data on per capita GDP from the World Bank's World Development Indicators.  These are plotted below. On the left is how real GDP per capita has evolved in SSA over the last half century, and on the right is the growth rate in real GDP per capita for SSA (shown in red) as well as a number of other developing and developed regions.  

A few things are immediately apparent.  First, Africa does seem to have had a good decade, at least relative to the few before it. Real per capita GDP growth has averaged around 2% for the entire decade, and average per capita incomes across Sub-Saharan Africa are now higher than they have ever been in history. Second, though, Africa's recent performance does not appear to be a "miracle" relative to what other countries have accomplished, as can be seen by comparing recent African growth rates with growth in other developing regions in the right plot.  Nor is its performance an obvious miracle relative to what it has accomplished during past decades.  The continent has had growth surges before, for instance growing at about 2%/year in per capita terms for most of the 1960s and 1970s.  Indeed, per capita incomes only recovered their mid-1970s peak after about 2005, and are now only slightly higher than they were 30 years ago. 

Some people do not trust these national accounts data and have tried to use other data sources to get at the same question.  For instance, Alwyn Young uses wage and asset data from the Demographic and Health Surveys to reconstruct how consumption has changed over time, and concludes that the story is even better than what national accounts data would suggest (but these findings have been seriously questioned, for instance here).  Henderson, Storeygard, and Weil (2012), in a really cool recent paper, use satellite data on the brightness of nighttime lights as a proxy for economic activity.  Although they do not report a SSA-wide growth number, they show that GDP growth derived from night lights is substantially higher than national accounts estimates for many African countries (though not all of them), perhaps again suggesting that the national accounts data understate the improvements.

So the overall conclusion seems to be that while recent African performance is not off-the-charts good, either compared to past history on the continent or to other regions over the last few decades, the continent did seem to have a good decade. 

So on to question 2:  what explains the uptick in recent performance?  Here the rhetoric about miracles seems particularly unhelpful, given that it's not clear that we'd want Africa's recent performance to be a "miracle" in the Merriam-Webster sense of "an extraordinary event manifesting divine intervention in human affairs".  It would instead be nice if recent growth was a function of concerted improvements in some of the suspected constituents of growth:  institutions that protect and support political and economic rights, policies that help capital flow to high-return investments, long term investments in agriculture and education that have started to bear fruit, etc. 

Perhaps not surprisingly, it turns out there is still a lot of disagreement over why Africa has done better in the 2000s relative to the 1980s and 1990s. 

Here's the upbeat news from the consultancy McKinsey in an attractively titled report "Lions on the move": "Africa's economic pulse has quickened, infusing the continent with a new commercial vibrancy....  Africa's growth acceleration resulted from more than a resource boom.  Arguably more important were government actions to end political conflicts, improve macroeconomic conditions, and create better business climates, which enabled growth to accelerate broadly across countries and sectors."

Compare that to the view of long-time Africa hand Michael Lipton:  "Stop kidding ourselves. Faster GDP growth in Africa since 2000 is mainly a mining boom, with dubious benefits. Staples yields (and labour productivity) have not reversed the dismal trends that Peter Timmer diagnosed two decades ago: big, credible rises are seen in only a few African countries (e.g. Rwanda, Ghana). The populous ones (Ethiopia, Nigeria, maybe Kenya, above all DR Congo) tell a sad tale."

National accounts data can again be used to shed some light on this debate.  Below are a couple plots from a periodic Africa-focused World Bank report called "Africa's Pulse".  The top one shows how resource-rich countries performed relative to non-resource rich countries, and the bottom one shows the sectoral composition of growth in the fast-growing African countries.




These data seem to suggest a story in between the Lipton view and the McKinsey view:  growth was substantially higher in resource-rich countries (both oil and non-oil) relative to non resource rich countries, although as the report explains there were some notable performers in the latter group (Rwanda, Ethiopia). And as shown in the bottom plot, all sectors grew over the period since the mid 1990s.  For those interested in agriculture, there is also recent evidence that agricultural productivity was also on the rise in many, though not all, African countries;  total factor productivity (total outputs per total inputs) improved in Eastern and Southern Africa over the last few decades but was flat or declined in much of Central Africa, the Sahel, and the Horn. 

But just looking at broader sectoral patterns of growth does not really answer the fundamental question about what caused growth.  For instance, observed "broad based" patterns of growth -- i.e. the fact that resource sectors and non-resource sectors (services, manufacturing) both appeared to be growing -- could still be consistent with resource booms that increase demand for local non-tradeables.  Indeed, the fact that services grew much faster than manufacturing (and, with resources, grew as a proportion of the economy while ag and manufacturing shrank) is some evidence that this might be the case.  Was it just the case that higher mineral prices gave miners more money and they decided to buy more sandwiches and haircuts?  It's hard to know. 

Similarly, concluding that productivity growth was driven by a declining share of the labor force in agriculture, as argued by a recent NBER paper, does not necessarily tell us what the driver was of this declining share. The fact that most people that left farming ended up in the informal services sector is again not obvious evidence of broad-based or sustainable growth.

Maybe some readers know of better-identified studies of the sources of recent productivity growth in African economies?  In the absence of these studies, it seems right to conclude that Africa is doing better, but it seems pretty hard to know what to make of this improvement -- where it came from, and whether it might last.  

Thursday, May 1, 2014

The FBE Book and Borlaug’s 100th

A post a few months ago highlighted a new book coming out by Fischer, Byerlee, and Edmeades (FBE) on crop yield trends and food security. It has now been officially published and is available on this website. I definitely recommend it to anyone interested in the topic of food security or global crop yield prospects. They also have a lot of handy figures and tables and maps for teaching.

Also, while I'm on the topic of unsolicited reading suggestions, last month there was a symposium in Mexico to celebrate Borlaug’s 100th birthday. I was bummed that I had to decline an invitation to speak - one of the opportunity costs of working on the IPCC, which had its plenary approval session in Japan at the same time. But luckily they have posted all of the presentations from the meeting here. If you want a cliff notes version of some of FBE, both Fischer and Byerlee presented at the Borlaug meeting and their slides are up there too.

A flood of crop data, and a study on drought

As the growing season gets under way in the Midwest, we have a paper out today in Science on yield trends in the region, and the role of drought. The paper is here and a summary story here. (I'll also post a link asap to a pdf for those without access to Science here). There's also a very nice companion piece by Don Ort and Steve Long here.

The paper relates to this post a while back on yield sensitivity. The short story is that lots of things could be making crops more or less sensitive to weather, so it’s an interesting empirical question as to whether overall sensitivity is actually rising or falling. And it seems a useful question, at least in the context of estimating the costs of climate change, because if sensitivity is declining then it would mean impacts of climate change might be overstated (i.e. systems are adapting), whereas if sensitivity is growing it would mean impacts could be bigger for future cropping systems than the current ones (what’s the opposite of adapting? Is anti-adapting a word?).

The new paper shows that for corn in the US, sensitivity to drought appears to be rising. Yields now seem about twice as sensitive to hot, dry conditions as they were 20 years ago. It’s not that farmers get worse yields in drought years than they used to, but that they are making fast progress under better conditions. Progress has been much slower for hot, dry conditions, which we think may indicate farmers and their crops are pushing the limits of what’s possible in those years. Or at least what's possible with current technology (Ort and Long mention this paper about how changing canopy structure might help, and there are other ideas out there too. Maybe a topic for another day).
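The basic empirical idea behind "rising sensitivity" can be illustrated with a regression that interacts a drought indicator with a time trend. This is a stylized sketch, not the paper's actual specification or data: all the coefficients below are invented for the simulation, and a negative drought-by-year interaction is the signature of growing drought sensitivity.

```python
import numpy as np

rng = np.random.default_rng(2)

years = np.repeat(np.arange(20), 200)       # 20 seasons, 200 fields each
drought = rng.binomial(1, 0.2, years.size)  # 1 = hot, dry conditions

# Hypothetical data-generating process echoing the qualitative finding:
# yields trend up quickly in good years but slowly in drought years,
# so the drought penalty widens over time.
yields = (8 + 0.10 * years
          - drought * (1.0 + 0.05 * years)
          + rng.normal(0, 0.5, years.size))

# Regress yield on trend, drought, and drought x trend. A negative
# interaction coefficient means sensitivity to drought is rising.
X = np.column_stack([np.ones_like(years), years, drought, drought * years])
beta = np.linalg.lstsq(X, yields, rcond=None)[0]
print("drought x year coefficient:", round(beta[3], 3))
```

With enough fields per year, the interaction term recovers the widening gap between good-year and drought-year trends, which is the "anti-adapting" pattern described above.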

See the paper for more details. What I wanted to focus this post on was something that may get lost in the paper's and coverage's emphasis on the "big picture", which is how cool the dataset we worked with is. And it’s posted here or here for people to play with if they want. The animation below gives a sense for the yield data for three of the states over time (red=high yields, purple=low). There are over a million observations of corn and soy fields around the country since 1995, and associated weather variables. (We are not allowed to release anything that could be used to identify farmers, so the fields are random samples of a larger dataset and don’t have location information beyond the county level.)



The study also relates to this previous post on how cool the MARS approach to data analysis is. It played a very helpful role in this study, and it was surprising just how well it picked up effects of weather. The figure below shows the six factors identified as important, and the functional form estimated for each. The most important (consistent with previous work) is high vapor pressure deficits in the period of 61-90 days after sowing, which is around flowering for corn. The MARS model with just these six variables explained about 32% of the variance in the dataset (a number we somehow neglected to mention in the paper!). That’s an awful lot of variance for a field level dataset with all sorts of variation going on, including in soils and off-season weather. I’d be curious if any other simple model could do as well (the data are there if you want to have a go at it).  
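For readers unfamiliar with MARS (multivariate adaptive regression splines), its building block is the hinge function max(0, x - knot), which lets the model pick up threshold effects like "yields only suffer once vapor pressure deficit gets high enough." The sketch below is a bare-bones illustration on synthetic data, not the paper's dataset or the full MARS algorithm (which searches over many variables, knots, and interactions); here we just scan a knot grid for a single hinge.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in: a "VPD"-like variable with a threshold effect,
# where yield anomalies only drop once VPD exceeds a knot at 1.5.
vpd = rng.uniform(0, 3, 500)
yield_anom = -2.0 * np.maximum(0, vpd - 1.5) + rng.normal(0, 0.5, 500)

def hinge(x, knot):
    """MARS-style basis function: max(0, x - knot)."""
    return np.maximum(0.0, x - knot)

# MARS searches knot locations automatically; here we scan a grid and
# keep the knot whose single-hinge fit has the smallest squared error.
best = min(
    (np.linalg.lstsq(
        np.column_stack([np.ones_like(vpd), hinge(vpd, k)]),
        yield_anom, rcond=None)[1][0], k)
    for k in np.linspace(0.5, 2.5, 21)
)
print("best knot near:", best[1])
```

The recovered knot sits close to the true threshold, which is the kind of nonlinearity (flat, then declining) that shows up in the VPD effect described above.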


(Update 1: An interview I did is here, which are things I actually said, as opposed to what I'm sometimes quoted as saying based on a reporter's memory or attempt to jumble three things into one sentence.)