Thursday, May 22, 2014

Regressing to the mean

As a general rule, people like to take credit when good things happen to them, but blame bad luck when things go wrong. This probably helps us all get through the day, feeling better about ourselves and other people we like. For example, I’d like to think that the paper rejection I got last month was bad luck, while the award I got last year was totally deserved. But anyone who pays attention to professional sports, or the stock market, knows that success in any single day or even year has a lot to do with luck. It’s not that luck is all you need, but it often makes the difference between evenly matched competitors. That’s why people or teams who perform particularly well in one year tend to drop back the next, a.k.a. regressing to the mean.

Take tennis. There’s clearly a skill separation between professionals and amateurs, and between the top four or five professionals and everyone else. But among the top, it’s hard to know who will win on any day, and it’s very hard to sustain a streak of victories against other top players. So even when someone like Rafael Nadal, who has some remarkable winning streaks, talks about how he got a few key bounces, he’s being as much an astute observer as a gracious winner.

Or the stock market. It’s well known that even the best fund managers have a hard time outperforming the market for a long time. Even if someone has beaten it five years in a row, there’s a good chance it was just luck given how many fund managers are out there. I don’t tend to watch interviews with fund managers as much as athletes, but something tells me they might be a little less inclined to turn down the credit.
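To put rough numbers on the "just luck" point, here is a minimal sketch of the arithmetic. It assumes, purely for illustration, that each manager has an independent 50% chance of beating the market in any given year; the function names, the 50% figure, and the manager counts are all hypothetical rather than taken from real fund data.

```python
def prob_at_least_one_streak(n_managers, years=5, p_beat=0.5):
    """Chance that at least one of n_managers beats the market in every
    one of `years` consecutive years by luck alone (assumed independent)."""
    p_streak = p_beat ** years          # one manager's chance of a full streak
    return 1.0 - (1.0 - p_streak) ** n_managers


def expected_streaks(n_managers, years=5, p_beat=0.5):
    """Expected number of managers with a full streak under the same assumptions."""
    return n_managers * p_beat ** years


if __name__ == "__main__":
    for n in (1, 100, 1000):
        print(n, prob_at_least_one_streak(n), expected_streaks(n))
```

Under these assumptions a single manager has only about a 3% chance of a five-year streak, but with a thousand managers you would expect around thirty such streaks from coin flips alone.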

So what about agriculture? It’s not a source of entertainment or income for most of us, so we don’t spend much time thinking about it, and you won’t find any posts on FiveThirtyEight about it. But one thing that struck me early on working in agriculture is how farmers are just as prone to thinking they are above average as the rest of us. More specifically, if you ask why some fields around them don’t look as good, they will talk about how that farmer is lazy, has another job, doesn’t take care of his soil, etc.

As far as I can tell, this isn’t a purely academic question. Understanding how much farmers vary in their ability to consistently produce a good crop is important if you want to know the best ways of improving agricultural yields. If there really are a bunch of star performers out there, then letting them take over from the laggards by buying up their land, or training other farmers to use best practices, could be a good source of growth in the next decade. For example, here’s a cool presentation about a new effort in India to have farmers spread videos of best practices through social networks.

There’s a fairly obvious but not perfect link to the idea of yield gaps. People who say that closing yield gaps is a big “low-hanging fruit” for yield improvement often have a vision of better agronomy being able to drastically raise yields. The link isn’t perfect, because it could be that even the best farmers in a region are underperforming, agronomically speaking, for instance if fertilizers are very expensive. And it could be that some farmers consistently outperform not because of better management, but because they are endowed with better soils (though this can be sorted out partly by using data on soil properties). Even with these caveats, understanding how much better the “best” farmers truly are could help give a more realistic view of what could be achieved with improved agronomy.

The key here is the “how much” part of it. Nobody would deny that some farmers are better than others, just as nobody would deny that Warren Buffett is better than an average fund manager. The question is whether this is a big opportunity, or if it’s best to focus efforts elsewhere. I’ve been trying to get a handle on the “how much” over the years by using satellites to track yields over time. I’ve tried various ways of displaying this simply, but none was too satisfying. So let me give it another shot, based on some suggestions from Tony Fischer during a visit to Canberra.

The figures below show wheat yield anomalies (relative to average) for fields ranked in the top 10% for the first year we have satellite data (green line). Each panel is for a different area, and soils don’t vary too much within each area. We then track those fields over the following years to see if they are able to consistently outperform their neighbors. Similarly we can follow fields in any other group, and I show both the bottom decile (0-10%, in blue) and the fifth decile (40-50%, in orange). The two horizontal lines show the mean yield anomaly of the top group in the year they ranked in the top 10%, and their mean yield anomaly in all the other years. If it were all skill (or something else tied to a place, like soil quality) the second line would be on top of the first. If it were all luck, the second line would be at zero.
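For readers who want to try this on their own data, here is a minimal sketch of the top-decile comparison using NumPy. It is not the actual processing behind the figures: the yields are synthetic, and the function name, the split into a persistent "skill" term and a yearly "luck" term, and both standard deviations are assumptions made up for illustration.

```python
import numpy as np


def persistence_of_top_decile(anomalies):
    """anomalies: array of shape (n_fields, n_years), each field's yield
    minus the regional mean for that year.
    Returns (mean anomaly of the first-year top decile in the first year,
             mean anomaly of those same fields in all other years,
             ratio of the second to the first)."""
    first_year = anomalies[:, 0]
    cutoff = np.quantile(first_year, 0.9)   # threshold for the top 10%
    top = first_year >= cutoff              # fields ranked in the top decile
    initial = anomalies[top, 0].mean()
    later = anomalies[top, 1:].mean()
    return initial, later, later / initial


# Synthetic example: yield = regional mean + persistent effect + yearly noise.
rng = np.random.default_rng(0)
n_fields, n_years = 5000, 8
skill = rng.normal(0.0, 0.5, size=(n_fields, 1))       # assumed persistent part (t/ha)
luck = rng.normal(0.0, 0.7, size=(n_fields, n_years))  # assumed year-to-year part (t/ha)
yields = 5.5 + skill + luck
anomalies = yields - yields.mean(axis=0, keepdims=True)

initial, later, ratio = persistence_of_top_decile(anomalies)
print(initial, later, ratio)
```

A ratio near one would correspond to the "all skill" case and a ratio near zero to the "all luck" case. With the variances assumed here, the persistent part is about a third of the total variance, so the ratio should land at roughly one-third.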

So what’s the verdict? The top performers in the first year definitely show signs of regressing to the mean, as their mean yield drops much closer to the overall average in other years. Similarly, the worst performers “regress” back up toward the mean. But neither group jumps all the way back to zero, which says that some of the yield differences are persistent. In the two left panels, the anomalies are a little more than one-third the size they were in the initial year. In the right panel, the anomalies are about half their original value, indicating relatively more persistence. That makes sense since we know the right panel is a region where some farmers consistently sow too late (see this paper for more detail).

So a naive look at any snapshot of performance over a year or two would be way too optimistic about how big the exploitable yield gap might be. It’s important to remember that performance tends to regress to the mean. At the same time, there are some consistent differences that amount to roughly 10% of average yields in these regions (where mean yield is around 5-6 tons/ha). And with satellites we can pinpoint where the laggards are and target studies on what might be constraining them. At least that's the idea behind some of our current work. 

Whether other regions would look much different than the 3 examples above, I really don’t know. But it shouldn’t be too hard to find out. With the current and upcoming group of satellites, we now have the ability to track performance on individual fields in an objective way, which should serve as a useful reality check on discussions of how to improve yields in the next decade. 
