We’ve had a couple of papers come out that might be of
interest to readers of the blog. Both relate to using satellites to measure
crop yields for individual fields. This type of data is very hard to come by
from ground-based sources. In many places, especially poor ones, farmers simply
don’t keep records on production on a field by field basis. In other places,
like the U.S., individual farmers often have data, and government agencies or
private companies often have data for lots of individuals, but it’s typically
not made available to public researchers.
All of this is bad for science. The less data we have on
yields, the less we can understand how to improve yields or reduce
environmental impacts of crop production. In sports, we now know exactly how much a catcher’s glove moves each time they frame a pitch, exactly how well every NBA player shoots from every spot on the floor, and exactly how fast LeBron James
flops to the ground when he is touched by an opposing player. And all of this knowledge
has helped improve team strategies and player performance.
But for agriculture, with all the talk of “big data”, we are
still often shooting in the dark. And in some cases it’s getting worse: this recent
post at farmdoc daily shows how the farmer response rate to area and
production surveys is fading faster than my hairline (but not quite as fast as
Sol’s):
As we’ve discussed before on this blog (e.g., here and
here),
satellites offer a potential workaround. They won’t be perfect, but anyone who
has done phone or field surveys, or even crop cuts within fields, knows that
they have problems too. Plus, satellites are getting cheaper and better, and
it’s also becoming easier to work with past satellite data thanks to platforms
like Google Earth Engine.
Now to the studies. Both employ what we are calling SCYM,
which stands either for Scalable Crop Yield Mapper or Steph Curry’s Your MVP, depending
on the context. In the first context, the basic idea is to eliminate any
reliance on ground calibration by using simulations to calibrate regression
models.
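To make the idea concrete, here is a minimal sketch of calibrating a regression on simulated data alone and then applying it to satellite observations. It is not the SCYM implementation: the toy simulator, its coefficients, the single-predictor linear model, and names such as `toy_crop_simulations` are all assumptions made up for illustration.

```python
import numpy as np


def toy_crop_simulations(n, rng):
    """Hypothetical stand-in for crop model output: (peak GCVI, yield) pairs.

    All coefficients below are invented for illustration only.
    """
    lai = rng.uniform(1.0, 6.0, size=n)                      # simulated peak leaf area index
    gcvi = 1.2 * lai + rng.normal(0.0, 0.3, size=n)          # made-up LAI -> GCVI link
    yield_t_ha = 2.0 + 1.5 * lai + rng.normal(0.0, 0.5, size=n)  # made-up LAI -> yield link
    return gcvi, yield_t_ha


def calibrate(gcvi, yield_t_ha):
    """Fit yield = intercept + slope * gcvi using simulated pairs only."""
    slope, intercept = np.polyfit(gcvi, yield_t_ha, deg=1)
    return slope, intercept


def predict_yield(observed_gcvi, slope, intercept):
    """Apply the simulation-calibrated regression to observed pixel values."""
    return intercept + slope * np.asarray(observed_gcvi, dtype=float)


rng = np.random.default_rng(0)
sim_gcvi, sim_yield = toy_crop_simulations(5000, rng)
slope, intercept = calibrate(sim_gcvi, sim_yield)

# Hypothetical satellite-observed GCVI values for three pixels.
pixel_gcvi = np.array([2.0, 4.0, 6.0])
pixel_yield = predict_yield(pixel_gcvi, slope, intercept)
print(pixel_yield)
```

The point of the sketch is that no ground-measured yield enters the calibration step; ground data would only be needed to test the resulting estimates.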
The first study (joint with George Azzari) applied SCYM to
Landsat data in the Corn Belt in the U.S. for 2000-2015, and then analyzed the
resulting patterns and trends for maize and soybean yields. Here’s the link to
the paper, and a brief animation we
had made to accompany the story. Also, we’ve made the maps of average yields
available for people to visualize and inspect here.
Below is a figure that summarizes what I thought was the most interesting
finding: that maize yield distributions are getting wider over time.
Overall maize yields are still increasing, but mainly this
is driven by growth in high yielding areas. This is true at the scale of
counties, and also within individual fields. The paper discusses reasons why (e.g.
uptake of precision ag) and possible implications. Maybe I’ll return to it in a
future post. A short side note: since that paper was written, we’ve
expanded the dataset to a nine-state area and made some useful tweaks to SCYM to
improve performance.
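One simple way to check whether a yield distribution is widening is to fit a separate linear trend to a low, middle, and high percentile of yields across years. The sketch below does that on synthetic data; the array layout, the percentile choices, and the function name `percentile_trends` are assumptions for illustration, not the analysis used in the paper.

```python
import numpy as np


def percentile_trends(years, yields, percentiles=(5, 50, 95)):
    """Linear trend (yield units per year) of each percentile.

    years:  shape (n_years,)
    yields: shape (n_years, n_fields), one row of field yields per year
    """
    years = np.asarray(years, dtype=float)
    # Percentile of the field distribution within each year: (n_pct, n_years).
    pcts = np.percentile(yields, percentiles, axis=1)
    # Center the years for numerical stability; the slope is unaffected.
    x = years - years.mean()
    slopes = np.polyfit(x, pcts.T, deg=1)[0]
    return {p: float(s) for p, s in zip(percentiles, slopes)}


# Synthetic example: fields that start higher also grow faster.
years = np.arange(2000, 2016)
base = np.linspace(6.0, 12.0, 101)       # made-up starting yields, t/ha
growth = np.linspace(0.02, 0.20, 101)    # made-up growth rates, t/ha per year
yields = base[None, :] + growth[None, :] * (years - years[0])[:, None]

trends = percentile_trends(years, yields)
print(trends)
```

A steeper trend at the 95th percentile than at the 5th is what a widening distribution looks like in this framing.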
The second paper (joint
with Marshall) looks at yield mapping with some of the newer high-res sensors
in smallholder systems. In this case, we looked at two years of data in Western
Kenya, using detailed surveys with hundreds of farmers to test our yield
estimates, which were derived using 1 m resolution imagery from Terra Bella. The
results were pretty good, especially when you consider only fields above half
an acre, where problems with self-reported yields are not as big. Just like in the U.S., the satellites reveal a lot of yield heterogeneity both within and between fields (the top image shows the true-color image, the bottom shows the derived yield estimates for maize fields):
One interesting result is that in many ways it looks like the satellite estimates
are no less accurate than the (expensive) field surveys. For example, when we correlate
yields with inputs that we think should affect yields, like fertilizer or seed
density, we see roughly equal correlations when using satellite or self-report
yields.
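The comparison described above amounts to computing the same correlation twice, once per yield measure. Here is a minimal sketch on synthetic data; the noise levels, the fertilizer response, and the function name `input_correlations` are invented for illustration and do not reproduce the survey data or the paper's numbers.

```python
import numpy as np


def input_correlations(inputs, yield_estimates):
    """Pearson correlation between one input and each yield measure.

    yield_estimates: dict mapping a measure name to an array of yields.
    """
    return {
        name: float(np.corrcoef(inputs, y)[0, 1])
        for name, y in yield_estimates.items()
    }


rng = np.random.default_rng(1)
n = 300
fertilizer = rng.uniform(0.0, 100.0, size=n)   # made-up input rates, kg/ha
true_yield = 1.0 + 0.02 * fertilizer + rng.normal(0.0, 0.5, size=n)

# Two imperfect measurements of the same unobserved yield (assumed noise levels).
satellite = true_yield + rng.normal(0.0, 0.6, size=n)
self_report = true_yield + rng.normal(0.0, 0.6, size=n)

r = input_correlations(
    fertilizer, {"satellite": satellite, "self_report": self_report}
)
print(r)
```

If the two measures carry similar amounts of error, their correlations with inputs should come out similar, which is the logic behind reading roughly equal correlations as roughly equal accuracy.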
None of this is to say that satellite-based yields are
perfect. But that was never the goal. The goal is to get insights into what
factors are (or aren’t) important for yields in different settings. Both
studies show how satellites can provide insights that match or even go beyond
what is possible with traditional yield measures. We hope to continue to test and
apply these approaches in new settings, and plan to make the datasets available
to researchers, probably at this site.
Thank you for this excellent post! I have a question: why are you using GCVI instead of WDRVI for relatively high LAI values?
Thanks for the comment. It was mainly because GCVI captures some effects of different chlorophyll content, which is useful in Africa. In the U.S. we originally used WDRVI but switched to GCVI, which worked slightly better.
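For readers unfamiliar with the two indices, here is a short sketch of their standard definitions: GCVI is NIR/green minus 1, and WDRVI is (α·NIR − red)/(α·NIR + red), where the weighting coefficient α is commonly set between 0.1 and 0.2. The reflectance values and the default `alpha=0.2` below are illustrative assumptions, not the settings used in either study.

```python
import numpy as np


def gcvi(nir, green):
    """Green chlorophyll vegetation index: NIR / green - 1."""
    return np.asarray(nir, dtype=float) / np.asarray(green, dtype=float) - 1.0


def wdrvi(nir, red, alpha=0.2):
    """Wide dynamic range vegetation index with weighting coefficient alpha."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (alpha * nir - red) / (alpha * nir + red)


# Made-up surface reflectances for a dense canopy pixel.
nir, green, red = 0.5, 0.1, 0.05
print(gcvi(nir, green), wdrvi(nir, red))
```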