The next person who says "big data" puts a fiver in the "most overused terms in meetings in the year 2014" jar. I am excited about the opportunities of ever larger micro datasets, but even more thrilled by how much thought is going into the visualization of these datasets. One of my favorite macho nerd blogs, Gizmodo, just put up a number of the best data visualizations of 2014. If you also think that these are oh so purdy, come take Sol's class at GSPP, where he will teach you how to make graphs worthy of this brave new world.
Monday, December 22, 2014
Predictions, paradigms, and paradoxes
Failure can be good. I don’t mean in the “learn from your
mistakes” kind of way. Or in the “fail year after year to even put up a good
fight in the Big
Game” kind of way. But failure is often the sidekick of innovation and
risk-taking. In places with lots of innovation, like Silicon Valley, failure
doesn’t have the stigma it has in other places. Because people understand that
being new and different is the best way to innovate, but also the best way to
fail.
Another area where failure can be really useful is in making
predictions. Not that bad predictions are useful in themselves, but they can be
useful if they are bad in different ways than other predictions. Then averaging
predictions together can result in something quite good. The same way that a
chorus can sound good even if each person is singing off key.
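To make the chorus idea concrete, here is a minimal simulation sketch. Everything in it is hypothetical: the ensemble size, the noise level, and above all the assumption that each model's errors are independent of the others (real crop models share data and structure, so their errors are correlated and the gain would be smaller). It compares the error of one arbitrary model with the error of the multi-model median.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

N_MODELS = 30    # hypothetical ensemble size
N_CASES = 2000   # hypothetical number of quantities to predict
NOISE_SD = 1.0   # assumed spread of every model's error


def rmse(errors):
    """Root-mean-square error of a list of errors."""
    return (sum(e * e for e in errors) / len(errors)) ** 0.5


single_model_errors = []  # errors of one arbitrary model (the first one)
ensemble_errors = []      # errors of the multi-model median

for _ in range(N_CASES):
    truth = random.uniform(2.0, 8.0)  # made-up "true" value
    # Each model is off in its own way: truth plus independent noise.
    preds = [truth + random.gauss(0.0, NOISE_SD) for _ in range(N_MODELS)]
    single_model_errors.append(preds[0] - truth)
    ensemble_errors.append(statistics.median(preds) - truth)

single_rmse = rmse(single_model_errors)
ensemble_rmse = rmse(ensemble_errors)
print(f"single model RMSE:    {single_rmse:.2f}")
print(f"ensemble median RMSE: {ensemble_rmse:.2f}")
```

Under these assumptions the median's error is a fraction of any single model's, because errors in different directions largely cancel. If all models were wrong in the same way, averaging would not help.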
We have a paper
out today that provides another example of this, in the context of
predicting wheat yield responses to warming. By “we” I mean a group of authors
that contributed 30 different models (29 “process-based” models and 1
statistical model by yours truly), led by Senthold Asseng at U Florida. I think
a few of the results are pretty interesting. For starters, the study used previously
unpublished data that includes experiments with artificial heating, so the
models had to predict not only average yields but responses to warming when
everything else was held constant. Also, the study was designed in phases so
that models were first asked to make predictions knowing only the sow date and
weather data for each experiment. Then they were allowed to calibrate to
phenology (flowering and maturity dates), then to some of the yield data. Not
only was the multi-model median better than any individual model, but it didn’t
really improve much as the models were allowed to calibrate more closely (although
the individual model predictions themselves improved). This suggests that using
lots of models makes it much less important to have each model try as hard as
it can to be right.
A couple of other aspects were particularly interesting to
me. One was how well the one statistical model did relative to the others (but
you’ll have to dig in the supplement to see that). Another was that, when the
multi-model median was used to look at responses to both past and projected
warming at 30 sites around the world, 20 of the 30 sites showed negative
impacts of past warming (for 1981-2010, see figure below). That agrees well with
our high level statement in the IPCC report that negative impacts of warming
have been more common than positive ones. (I actually pushed to add this
analysis after I got grilled at the plenary meeting on our statement being
based on too few data points. I guess this is an example of policy-makers feeding
back to science).
Add this recent paper to a bunch of others showing how
multi-model medians perform well (like here,
here,
and here),
and I think the paradigm for prediction in the cropping world has shifted to
using at least 5 models, probably more. So what’s the paradox part? From my
perspective, the main one is that there’s not a whole lot of incentive for
modeling groups to participate in these projects. Of course there’s the benefit
of access to new datasets, and insights into processes that can be had by
comparing models. But they take a lot of time, groups generally are not
receiving funds to participate, there is not much intellectual innovation to be
found in running a bunch of simulations and handing them off, and the resulting
publications have so many authors that most of them get very little credit. In
short, I was happy to participate, especially since I had a wheat model ready
from a previous paper, but it was a non-trivial amount of work and I don’t
think I could advise one of my students to get heavily involved.
So here’s the situation: making accurate predictions is one
of the loftiest goals of science, but the incentives to pursue the best way to make predictions (multi-model ensembles) are slim in most
fields. The main areas I know of with long-standing examples of predictions
using large ensembles are weather (including seasonal forecasts) and political
elections. In both cases the individual models are run by agencies or polling
groups with specific tasks, not scientists trying to push new research frontiers.
In the long-term, I’d guess the benefit of better crop predictions on seasonal
to multi-decadal time scales would probably be worth the investment by USDA and
their counterparts around the world to operationalize multi-model approaches. But relying on the existing incentives for
research scientists doesn’t seem like a sustainable model.
Wednesday, December 10, 2014
Adapting to extreme heat
Since we are nearing the holidays, I figured I should write something a bit more cheerful and encouraging than my standard line on how we are all going to starve. My coauthor Michael Roberts and I have emphasized for a while the detrimental effect of extreme heat on corn yields and the implications for a warming planet. When we looked at the sensitivity to extreme heat over time, we found an improvement (i.e., less susceptibility) roughly around the time hybrids were introduced in the 1930s, but that improvement vanished again around the 1960s. Corn is as susceptible to heat now as it was in 1930. Our study simply allowed the effect of extreme heat to vary smoothly across time, but wasn't tied to a particular event.
David Popp has been working a lot on innovation, and he suggested looking at the effect of hybrid corn adoption on the sensitivity to extreme heat in more detail. Richard Sutch had a nice article on how hybrid corn was adopted slowly across states, but fairly quickly within each state. David and I thought we could use the fairly rapid rollout within states but slow rollout across states as a source of identification for how hybrid corn changed the sensitivity to extreme heat. Here's a new graph of the rollout by state:
The first step was to extend the daily weather data back to 1901 to take a look at the effect of extreme heat on corn yields over time. We wanted a pre-period before hybrids were introduced, and we needed to rule out that crappy weather data in the early 1900s results in a lot of attenuation bias. Reassuringly, we get significant results with comparable coefficients when we use only data from the first three decades of the 20th century.
In a second step we interact the weather variables with the fraction of the planted area that is hybrid corn. We find evidence that the introduction of hybrid corn reduces the sensitivity to extreme heat from -0.53 to -0.33 in the most flexible specification in column (3b) below, which is an almost 40% reduction. Furthermore, the sensitivity to precipitation fluctuations seems to diminish as well. (Disclaimer: these are new results, so they might change a bit once I get rid of my coding errors).
The table regresses state-level yields in 41 states on weather outcomes. All regressions include state fixed effects as well as quadratic time trends. Columns (b) furthermore include year fixed effects to pick up common shocks (e.g., global corn prices). Columns (1a)-(1b) replicate the standard regression equation we have been estimating before, columns (2a)-(2b) allow the effect of extreme heat to change with the fraction of hybrid corn that is planted, while columns (3a)-(3b) allow the effect of all four weather variables to change with the fraction of hybrid corn.
In summary: there is evidence that, at least for the time period when hybrid corn was adopted, innovation in crop varieties led to an improvement in heat tolerance, which would be extremely useful as climate change is increasing the frequency of these harmful temperatures. On that (slightly more upbeat) note: happy holidays.
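As a quick check on the "almost 40%" figure, here is a small sketch of the arithmetic. It assumes that -0.53 is the extreme-heat coefficient when no hybrid corn is planted, that -0.33 is the coefficient at full adoption, and that the interaction is linear in the hybrid share. The function name `heat_sensitivity` is made up for illustration.

```python
# Coefficients quoted in the post for column (3b); their interpretation as the
# zero-adoption and full-adoption endpoints is an assumption of this sketch.
BETA_HEAT_NO_HYBRID = -0.53
BETA_HEAT_ALL_HYBRID = -0.33

# Implied coefficient on (extreme heat x hybrid share) under a linear interaction.
beta_interaction = BETA_HEAT_ALL_HYBRID - BETA_HEAT_NO_HYBRID


def heat_sensitivity(hybrid_share):
    """Extreme-heat coefficient at a given hybrid share between 0 and 1."""
    if not 0.0 <= hybrid_share <= 1.0:
        raise ValueError("hybrid_share must be between 0 and 1")
    return BETA_HEAT_NO_HYBRID + beta_interaction * hybrid_share


# Share of the original sensitivity that disappears at full adoption.
relative_reduction = 1 - BETA_HEAT_ALL_HYBRID / BETA_HEAT_NO_HYBRID

print(f"interaction coefficient: {beta_interaction:+.2f}")
print(f"sensitivity at 50% hybrid: {heat_sensitivity(0.5):.2f}")
print(f"relative reduction: {relative_reduction:.1%}")
```

The reduction works out to a bit under 38%, which is where "almost 40%" comes from.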
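For readers who prefer notation, the most flexible specification in columns (3a)-(3b) might be written roughly as below. This is only a sketch under assumptions, not the exact estimating equation: the symbols are introduced here for illustration, the post does not say whether yields enter in logs or whether the quadratic time trends are state-specific (they are written as state-specific here), and the four weather variables are left generic as $w_{k,st}$.

```latex
y_{st} = \alpha_s + \theta_{1s}\, t + \theta_{2s}\, t^2
       + \sum_{k=1}^{4} \left( \beta_k + \gamma_k\, h_{st} \right) w_{k,st}
       + \lambda_t + \varepsilon_{st}
```

Here $y_{st}$ is yield in state $s$ and year $t$, $h_{st}$ is the fraction of planted area in hybrid corn, $\alpha_s$ are state fixed effects, and $\lambda_t$ are the year fixed effects that appear only in columns (b). In this notation, columns (1a)-(1b) correspond to setting every $\gamma_k = 0$, and columns (2a)-(2b) to letting only the $\gamma_k$ on extreme heat differ from zero.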