I try not to use this blog much for rants, but they’re an easy
way to post more frequently. So… we’ve been getting some feedback on our recent paper on drought stress in U.S. maize, some of it positive, some not. One thing
that comes up, as with a lot of prior work, is doubt about how important
temperature is. Agronomists often talk about how important the timing of
rainfall is. And about how heat doesn’t hurt nearly as much if soils are very moist.
To me this is another way of saying (1) other factors matter too and (2) there are interactions between temperature and other factors. The answer to both of
these is “Of course!”
I am struck by the similarity of these discussions to those
that Marshall posted about the empirical work on conflict. They go something
like this:
Person A: “We’ve looked at the data and see a clear response: people tend to wake up when you turn the lights on.”
Person B: “But people also wake up because they went to bed early last night and aren’t tired any more.”
A: “Yeah, ok”
B: “And the lights probably won’t wake them up if they are passed out drunk.”
A: “Ok, great point”
B: “I’ve woken up thousands of times in my 30-year career, and rarely was it because someone turned the light on.”
A: “Ok”
B: “So then how can you possibly claim that turning on lights causes people to wake up?”
A: “What? Are you serious? Did you even read the paper?”
B: “I don’t need to, I am an expert on waking up.”
And so on. I sometimes don’t know whether people seriously
don’t understand the difference between explaining some vs. all of the
variance, or if they just look for any opportunity to plug their own area of
expertise. When we claim to see a clear signal in the data, it is not a claim
that there is no noise around that signal. And some of the noise might include interactions with other variables. In fact, if there weren’t any noise then the signal would have been known long ago.
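To make the "some vs. all of the variance" point concrete, here is a minimal sketch using simulated data. Everything in it is an assumption for illustration: the true slope of -2.0, the noise level, the sample size, and the helper names `simulate` and `fit_line` are invented and do not come from the paper.

```python
import random

def simulate(n=5000, slope=-2.0, noise_sd=10.0, seed=0):
    """Hypothetical data: yield falls with heat, plus lots of unrelated noise."""
    rng = random.Random(seed)
    heat = [rng.uniform(0, 10) for _ in range(n)]
    yields = [100 + slope * h + rng.gauss(0, noise_sd) for h in heat]
    return heat, yields

def fit_line(x, y):
    """Least-squares slope and R^2 for a one-predictor regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx, sxy ** 2 / (sxx * syy)

heat, yields = simulate()
slope_hat, r2 = fit_line(heat, yields)
print(f"estimated slope: {slope_hat:.2f} (true value in the simulation: -2.0)")
print(f"share of variance explained: {r2:.2f}")
```

With these made-up settings, heat explains only about a quarter of the variance, yet the estimated slope lands close to the true one. A clear signal and a lot of scatter can sit in the same dataset.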
The other day I was thinking of replacing my old tennis racquet,
so I went to Google and typed in “prince thunder” (the name of my current
racquet). Turns out the top results were
about a song that the artist Prince wrote called Thunder. Does that mean that
Google is entirely useless? No, it means there is some error if you try to
predict what someone wants based only on a couple of words they type. But with
their enormous datasets, they are pretty good at picking up signals and getting
the answer right more often than not. Most people typing those words
were probably looking for the song.
Back to the crop example. Of course, heat will matter less
or more depending on the situation. And of course getting rainfall right before
key stages is more important than getting it after. These are both reasons for
the scatter in any plot of heat (or any other variable) against yields. But
neither of those refutes the fact that higher temperatures tend to result in
more water stress, and lower yields. Or in Sol and Marshall’s case, that higher
temperatures tend to increase the risk of conflict.
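The same idea can be sketched for interactions. In the toy simulation below, heat hurts less when soils are moist, which is the agronomists' point, and yet the pooled effect of heat is still clearly negative. All numbers and names (`simulate`, `slope`, the coefficient of 3.0, the 0.5 moisture cutoff) are hypothetical choices for illustration, not estimates from any real data.

```python
import random

def simulate(n=6000, seed=1):
    """Hypothetical data where the damage from heat shrinks as soil moisture rises."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        heat = rng.uniform(0, 10)
        moisture = rng.uniform(0, 1)  # 0 = dry soil, 1 = saturated soil
        yield_ = 100 - 3.0 * heat * (1 - moisture) + rng.gauss(0, 5)
        rows.append((heat, moisture, yield_))
    return rows

def slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx

rows = simulate()
pooled = slope([(h, y) for h, m, y in rows])
dry = slope([(h, y) for h, m, y in rows if m < 0.5])
wet = slope([(h, y) for h, m, y in rows if m >= 0.5])
print(f"heat slope, all fields: {pooled:.2f}")
print(f"heat slope, dry soils:  {dry:.2f}")
print(f"heat slope, wet soils:  {wet:.2f}")
```

In this setup the slope is steeper on dry soils and shallower on wet ones, so the interaction is real, but it adds scatter around the average response rather than cancelling it.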
I sometimes think if one of us were to discover some magic
combination of predictors that explained 95% of the variance in some outcome,
there would be a chorus of people saying “you left out my favorite 5%!” Don’t
get me wrong, there are lots of legitimate questions about cause and effect,
stationarity, etc. that are worth looking into. But how much time should we
really spend having the same old conversation about the difference between a
signal and noise?