Monday, October 27, 2014

One effect to rule them all? Our reply to Buhaug et al's climate and conflict commentary

For many years there has been a heated debate about the empirical link between climate and conflict. A year ago, Marshall, Ted Miguel and I published a paper in Science where we reviewed existing quantitative research on this question, reanalyzed numerous studies, synthesized results into generalizable concepts, and conducted a meta-analysis of parameter estimates (watch Ted's TED talk). At the time, many researchers laid out criticism in the press and blogosphere, which Marshall fielded through G-FEED. In March, Halvard Buhaug posted a comment signed by 26 authors on his website strongly critiquing our analysis, essentially claiming that they had overturned our analysis by replicating it using an unbiased selection of studies and variables. At the time, I explained numerous errors in the comment here on G-FEED.

The comment by Buhaug et al. was published today in Climatic Change as a commentary (version here), essentially unchanged from the version posted earlier with none of the errors I pointed out addressed. 

You can read our reply to Buhaug et al. here. If you don't want to bother with lengthy details, our abstract is short and direct:
Abstract: A comment by Buhaug et al. attributes disagreement between our recent analyses and their review articles to biased decisions in our meta-analysis and a difference of opinion regarding statistical approaches. The claim is false. Buhaug et al.’s alteration of our meta-analysis misrepresents findings in the literature, makes statistical errors, misclassifies multiple studies, makes coding errors, and suppresses the display of results that are consistent with our original analysis. We correct these mistakes and obtain findings in line with our original results, even when we use the study selection criteria proposed by Buhaug et al. We conclude that there is no evidence in the data supporting the claims raised in Buhaug et al.
The reply itself explains in detail why the claims in Buhaug et al. are unsupported. It then dissects what Buhaug et al. did in their revision of our analysis, which was based on our publicly available replication code. We detail five major errors in the analysis, all of which make the results in Buhaug et al. appear to contradict our original results more strongly than they would if the analysis were executed correctly. We then correct the analysis in Buhaug et al., using the study selection criteria they propose, and find results that are in line with our original analysis.

In their comment, Buhaug et al. raise the concern that the lagged effect of climatic variables on conflict may not have been adequately captured by our original analysis, so in our reply we implement an entirely new analysis where we account for both the current and lagged effect of climatic variables on conflict. This is the calculation that Buhaug et al. should have done, rather than the calculation they implemented, which ignores the effect of the current climate on the likelihood of current conflict. For readers interested in this issue, I point you towards our recent NBER working paper where we fully address the lag structure of the relationship between climatic variables and conflict in an updated meta-analysis. This new analysis distinguishes between the current and lagged effects of both temperature and rainfall with respect to both intergroup and interpersonal conflict. The summary of that analysis is this figure:

Summary of meta-analysis for studies with distributed lag structure. Estimated precision-weighted mean effects and 95% confidence intervals for intergroup (left panel) and interpersonal conflict (right panel), for both contemporaneous and one period lagged temperature (red, left offset) and precipitation (blue, right offset). "Combined" effects equal the sum of the contemporaneous (0 lag) and one period lagged effects for studies where the calculation was possible.  The number of studies contributing to each estimate is given in parentheses. From Burke, Hsiang, Miguel (2014).
Rather than do this kind of clear and disaggregated analysis of lags, Buhaug et al. simply change which lags they use so that they are not actually analyzing the same data that was in our original analysis. Instead of looking at the contemporaneous effect of climate variables on conflict, as is the focus of our original work and most of the literature, Buhaug et al only present the effect of lagged variables, which much of the literature suggests is relatively unimportant. In the above figure, they would be focusing on the effects marked "1 lag," which in the case of temperature (red) is obviously very different than the contemporaneous effect marked "0 lag". We provide two examples of this in our reply:

We find it strange that Buhaug et al. focus on results that researchers do not claim are important. However, a more critical issue is that Buhaug et al. compare these lagged effects to the contemporaneous effects in our original analysis as if they are the same thing, when they clearly are not. Both figures above make it clear that this is an apples-to-oranges comparison. Showing that the lagged effect is nearer to zero (as Buhaug et al. report) is in no way at odds with our earlier findings; rather, it is completely consistent with our results above. Claiming, as Buhaug et al. do, that a small lagged effect overturns a prior result for a large contemporaneous effect makes no sense, since both could easily be true. The comparison is a red herring.
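
To make the distinction concrete, here is a stylized distributed lag specification. The notation is my own shorthand for illustration (the fixed effects in particular are an assumption) and is not copied from any single study in the meta-analysis:

```latex
\begin{align}
\text{conflict}_{it} &= \beta_0 \,\text{climate}_{it} + \beta_1 \,\text{climate}_{i,t-1} + \mu_i + \theta_t + \varepsilon_{it} \\
\beta_{\text{combined}} &= \beta_0 + \beta_1
\end{align}
```

Here $\beta_0$ is the contemporaneous ("0 lag") effect, $\beta_1$ is the one period lagged ("1 lag") effect, and $\beta_{\text{combined}}$ is their sum. A value of $\beta_1$ near zero says nothing about the size of $\beta_0$, which is why reporting only $\beta_1$ cannot overturn an estimate of $\beta_0$.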

Another criticism raised by Buhaug et al. is that we omitted a large number of key studies that satisfied our selection criteria and would have altered our results had we included them, although we do not know what studies they are referring to and the authors do not offer a list. But just to be safe, we designed a stress test to see how biased we would have needed to be in order to obtain our results through biased study selection alone. We repeat our analysis over and over, each time adding in a duplicate observation for the study by Theisen, Holtermann, and Buhaug (2011), which was the study that most strongly disagreed with our central findings. By adding multiple versions of that study to our sample, we are trying to simulate a scenario in which many unknown missing studies that disagree with our results are "found" and added to our analysis. The results look like this (whiskers are confidence intervals):

We continue to obtain a statistically significant result similar to our published estimates until we include 80 studies with findings identical to Theisen, Holterman & Buhaug. This is remarkable because our original analysis contained 21 estimates. Thus, in order for our results to be driven by biased study selection alone, we would have had to miss 4 out of every 5 studies in the literature, with each of those missing studies having results similar to the most negative result in the sample. This seems implausible, leading us to reject the suggestion by Buhaug et al that our conclusion is an artifact of biased study selection.
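
For readers who want to see the mechanics of this kind of stress test, here is a minimal sketch in Python. It assumes a simple inverse-variance (precision-weighted) mean, which may differ from the exact estimator in our replication code. The function names and all of the numbers are hypothetical and made up for illustration; they are not the estimates from our analysis.

```python
import math


def precision_weighted_mean(estimates):
    """Return (mean, standard error) for a list of (effect, std_error) pairs,
    weighting each effect by its precision 1 / std_error**2."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    mean = sum(w * effect for w, (effect, _) in zip(weights, estimates)) / total
    return mean, math.sqrt(1.0 / total)


def duplicates_needed(estimates, outlier, z=1.96, max_copies=1000):
    """Smallest number of copies of `outlier` that must be added before the
    95% confidence interval of the pooled mean reaches zero (None if never)."""
    for copies in range(max_copies + 1):
        mean, se = precision_weighted_mean(estimates + [outlier] * copies)
        if mean - z * se <= 0:
            return copies
    return None


# Hypothetical inputs: five studies with a positive effect and one
# dissenting study that gets duplicated. These values are invented.
toy_studies = [(0.10, 0.05)] * 5
toy_outlier = (-0.05, 0.05)
copies = duplicates_needed(toy_studies, toy_outlier)
print(f"Copies of the dissenting study needed to lose significance: {copies}")
```

The logic is the same as in the figure above: keep adding copies of the most contrary study and watch for the point at which the pooled confidence interval first touches zero.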

We are also concerned about the choices made by Buhaug et al. when deciding how to display their revision of our analysis. Buhaug et al. worked off of our replication files, so their adjustment of study selection should not have changed how confidence intervals and final results were displayed. Yet the figure in Buhaug et al. visually mimics the statistical visualization that summarizes the main findings in our original analysis without showing analogous statistical objects. The figure below directly contrasts our original figure, as published in Science, with the new analogous figure just published by Buhaug et al.

The two changes shown above mean that the statistical significance of the results in Buhaug et al. is never actually displayed, as it would have been had the display format of our original figure been retained. Had the statistical significance of the mean effect been shown, it would have been clear that the mean estimate was statistically significant, in agreement with our original analysis.

This correction is explained in detail in our reply comment and the corrected version of Buhaug et al's figure (taken from our reply comment) is:

More fundamentally, it is objectively wrong to use the range of the data to make inferences about statistical uncertainty regarding a mean or a median, as Buhaug et al. wish to do. This is not a choice over which reasonable people can disagree; it reflects a fundamental misunderstanding of basic statistics. For clarity, consider a simpler example: imagine a sample of one hundred individuals with ages 1, 2, 3 up to 100 years old. The estimated mean age of the sample would be 50.5 years old (95% confidence interval: 44.7-56.3 years old). Buhaug et al.'s approach would suggest that the mean age could be 4 years old or 96 years old, since the 2.5-97.5 centiles of the data span these values.
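
The age example is easy to check. The sketch below uses only the Python standard library and hard-codes an approximate critical value for Student's t with 99 degrees of freedom (1.984), which is an assumption made to avoid extra dependencies. The exact percentile cut points depend on the interpolation convention used, so they may differ slightly from the round numbers quoted above.

```python
import math
import statistics

ages = list(range(1, 101))  # one hundred people aged 1 through 100
n = len(ages)

mean_age = statistics.mean(ages)                  # 50.5
std_error = statistics.stdev(ages) / math.sqrt(n)  # standard error of the mean

# Approximate two-sided 95% critical value of Student's t with 99 degrees
# of freedom, hard-coded here as an assumption.
T_CRIT_99DF = 1.984
ci_low = mean_age - T_CRIT_99DF * std_error   # about 44.7
ci_high = mean_age + T_CRIT_99DF * std_error  # about 56.3

# Spread of the raw data: with n=40, the first and last cut points are the
# 2.5th and 97.5th percentiles.
cuts = statistics.quantiles(ages, n=40)
p2_5, p97_5 = cuts[0], cuts[-1]

print(f"mean = {mean_age:.1f}, 95% CI for the mean = {ci_low:.1f} to {ci_high:.1f}")
print(f"2.5th to 97.5th percentile of the data = {p2_5:.1f} to {p97_5:.1f}")
```

The confidence interval describes uncertainty about the mean and is about 12 years wide. The percentile range describes the spread of individual ages and covers nearly the whole sample. They answer different questions.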

We have gone to great lengths on numerous occasions to inform the Buhaug et al. team of these numerous errors in their analysis. Every time they have dismissed us and have not corrected their comment.

In August, 2013 we provided a reply to an earlier version of the Buhaug et al. comment that was submitted to Science. Our reply detailed numerous errors of representation and interpretation in their text, and included the "omitted study" simulation above.  Buhaug et al's comment was not published. To ensure that all 25 authors (the 26th was added later) were aware of these various issues, I personally emailed a copy of our reply to all 25 authors on October 25, 2013. None of the authors replied with a response related to the content of the comment. 

I was then asked to respond to the revised comment for Climatic Change. The revised comment presented the altered meta-analysis from our original article and included a 26th author. My subsequent response to Climatic Change was exceptionally thorough (3,800 words, 9 pages) and explained in detail all of the major points that we have included in our response, including the inaccurate portrayal of our original analysis and a detailed mathematical discussion of how lagged effects are treated in the literature. At that time, I also pointed out that basic data errors were present in the analysis by Buhaug et al., stating (emphasis added),
"it is completely unclear to me how the numbers presented were generated as the authors do not provide enough information to replicate their analysis, and casual inspection suggests that the authors have either made basic mathematical errors in transcription or they are misrepresenting the results of studies by reporting coefficients that are known to be irrelevant, or both.  Thus, the mathematical result presented in the new meta-analysis contains at least one of these fatal errors."
In our view, this was sufficient information that at least some members of the 26 author team should have re-checked their analysis for errors if they intended to pursue publication of these calculations.  

Despite the clear communication that errors were present in their analysis, Buhaug et al. dismissed the substance of my response, simply replying that I did not understand what the comment was about. Buhaug et al. did not address the major errors that I had voluntarily explained in detail.  On March 19, 2014, we were notified by Google Scholar that Halvard Buhaug had posted online an uncorrected version of their comment, where it remains today complete with statistical and coding errors.

Thus the Buhaug et al. team had unilaterally decided to publish these erroneous calculations online, despite having been informed by us three times that they exhibit clear basic errors. Since that published version of the comment is now cached on the internet, is searchable, and has already been cited (for example here), I posted my 9 page response on G-FEED on March 21, 2014 here. As of writing, that post has 1,348 pageviews.

Despite all of these attempts to communicate their errors to them, the author team has still not altered their comment.

To be clear:  we welcome fruitful discussion and disagreement about our results in this and related analysis.  But what Buhaug et al have done here is not fruitful:  it is full of errors, misunderstandings, and sloppy mistakes.  It does not move the debate forward. 
