Monday, September 26, 2016

Errors drive conclusions in World Bank post on ivory trade and elephant poaching: Response to Do, Levchenko, & Ma

(Spoiler alert: if you want to check your own skills in sniffing out statistical errors, skip the summary.) 

Summary: 


In their post on the World Bank blog, Dr. Quy-Toan Do, Dr. Andrei Levchenko, and Dr. Lin Ma (DLM for short) make three claims that they suggest undermine our finding that the 2008 one-time sale of ivory corresponded with a discontinuous 66% increase in elephant poaching globally. First, they claim that the discontinuity vanishes for sites reporting a large number of carcasses—i.e., there was only a sudden increase in poaching in 2008 at sites reporting a small number of total carcasses. Second, they claim that price data for ivory (which they do not make public) from TRAFFIC do not show the increase in 2008 that they argue would be expected if the one-time sale had affected ivory markets. Third, they claim that re-assigning a seemingly small proportion of illegal carcasses counted in 2008 to be legal carcasses makes the finding of a discontinuity in 2008 less than statistically significant, and they speculate that a MIKE (CITES) initiative to improve carcass classification may explain the discontinuous increase in poaching in sites with small numbers of carcasses.

In this post, we systematically demonstrate that none of these concerns are valid. First, as it turns out, the discontinuity does hold for sites reporting a large number of carcasses, so long as one does not commit a coding error that causes a systematic omission of data where poaching was lower before the one-time sale, as did DLM. Furthermore, we show that DLM misreported methods and excluded results that contradicted their narrative, and that they made other smaller coding errors in this part of their analysis. Second, we note that, notwithstanding various concerns we have about the ivory price data, our original paper had already derived why an increase in poaching due to the one-time sale would have no predictable effect (and possibly no effect at all) on black market ivory prices. Finally, we note that (i) DLM provide no evidence that training on carcass identification could have led to a discontinuous change in PIKE (in fact, the provided evidence contradicts that hypothesis), and that (ii) in the contrived reclassification exercise modeled by DLM, the seven simultaneous surveyor errors that DLM describe as very likely have, under generous assumptions, a probability of less than 0.35%, i.e. they are extraordinarily unlikely.

Overall, while DLM motivated an interesting exercise in which we show that our result is robust to classification of sites based on the number of carcasses found, they provided no valid critiques to our analytical approach or results. The central conclusion, that our results should be dismissed, was the result of a sequence of coding, inferential, and logical errors.

Evidence Should be Used in Global Management of Endangered Species: reply to the CITES Technical Advisory Group

In anticipation of the 17th meeting of the Conference of the Parties of the Convention on the International Trade in Endangered Species (CITES), the CITES Technical Advisory Group (TAG) on elephant poaching and ivory smuggling released a document (numbered CoP17 Inf. 42) in which they directly addressed a working paper released by Nitin Sekar and myself.

The document did not introduce new substantive arguments (see end of post for one exception regarding dates)—in fact, the document’s claims were nearly identical to those made in Fiona Underwood’s prior blog posts about our analysis (here and here). In her second post, Dr. Underwood (one of a handful of members of the TAG) openly states that the TAG asked her to evaluate our analysis, and it seems they have simply adopted her view as the official stance. These arguments provide the basis for the TAG's conclusion:
13) The claims in the working paper by Hsiang and Sekar are fundamentally flawed, both in logic and methodology. The MIKE and ETIS TAG is therefore of the view that the study should not be used to inform CITES policy on elephants.
Given that the arguments have not changed, we have previously responded to almost all the document’s claims in our two prior blogposts, here and here. Below we summarize the relevant points and address new points.

As shown below,
  • numerous statements about our analysis (either in the manuscript or follow-up analysis in these prior posts) made by the CITES white paper are factually incorrect;
  • in supporting its narrative, the white paper misreports results and conclusions of other studies;
  • the white paper makes erroneous statistical arguments;
  • when the approach advocated for in the white paper is correctly implemented, it provides results virtually identical to the original findings in our analysis.
Thus, overall, there is no evidence to support the claims in the CITES white paper and therefore no reason for CITES to refuse to consider these results when developing policy.

Tuesday, September 20, 2016

Normality, Aggregation & Weighting: Response to Underwood’s second critique of our elephant poaching analysis

Nitin Sekar and I recently released a paper where we tried to understand if a 2008 legal ivory sale coordinated by CITES and designed to undercut black markets for ivory increased or decreased elephant poaching. Our results suggested the sale abruptly increased poaching:


Dr. Fiona Underwood, a consultant hired by CITES and serving on the Technical Advisory Group of CITES' MIKE and ETIS programs, had previously analyzed the same data in an earlier paper with coauthors and arrived at different conclusions. Dr. Underwood posted criticisms of our analysis on her blog.  We replied to every point raised by Dr. Underwood and posted our replication code (including an Excel replication) here. We also demonstrated that when we analyzed the data published alongside Dr. Underwood's 2011 study, in an effort to reconcile our findings, we essentially recovered our main result that the sale appeared to increase poaching rates. Dr. Underwood has responded to our post with a second post. This post responds to each point raised in Dr. Underwood's most recent critique.

Dr. Underwood's latest post and associated PDF file make three arguments/criticisms:

1) Dr. Underwood repeats the criticism that our analysis is inappropriate for poaching data because it assumes normality in residuals.  Dr. Underwood's intuition is that the data do not have this structure, so we should instead rely on generalized linear models she proposes that assume alternative error structures (as she did in her 2011 PlosOne article). Dr. Underwood's preferred approach assumes the number of elephants that die at each site in each year is deterministic, and the fraction that are poached at each site-year (after the total number marked to die is determined) is determined by a weighted coin toss where PIKE reflects the weighting (we discuss our concerns with this model in our last post).

2) Dr. Underwood additionally and specifically advocates for the evaluation of Aggregate PIKE to deal with variation in surveillance and elephant populations (as has been done in many previous reports) rather than average PIKE, as we do. Dr. Underwood argues that this measure better accounts for variation in natural elephant mortality.

3) To account for variation in the number of elephants discovered, Dr. Underwood indicates that we should be running a weighted regression where the weight of each site-year is determined by total mortality, equal to the sum of natural and illegal carcasses discovered in each site-year.
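To make the measures named in these three points concrete, here is a minimal sketch using made-up carcass counts (the numbers and function names are illustrative assumptions, not MIKE data and not anyone's replication code). It takes PIKE at a site-year to be illegal carcasses divided by total carcasses found, then compares the unweighted average across sites, the aggregate ratio that pools carcasses first, and a mean weighted by total carcasses found.

```python
# Illustrative only: made-up carcass counts, not MIKE data.
# Each tuple is (illegally killed carcasses, total carcasses found) for one site-year.
sites = [(1, 2), (3, 4), (10, 50)]


def site_pike(illegal, total):
    """PIKE for one site-year: share of discovered carcasses that were illegally killed."""
    return illegal / total


def average_pike(sites):
    """Unweighted mean of site-level PIKE, so every site counts equally."""
    return sum(site_pike(i, t) for i, t in sites) / len(sites)


def aggregate_pike(sites):
    """Pool carcasses across sites first, then take the ratio."""
    return sum(i for i, _ in sites) / sum(t for _, t in sites)


def weighted_pike(sites):
    """Mean of site-level PIKE, weighted by total carcasses found at each site."""
    total_weight = sum(t for _, t in sites)
    return sum(t * site_pike(i, t) for i, t in sites) / total_weight


print("average PIKE:  ", average_pike(sites))
print("aggregate PIKE:", aggregate_pike(sites))
print("weighted PIKE: ", weighted_pike(sites))
```

In this toy example the site with 50 carcasses dominates the aggregate and weighted measures, while the average treats all three sites alike. Weighting site-level PIKE by total carcasses is algebraically the same as pooling, which is why the last two numbers coincide.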

We respond to each of these points individually below. We then point out that these three criticisms are themselves not consistent with one another.  Finally, we note that this debate demonstrates the importance of having multiple approaches to analyzing a data set as critical as PIKE.

Monday, September 12, 2016

Everything we know about the effect of climate on humanity (or "How to tell your parents what you've been doing for the last decade")

Almost exactly ten years ago, Nick Stern released his famous global analysis on the economics of climate change.  At the time, I was in my first year of grad school trying to figure out what to do research on and remember fiercely debating all aspects of the study with Ram and Jesse in the way only fresh grad students can.  Almost all the public discussion of the analysis revolved around whether Stern had chosen the correct discount rate, but that philosophical question didn’t seem terribly tractable to us.  We decided instead that the key question research should focus on was the nature of economic damages from climate change, since that was equally poorly known but nobody seemed to really be paying attention to it.  I remember studying this page of the report for days (literally, days) and being baffled that nobody else was concerned about the fact that we knew almost no facts about the actual economic impact of climate, and that it was this core relationship that drove the entire optimal climate management enterprise:


So we decided to follow in the footsteps of our Sensei Schlenker and to try and figure out effects of climate on different aspects of the global economy using real world data and rigorous econometrics. Together with the other G-FEEDers and a bunch of other folks from around the world, we set out to try and figure out what the climate has been doing and will do to people around the planet. (Regular readers know this.)

Friday, Tamma Carleton and I published a paper in Science trying to bring this last decade of work all together in one place. We have learned a lot, both methodologically and substantively. There is still a massive amount of work to do, but it seemed like a good idea to try and consolidate and synthesize what we’ve learned at least once a decade…

Here’s one of the figures showing some of the things we’ve learned from data across different contexts; it’s kind of like a 2.0 version of the page from Stern above:



Bringing all of this material together into one place led to a few insights. First, there are pretty clear patterns across sectors where adaptation appears to either be very successful (e.g. heat related mortality or direct losses from cyclones) or surprisingly absent (e.g. temperature losses for maize, GDP-productivity losses to heat). In the latter case, there seem to be “adaptation gaps” that are persistent across time and locations, something that we might not expect if adaptation is costless in the long run (as many people seem to think). We can’t say exactly what’s causing these adaptation gaps to persist. For example, it might be that all actors are behaving optimally and this is simply the best we can do with current technology and institutions, or alternatively there might be market failures (such as credit constraints) or other disincentives (like subsidized crop insurance) that prevent individuals from adapting. Figuring out (i) whether current adaptation is efficient, or (ii) if it isn’t, what’s wrong so we can fix it, is a multi-trillion-dollar question and the area where we argue researchers should focus attention.

Eliminating adaptation gaps will have a big payoff today and in the future. To show this, we compute the total economic burden borne by societies today because they are not perfectly adapted today. Even before one accounts for climate change, our baseline climate appears to be a major factor determining human wellbeing around the world.

For example, we compute that on average the current climate

- depresses US maize yields by 48%
- increases US mortality rates 11%
- increases US energy use by 29%
- increases US sexual assault rates 6%
- increases the incidence of civil conflict 29% in Sub-Saharan Africa
- slows global economic growth rates 0.25 percentage points annually

These are all computed by estimating a counterfactual where climate conditions at each location are whatever historically observed values at that location are most ideal.
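A minimal sketch of that kind of counterfactual calculation, under loud assumptions: the response function, the temperature history, and the choice to express the gap relative to the counterfactual are all hypothetical and are not taken from the paper. The idea is only to show the mechanics of comparing realized outcomes with the outcome under the best conditions ever observed at the same location.

```python
# Illustrative sketch: the response function and temperatures below are
# hypothetical, not estimates from the paper.


def yield_response(temp_c):
    """Hypothetical dose-response: a yield index that peaks at 20 C and falls off quadratically."""
    return max(0.0, 100.0 - 0.5 * (temp_c - 20.0) ** 2)


def climate_burden(observed_temps, response):
    """Percent gap between the average realized outcome and a counterfactual in
    which every period has the best temperature ever observed at this location.
    (Assumption: 'best' means highest outcome, as for yields.)"""
    outcomes = [response(t) for t in observed_temps]
    actual = sum(outcomes) / len(outcomes)
    ideal = max(outcomes)  # outcome under the most favorable observed condition
    return 100.0 * (actual - ideal) / ideal


observed = [20.0, 24.0, 28.0, 30.0]  # made-up temperature history for one location
print(round(climate_burden(observed, yield_response), 1))  # prints -22.5
```

For an outcome where lower is better (mortality, conflict), the same sketch would take the minimum rather than the maximum as the ideal.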

Our first reaction to some of these numbers was that they were too big. But then we reflected on the numbers more and realized maybe they are pretty reasonable. If we could grow all of US maize in greenhouses where we control the temperature, would our yields really be 48% higher? That’s not actually too crazy if you think about cases where we have insulated living organisms much more from their environment and they do a whole lot better because of it. For example, life expectancy for people has more than doubled in the last few centuries as we started to protect ourselves from all the little health insults that used to plague people. For similar reasons, if you just look at pet cats in the US, indoor cats live about twice as long as more exposed outdoor cats on average (well, at least according to Petco and our vet). Similarly, a lot of little climate insults, each seemingly small, can add up to generate substantial costs, and they apparently do already.

We then compared these numbers (on the current effect of the current climate) to (i) the effects of climate change that has occurred already (a la David and Wolfram) and (ii) projected effects of climate change.

When it comes to effects of climate change to date, these numbers are mostly small. The one exception is that warming should have already increased the integrated incidence of civil conflict  since 1980 by >11%.

When it comes to future climate change that we haven’t experienced yet, the numbers are generally large and similar-ish in magnitude to the current effect of the current climate. For example, we calculate that future climate change should slow global growth by an additional 0.28 percentage points per year, which is pretty close in magnitude to the 0.25 percentage points per year that temperatures are already slowing things down. For energy demand in the US, current temperatures are actually doing more work today (+29%) than the additional effect of future warming (+11%), whereas for war in Sub-Saharan Africa, current temperatures are doing less (+29%) than near term warming (+54%).

All these numbers are organized in a big table in the paper, since I always love a big table. There's also a bit of history and summary of methods in there as well, for those of you who, like Marshall, don't want to bother slogging through the sister article detailing all the methods.

Tuesday, August 23, 2016

Risk Aversion in Science

Marshall posted last week about our poverty mapping paper in Science. One thing a reporter asked me the other day was about the origins of the project, and it got me thinking about an issue that’s been on my mind for a while – how does innovation in science happen, and how sub-optimal is the current system for funding science?

First the brief back story: Stefano started as an assistant professor in Computer Science a couple years back, and reached out to me to introduce himself and discuss potential areas of mutual interest. I had recently been talking with Marshall about the GiveDirectly approach to finding poor people, and we had been wondering if there was (a) a better (automated) way to find thatched vs. metal roofs in satellites and (b) some way to verify if roof type is actually a decent predictor of poverty. So we chatted about this once or twice, and then some students working with Stefano tried to tackle the roof problem. That didn’t work too well, but then someone (Stefano, I think, but I can’t really recall) suggested we try to train a neural network to predict poverty directly without focusing on roofs. Then we talked about how poverty data was pretty scarce, how maybe we could use something like night lights instead, and yada yada yada, now we have the paper.  

Now the more general issue. I think most people outside of science, and even many people starting out as scientists, perceive the process as something like this: you apply for funding, you get it, you or a student in your group does the work, you publish it. That sounds reasonable, but in my experience that hasn’t been how it works for the papers I’m most excited about. The example of this paper is more like: you meet someone and chat about an idea, you do some initial work that fails, you keep talking and trying, you apply for funding and get rejected (in this case, twice by NSF and a few times by foundations), you keep doing the work anyway because you think it’s worthwhile, and eventually you make a breakthrough and publish a paper.

In this telling, the progress happens despite the funding incentives, not because of them. Luckily at Stanford we have pretty generous start-up packages and internal funding opportunities that enable higher risk research, and we are encouraged by our departments and the general Silicon Valley culture to take risks. And we have unbelievably good students, many of whom are partially or fully funded or very cheap (a key contributor on the Science paper was an undergrad!). But that only slightly lessens the frustration of having proposals to federal agencies being rejected because (I’m paraphrasing the last 10+ proposal reviews I’ve gotten) “it would be cool if it worked but I don’t think it’s likely to work.” If I wasn’t at Stanford, I probably would have long ago stopped submitting risky ideas, or gotten out of academia altogether.

I know this is a frustration shared by many colleagues, and also that there’s been a fair number of academic studies on incentives and innovation in science. One of the most interesting studies I’ve read is this one, about the effects of receiving a Howard Hughes Medical Institute (HHMI) investigator grant on creativity and productivity. The study isn’t all that new, but definitely a worthwhile read. For those not familiar with the HHMI, it is a fairly substantial amount of funding given to a person, rather than for a specific project, with a longer time horizon than most awards. It’s specifically designed to foster risk taking and transformational work.

The article finds a very large effect of getting an HHMI on productivity, particularly in output of “top hit” publications. Interestingly, it also finds an increase in “flops”, meaning papers that get cited much less than typical for the investigator (controlling for their pre-award performance). This is consistent with the idea that the awardees are taking more risks, with both more home runs and more strike outs. Also consistent is the fact that productivity drops in the few years after getting an award, presumably because people start to pursue new directions. Even more interesting to me was the effect of getting an HHMI on applications to NIH. First, the number of applications goes way down, presumably because recipients spend less time seeking funds and more time actually doing science. Second, the average ratings for their proposals get worse (!), consistent with the idea that federal funds are biased against risky ideas.

Unfortunately, there aren’t any studies I can find on the “people not project” types of awards in other areas of science. Personally, I know my NASA new investigator program award was instrumental in freeing me up to explore ideas as a young faculty. I never received an NSF Career award (rejected 3 times because – you guessed it – the reviewers weren’t convinced the idea would work), but that would be a similar type of thing. I’d like to see a lot more empirical work in this area. There’s some work on awards, like in this paper, but awards generally bring attention and prestige, not actual research funds, and they apply to a fairly small fraction of scientists.

I’d also like to see some experiments set up, where people are randomly given biggish grants (i.e. enough to support multiple students for multiple years) and then tracked over time. Then we can test a few hypotheses I have, such as:
  1. Scientists spend way too much time writing and reviewing proposals. An optimal system would limit all proposals to five pages, and give money in larger chunks to promote bigger ideas. 
  2. There is little or maybe even zero need to include feasibility as a criterion in evaluating proposals for specific projects. More emphasis should be placed on whether the project will have a big positive impact if it succeeds. Scientists already have enough incentive to make sure they don’t pursue dead ends for too long, since their work will not get published. Trying to eliminate failure, or even trying hard to reduce failure rates, based on a panel of experts is counterproductive. (It may be true that panel ratings are predictive of future project impact, but I think that comes from identifying high potential impact rather than correctly predicting the chance of failure.)
  3. People who receive HHMI-like grants are more likely to ponder and then pursue bigger and riskier ideas. This will result in more failure and more big successes, with an average return that is much higher than a lot of little successes. (For me, getting the MacArthur award was, more than anything, a challenge to think about bigger goals. I try to explain this to Marshall when he constantly reminds me that people’s productivity declines after getting awards. I also don’t think he’s read the paper to know it’s only a temporary decline. Temporary!)
  4. Aversion to risk and failure is especially high for people who do not have experience as researchers, and thus don’t appreciate the need to fail on the way to innovation. One prediction here is that panels or program managers with more successful research histories will tend to pick more high impact projects.

I’m sure some of the above are wrong, but I’m not sure which ones. If anyone has answers, please let me know. It’s an area I’m mostly ignorant on but interested to learn more. I’d apply for some funding to study it, but it’d probably be rejected. I’d rather waste my time blogging than writing more proposals.



One final thought. On several occasions I have been asked by foundations or other donors what would be a good “niche” investment in topics around sustainability. I think they often want to know what specific topics, or what combination of disciplines, are most ripe for more funding. But my answer is typically that I don’t know enough about every topic possible to pick winners. Better to do something like HHMI for our field, i.e. encourage big thinking and risk taking among people that have good track records or indicators of promise. But that requires a tolerance for failure, and even foundations in the midst of Silicon Valley seem to struggle with that.

Thursday, August 18, 2016

Economics from space

We've got a paper out in Science today that demonstrates a new way to use satellite imagery to predict economic well-being in poor countries (see project website here).  The paper is a collaboration between some of us social scientists (or social "scientists", with emphatic air quotes, as my wife puts it) and some computer scientists across campus -- folks who have apparently figured out how to use computers for more than email and Youtube surfing.

We're hoping that this is the first of many projects with these guys, and so have codified our collaboration here, with one of those currently-popular dark-hued website designs where you scroll around a lot.

So why is it sensible to try to use satellite imagery to predict economic livelihoods?  The main motivation is the lack of good economic data in many parts of the developing world.  As best we can tell, between the years 2000 and 2010, one quarter of African countries did not conduct a survey from which nationally-representative poverty estimates could be constructed, and another 40% conducted only one survey.  So this means that in two-thirds of countries on the world's poorest continent, you've got very little sense of what's going on, poverty-wise.  And even a lot of the surveys that do get conducted are only partially in the public domain, meaning you've got to employ some trickery to even get the shape of the income distribution in these countries (and survey locations are still unavailable!).

This lack of data makes it hard to track specific targets that we've set, such as the #1 Sustainable Development Goal of eliminating poverty by 2030.  It also makes it hard to evaluate whether specific interventions aimed at reducing poverty are actually working.  The result is that we currently have little rigorous evidence about the vast majority of anti-poverty interventions undertaken in the developing world, and no real way to track progress towards SDGs or any other target.

While we don't collect a lot of survey data for many locations in the developing world, we collect other sources of information about these places constantly -- satellite information being one obvious source.  So our goal in this paper was to see whether we could use recent hi-res imagery to predict economic outcomes at a local level, and fill in the gaps between what we know from surveys.

We are certainly not the first people to think of using satellites or other "unconventional" data sources to study economic output in the developing world.  For instance, here is a 2012 paper by Adam Storeygard that uses nightlights to improve GDP estimates at the country level, and here is a paper from about 9 months ago by Josh Blumenstock and company where they use call data records from a cell phone company to predict local-level economic outcomes in Rwanda.  But what our approach brings to the table is that (unlike Storeygard et al) we can make very local predictions, and that (perhaps unlike Blumenstock et al) our approach is very easy to scale, given that satellite imagery is available free or at very low cost for every corner of the earth and more rolls in each day.

For a quick explanation of what we do in the paper, check out this short video that we made in collaboration with these guys.  Sort of an experiment on our end, comments or slander welcome in the comments section.



The main innovation in the paper is in figuring out what information in the hi-res daytime imagery might be useful for predicting poverty or well-being.  Standard computer vision approaches to interpreting imagery typically get fed huge training datasets - e.g. millions of "labeled" images (e.g. "dog" vs "cat") that a given model can use to learn to distinguish the two objects in an image.  But the whole problem here is that we have very little training data -- i.e. few places where we can tell a computer with certainty that a specific location is rich or poor.

So we take a two-step approach to solving this problem.  First, we use lower-resolution nightlights images to train a deep learning model to identify features in the higher-resolution daytime imagery that are predictive of economic activity. The idea here -- building on the paper cited above -- is that nightlights are a good but imperfect measure of economic activity, and they are available for everywhere on earth. So the nightlights help the model figure out what features in the daytime imagery are predictive of economic activity.  Without being told what to look for, the model is able to identify a number of features in the daytime imagery that look like things we recognize and tend to think are important in economic activity (e.g. roads, urban areas, farmland, and waterways -- see Fig 2 in our paper).

Then in the last step of the process, we use these features in the daytime imagery to predict village-level wealth, as measured in a few household surveys that were publicly available and geo-referenced.  (As our survey measures we use data from the World Bank LSMS for consumption expenditure and from the DHS for assets.)  We call this two step approach "transfer learning", in the sense that we've transferred knowledge learned in the nightlights-prediction task to the final task of predicting village poverty.  Nightlights are not used in the final poverty prediction; they are only used in the first step to help us figure out what to use in the daytime imagery.
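The two steps can be mimicked with a deliberately tiny toy. This is a sketch under heavy assumptions, not the method in the paper: the paper trains a deep network on images, whereas the code below uses linear models on two made-up "daytime features", and all data are synthetic. Step 1 learns a feature from plentiful nightlights labels; step 2 fits wealth on that feature using only a handful of "surveyed" locations.

```python
import random

# Toy stand-in for the two-step idea, on synthetic data only.
random.seed(0)


def make_location():
    """Return (daytime features, nightlights, wealth) for one synthetic location."""
    x = (random.random(), random.random())            # hypothetical daytime features
    activity = 2.0 * x[0] + 1.0 * x[1]                # unobserved economic activity
    lights = activity + random.gauss(0, 0.1)          # noisy proxy, available everywhere
    wealth = 0.5 * activity + random.gauss(0, 0.05)   # only known where surveyed
    return x, lights, wealth


def fit_two_features(X, y):
    """Least squares for y = w0*x0 + w1*x1 (no intercept) via the 2x2 normal equations."""
    s00 = sum(a * a for a, _ in X)
    s01 = sum(a * b for a, b in X)
    s11 = sum(b * b for _, b in X)
    t0 = sum(a * v for (a, _), v in zip(X, y))
    t1 = sum(b * v for (_, b), v in zip(X, y))
    det = s00 * s11 - s01 * s01
    return (s11 * t0 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b


# Step 1: many locations with nightlights teach us how to combine daytime features.
many = [make_location() for _ in range(2000)]
w = fit_two_features([m[0] for m in many], [m[1] for m in many])


def feature(x):
    """Learned daytime feature; nightlights themselves are no longer needed."""
    return w[0] * x[0] + w[1] * x[1]


# Step 2: a few surveyed locations map the learned feature to wealth.
surveyed = [make_location() for _ in range(30)]
a, b = fit_line([feature(s[0]) for s in surveyed], [s[2] for s in surveyed])


def predict_wealth(x):
    return a + b * feature(x)


# Check predictions on fresh locations that were used in neither step.
held_out = [make_location() for _ in range(200)]
errors = [predict_wealth(h[0]) - h[2] for h in held_out]
rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5
print("learned weights:", w, "slope:", b, "held-out RMSE:", rmse)
```

As in the description above, the final prediction takes only daytime features as input; nightlights enter solely through the weights learned in step 1.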

Josh Blumenstock (or some Science art editor) has a really nice depiction of the procedure, in a commentary that Josh wrote on our piece that also appeared today in Science.

The model does surprisingly well.  Below are cross-validated model predictions and R-squareds for consumption and assets, where we are comparing model predictions against survey measurements at the village level in five African countries (Uganda, Tanzania, Malawi, Nigeria, Rwanda).  The cross-validation part is key here -- basically we split the data in two, train the model on one part of the data, and then predict for the other part of the data that the model hasn't seen.  This guards against overfitting.
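For readers who have not seen cross-validation before, here is a minimal sketch of the split-train-predict loop on synthetic data. Everything in it is an assumption made for illustration: the "villages" are random numbers, the model is a one-variable regression, and the two-fold split simply mirrors the informal description above rather than the exact procedure in the paper.

```python
import random

random.seed(1)

# Synthetic "villages": one predictor and a noisy outcome (made-up data).
data = []
for _ in range(200):
    x = random.random()
    data.append((x, 1.0 + 2.0 * x + random.gauss(0, 0.3)))


def fit_line(pairs):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    slope = sum((x - mx) * (y - my) for x, y in pairs) / sum((x - mx) ** 2 for x, _ in pairs)
    return my - slope * mx, slope


def r_squared(pairs, intercept, slope):
    """R-squared of a fitted line evaluated on the given pairs."""
    my = sum(y for _, y in pairs) / len(pairs)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in pairs)
    ss_tot = sum((y - my) ** 2 for _, y in pairs)
    return 1.0 - ss_res / ss_tot


def cross_validated_r2(data, folds=2):
    """Hold out each fold in turn, fit on the rest, and average the held-out R-squared."""
    shuffled = data[:]
    random.shuffle(shuffled)
    scores = []
    for k in range(folds):
        test = shuffled[k::folds]
        train = [p for i, p in enumerate(shuffled) if i % folds != k]
        intercept, slope = fit_line(train)
        scores.append(r_squared(test, intercept, slope))
    return sum(scores) / len(scores)


cv_r2 = cross_validated_r2(data)
print("cross-validated R-squared:", cv_r2)
```

Because each score is computed only on observations the fitted line never saw, a model that merely memorized its training data would score poorly here.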



We can then use these predictions to make poverty maps of these countries.  Here is a prototype (something we're still working on), with estimates aggregated to the district level:

[Edit:  Tim Varga pointed out in an email that, while beautiful, the below plot is basically meaningless to the 10% of men and 1% of women who are red/green colorblind.  Duh - and sorry!  (Only silver lining is that this mistake harmed men differentially, subverting the normal gender bias).  Nevertheless, we will fix it.]


Maybe the most exciting result is that a model trained in one country appears to do pretty well when applied outside that country, at least within our set of 5 countries.  For example, a model trained in Uganda does a pretty good job of predicting outcomes in Tanzania, without ever having seen Tanzanian data.  Granted, this would likely work a lot worse if we were trying to make predictions for a more dissimilar country (say, Iceland).  But it suggests that at least within Africa -- the continent where data gaps remain largest -- our approach could have wide application.

Finally, we don't really view our approach as a substitute for continuing to do household surveys, but rather as a strong complement -- as a way to dramatically amplify the power of what we learn from these surveys.  It's likely that we're going to continue to learn a lot from household surveys that we might never learn from satellite imagery, even with the fanciest machine learning tricks.

We are currently trying to extend this work in multiple directions, including evaluating whether we can make predictions over time using lower-res Landsat data, and in scaling up the existing approach to all of Africa.  More results coming soon, hopefully.  We also want to work with folks who can use these data, so if that happens to be you, please get in touch!  

Friday, August 5, 2016

A midsummer’s cross-section

Summer is going too fast. It seems like just yesterday Lebron James was being a sore loser, body slamming and stepping over people – and getting rewarded for it by the NBA. Apart from that, one interesting experience this summer was getting to visit some very different maize (corn) fields within a few weeks in July. First, I was in Kenya and Uganda at some field sites, and then I was visiting some farms in Iowa.

When talking maize, it’s hard to get much different than East Africa and East Iowa. As a national average, Kenya produces a bit less than 4 million tons of maize on 2 million ha, for a yield of about 1.75 t/ha. Iowa has about seven times higher yield (12.5 t/ha), and produces nearly twenty times more maize grain. The pictures below give a sense of a typical field in each place (Kenya on the left).


Lots of things are obviously different between the two areas. There are also some things that people might think are different but really aren’t. For example, looking at annual rainfall or summer temperatures, they are pretty similar for the two areas (figures from www.climatemps.com, note different scales):



But there are also things that are less obviously different. Earlier this year I read this interesting report trying to estimate soil water holding capacity in Africa, and I’ve also been working a bunch with soil datasets in the U.S. from the USDA. The plot below shows the total capacity of the soil to store water in the root zone (in mm) for the two areas, plotted on the same scale.

It’s common for people to talk about the “deep” soils of the Corn Belt, but I don’t think people typically realize just how much better they are at storing water than many other places. There’s virtually no overlap between the distribution of root zone storage in the two areas, and on average Kenya soils have about half the capacity of Iowa’s.

How much difference can this one factor make? As a quick thought experiment I ran some APSIM-maize simulations for a typical management and weather setup in Iowa, varying only the root zone storage capacity between 150 and 330mm. Simulated yields by year are shown below, with dashed lines showing the mean for each soil. 


This suggests that having half the storage capacity translates to roughly half the average yields, with much bigger relative drops in hot or dry years like 1988 or 2012. And this assumes that management is identical, when a rational farmer would surely respond by applying far fewer inputs to the worse soils.


Just something to keep in mind when thinking about the potential for productivity growth in Africa. There’s certainly room for growth, and I saw a lot of promising trends. But just like when it comes to the NBA officials, there's a lot going on under the surface, and I wouldn’t expect too much. 

Wednesday, August 3, 2016

Applying econometrics to elephant poaching: our response to Underwood and Burn

Summary


[warning: this is my longest post ever...]

Nitin Sekar and I recently released a paper examining whether a large legal sale of ivory affected poaching rates of elephants around the world. Our analysis indicated that the sale very likely had the opposite effect from its original intent, with poaching rates globally increasing abruptly instead of declining. Understandably, the community of policy-engaged researchers and conservationists has received these findings with healthy skepticism, particularly since prior studies had failed to detect a signal from the one-time sale. While we have mostly received clarifying questions, the critique by Dr. Fiona Underwood and Dr. Robert Burn fundamentally questions our approach and analysis, in part because their own analysis of PIKE data yielded such different results. 

Here, we address their main concerns. We begin by demonstrating that, contrary to our critics’ claims, the discontinuity in poaching rates from 2008 onwards (as measured by PIKE) is fairly visible in the raw data and made clearer using simple, valid statistical techniques—our main results are not derived from some convoluted “model output,” as suggested by Dr. Underwood and Dr. Burn (we developed an Excel spreadsheet replicating our main analysis for folks unfamiliar with statistical programming). We explain how our use of fixed effects accounts for ALL average differences between sites, both those differences for which we have data and those for which we are missing data, as well as for any potential biases from the uneven data reporting by sites—and we explain why this is better than approaches that attempt to guess what covariates were responsible for differences in poaching levels across different sites. We show that our findings are robust to the non-linear approaches recommended by Dr. Underwood and Dr. Burn (as we already had in large part in our original paper) and that similar discontinuities are not present for other poached species or Chinese demand for other precious materials (two falsification tests proposed by Underwood and Burn). We also show that previous analyses that failed to notice the discontinuity may have in part done so because they smoothed the PIKE data. 
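To illustrate why site fixed effects matter when sites report unevenly, here is a minimal sketch on entirely synthetic data. It is not our actual specification or dataset: the site names, baseline levels, reporting windows, and the size of the jump are all hypothetical, and the model has no noise or trends. It only shows that comparing each site with itself recovers a jump that a pooled before/after comparison can miss.

```python
from collections import defaultdict

TRUE_JUMP = 0.2  # hypothetical discontinuity in PIKE from 2008 onward

# Hypothetical sites: (baseline PIKE, first year reported, last year reported).
# Reporting is deliberately uneven: B reports only before 2008, C only after.
sites = {
    "A": (0.2, 2003, 2012),
    "B": (0.7, 2003, 2007),
    "C": (0.1, 2008, 2012),
    "D": (0.5, 2005, 2010),
}

# Build noise-free observations: (site, post-2008 indicator, PIKE).
rows = []
for site, (baseline, first, last) in sites.items():
    for year in range(first, last + 1):
        post = 1.0 if year >= 2008 else 0.0
        rows.append((site, post, baseline + TRUE_JUMP * post))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Naive pooled comparison: mean after minus mean before, ignoring sites.
naive = (mean(y for _, p, y in rows if p == 1.0)
         - mean(y for _, p, y in rows if p == 0.0))

# Fixed-effects (within) estimator: subtract each site's own means,
# then regress demeaned PIKE on the demeaned post-2008 indicator.
by_site = defaultdict(list)
for site, post, y in rows:
    by_site[site].append((post, y))

num = den = 0.0
for obs in by_site.values():
    post_bar = mean(p for p, _ in obs)
    y_bar = mean(y for _, y in obs)
    for p, y in obs:
        num += (p - post_bar) * (y - y_bar)
        den += (p - post_bar) ** 2

fe_estimate = num / den

print(f"naive pooled difference: {naive:+.3f}")
print(f"fixed-effects estimate:  {fe_estimate:+.3f}")
```

In this toy setup the pooled difference is slightly negative, because the high-poaching site B drops out of the sample after 2007 and the low-poaching site C enters. The within-site estimate equals the jump that was built in, and sites observed on only one side of 2008 contribute nothing to it.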

We then discuss Dr. Underwood and Dr. Burn’s concerns about our causal inference. While we are more sympathetic to their concerns here, we a) review the notable lengths to which we went to look for reasonable alternative hypotheses for the increase in poaching; b) examine some of Dr. Underwood and Dr. Burn’s specific alternative hypotheses; c) present an argument for inferring causality in this context; and d) document that trend analyses less complete than ours have been used by CITES to infer that the one-time sale had no effect on poaching in the past, suggesting that our paper presents at least as valuable a contribution to the policy process as these prior analyses. We then try to understand why the prior analysis of PIKE by Burn et al. 2011 failed to detect the discontinuity that we uncovered in our study. 

Finally, we conclude by discussing how greater data and analysis transparency in conservation science would make resolving debates such as this one easier in the future. We also invite researchers to participate in a webinar where we will field further questions about this analysis, hopefully clarifying remaining concerns. 

Overall, while our analysis is by no means the last word on how legal trade in ivory affects elephant poaching, we assert that our approach and analysis are valid, and that our transparency makes it possible to fully understand the strengths and limitations of our research.

Monday, June 27, 2016

Conflict in a changing climate: Adaptation, Projection, and Adaptive Projections (Guest post by Tamma Carleton)

Thanks in large part to the authors of G-FEED, our knowledge of the link between climatic variables and rates of crime and conflict is extensive (e.g. see here and here). The wide-ranging effects of high temperatures on both interpersonal crimes and large-scale intergroup conflict have been carefully documented, and we’ve seen that precipitation shortfalls or surpluses can upend social stability in locations heavily dependent on agriculture. Variation in the El Niño cycle accounts for a significant share of global civil conflict, and typhoons lead to higher risks of property crime.

Much of the research in this area is motivated by the threat of anthropogenic climate change and the desire to incorporate empirically derived social damages of climate into policy tools, like integrated assessment models. Papers either explicitly make projections of conflict under various warming scenarios (e.g. Burke et al., 2009), or simply refer to future climate change as an impetus for study. While it’s valuable from a risk management perspective to understand and predict the effects that short-run climate variation may have on crime and conflict outcomes today, magnitudes of predicted climate changes and the lack of clear progress in climate policy generally place future projections at the heart of this body of work.

However, projections are also where much of the criticism of climate-conflict research lies. For example, see Frequently Heard Criticisms #1 and #6 in this post by Marshall. These criticisms are often reasonable, and important (unlike some other critiques you can read about in that same post). The degree to which impacts identified off of short-run climate variation can effectively predict future effects of long-run gradual climate change involves a lot of inherent (statistical and climatological) uncertainty and depends critically on likely rates of adaptation. Because the literature on adaptation is minimal and relies mostly on cross-sectional comparisons, we are limited in our ability to integrate findings into climate policy, which is often the motive for conducting research in the first place. 

Recently, Sol, Marshall and I published (yet another) review of the climate and conflict literature, this time with an emphasis on adaptation and projection, to try to address the concerns discussed above. I’m going to skip the content you all know from their detailed and impressive previous reviews, and talk about the new points we chose to focus on.

Thursday, June 9, 2016

The Planetary Geometry of Extreme Climate Migration


Adam Sobel and I have a new paper on Potentially Extreme Population Displacement and Concentration in the Tropics Under Non-Extreme Warming.

The paper is a theoretical exercise making a simple point: tropical populations that migrate to maintain their temperature will generally have to move much further than their counterparts in middle latitudes. This results from well-established atmospheric physics that aren't focused on by the researchers who usually think about climate change and migration. One reason this result is interesting is that, in a sense, the physics of the tropical atmosphere "amplify" the impact of warming (at least in this dimension) so that even small climate changes can produce really large population displacements in the tropics (theoretically). Here's the story...