Wednesday, September 17, 2014

An open letter to you climate people

Dear Climate People (yes, I mean you, IPCC WG1 types):

I am a lowly social scientist. An economist, to be precise. I am the type of person who is greatly interested in projecting the impacts of climate change on human and natural systems. My friends and I are pretty darn good at figuring out how human and natural systems have responded to observed changes in weather and climate. We use fancy statistics and spend tons of time and effort collecting good data on observed weather/climate and the outcomes of interest. Crime? Got it. Yields for any crop you can think of? Got your back. Labor productivity? Please. Try harder.

But you know what's a huge pain in the neck for all of us? Trying to get climate model output in a format that is usable by someone without at least an undergraduate degree in computer science. While you make a big deal out of having all of your climate model output in a public repository, we (yes, the lowly social scientists) do not have the skills to read your terabytes and terabytes of netCDF files into our MacBooks and put them in a format we can use.
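
For the record, here is roughly what step one looks like in Python with the xarray library. This is a sketch, not our actual pipeline; the file name and coordinates are made up, though CMIP5 files do follow this naming pattern:

    import xarray as xr

    # Hypothetical file; CMIP5 archives name files as
    # <variable>_day_<model>_<experiment>_<ensemble>_<dates>.nc
    ds = xr.open_dataset("tasmax_day_CCSM4_rcp85_r1i1p1_20060101-21001231.nc")

    # CMIP5 stores temperature in Kelvin, on a 0-360 longitude grid.
    tasmax_c = ds["tasmax"] - 273.15

    # Daily series at the grid cell nearest a (made-up) point of interest.
    point = tasmax_c.sel(lat=37.9, lon=237.7, method="nearest")
    point.rename("tasmax_degC").to_dataframe().to_csv("tasmax_one_cell.csv")

Two lines of actual work, but only once you know that xarray exists, that the variable is called "tasmax", and that the temperatures are in Kelvin. That is exactly the knowledge gap I am complaining about.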

What do I mean by that? The vast majority of us use daily data on Tmin, Tmax, and precipitation at the surface. That's it. We don't really care what's going on high in the sky. If we get fancy, we use wet-bulb temperature and cloud cover. But that's really pushing it. For a current project I am trying to get county-level climate model output for the CMIP5 models. All of them. For all 3007 US counties. This should not be hard. But it is. My RA finally got the CMIP5 output from a Swiss server and translated it into a format we can use (yes, ASCII; laugh if you wish, but the "A" in ASCII stands for awesome). We are now slicing and dicing these data into the spatial units we can use. We had to buy a new computer and bang our heads against the wall for weeks.
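
For the curious, the slicing and dicing boils down to something like the sketch below: match each county to its nearest grid cell and dump the daily series to a flat file. This is Python with xarray and pandas; the file names are made up, nearest-centroid matching is the crude version of the job (area- or population-weighted averages are the fancier alternatives), and open_mfdataset assumes you have dask installed:

    import pandas as pd
    import xarray as xr

    # Hypothetical inputs: a CSV of county FIPS codes and centroids
    # (longitudes on 0-360 to match the model grid) and one model's
    # daily precipitation files.
    counties = pd.read_csv("county_centroids.csv")  # columns: fips, lat, lon
    ds = xr.open_mfdataset("pr_day_CCSM4_rcp85_r1i1p1_*.nc", combine="by_coords")

    # Vectorized nearest-grid-cell lookup: one daily series per county,
    # converted from kg m-2 s-1 to mm/day.
    lats = xr.DataArray(counties["lat"].values, dims="county")
    lons = xr.DataArray(counties["lon"].values, dims="county")
    pr = ds["pr"].sel(lat=lats, lon=lons, method="nearest") * 86400.0

    out = pr.rename("pr_mm").to_dataframe().reset_index()
    out["fips"] = counties["fips"].values[out["county"].to_numpy()]
    out.to_csv("pr_daily_by_county.csv", index=False)  # ASCII, as promised

Now multiply that by every variable, every model, and every scenario, and you can see where the weeks of head-banging went.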

If you want more people working on impacts on human and natural systems, we need to make climate model output available to them at the spatial and temporal resolution they need. For the previous generation of climate model output there was such a tool, which was imperfect but better than what we have now. I got a preview of the update to this tool, but it chokes on larger requests.

Here's what I'm thinking, IPCC WG1: let's create a data repository that makes climate model output available to WG2 types like me in formats we can understand. I bet you the impacts literature would grow much more rapidly. The most recent AR5 points out that the largest gaps in our understanding are in human systems. I am not surprised. If this human system (me) has trouble getting climate data into a useful format, I worry about the folks doing good work who are even more computationally challenged than I am.

Call me sometime. I am happy to help you figure out what we could do. It could be amazing!

Your friend (really), Max.

Monday, September 8, 2014

Can we measure being a good scientific citizen?

This is a bit trivial, but I was recently on travel, and I often ponder a couple of things when traveling. One is how to use my work time more efficiently, or, more specifically, what fraction of requests to say yes to, and which ones to choose. It's a question I know a lot of other scientists ask themselves, and it's a moving target as the number of requests for talks, reviews, etc. changes over time.

The other thing is that I usually get a rare chance to sit and watch Sportscenter, and I'm continually amazed by how many statistics are now used to discuss sports. Like “so-and-so has a 56% completion percentage when rolling left on 2nd down” or “she’s won 42% of points on her second serve when playing at night on points that last less than 8 strokes, and when someone in the crowd sneezes after the 2nd stroke.” Ok, I might be exaggerating a little, but not by much.

So it gets me wondering why scientists haven't been more proactive in using numbers to measure our perpetual time-management issues. Take reviews for journals as an example. It would seem fairly simple for journals to report how many reviews different people perform each year, even without revealing who reviewed which papers. I'm pretty sure this doesn't exist, but I could be wrong. (The closest thing I've seen is that Nature sends an email at the end of each year saying something like "thanks for your service to our journal family, you have reviewed 8 papers for us this year.") It would seem that comparing the number of reviews you perform to the number of reviews your own submissions receive from others (also something journals could easily report) would be a good measure of whether each person is doing their part.

Or more likely you'd want to share the load with your co-authors, but also account for the fact that a single paper usually requires about 3 reviewers. So we can make a simple "science citizen index" or "scindex" that would be:
SCINDEX = A / (B x C/D) = A x D / (B x C)
where
A = # of reviews performed
B = # of your submissions that get reviewed (even if the paper ends up rejected)
C = average number of reviews needed per submission (assume = 3)
D = average number of authors per your submitted papers

Note that to keep it simple, none of this counts time spent as an editor of a journal. And it doesn't adjust for being junior or senior, even though you could argue junior people should do fewer reviews and make up for it when they are senior. And I'm sure some would complain that measuring this would incentivize people to agree to reviews and then do lousy ones. (Of course that never happens now.) Anyhow, if this number is equal to 1 then you are pulling your own weight. If it's more than 1 you are probably not rejecting enough requests.

So now I'm curious how I stack up. Luckily I have a folder where I save all reviews and can look at the number saved in a given year. Let's take 2013. Apparently I wrote 27 reviews, not counting proposal- or assessment-related reviews. Google Scholar can quickly tell me how many papers I was an author on that year (14), and I can calculate the average number of authors per paper (4.2). Let's also assume that a few of those papers were first rejected after review elsewhere (I don't remember precisely, but that's an educated guess), so that my total submissions were 17. That makes my scindex 27 x 4.2 / (17 x 3) = 2.2. For 2012 it was 3.3.
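
For anyone who wants to check my arithmetic, the whole calculation fits in a few lines of Python (the variable names mirror A through D above):

    def scindex(a_reviews, b_submissions, c_reviews_per_sub=3.0, d_avg_authors=1.0):
        # Reviews performed relative to your per-author share of the
        # reviews your own submissions consumed: A x D / (B x C).
        return a_reviews * d_avg_authors / (b_submissions * c_reviews_per_sub)

    # My 2013 numbers: A = 27, B = 17, C = 3, D = 4.2.
    print(round(scindex(27, 17, 3.0, 4.2), 1))  # 2.2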

Holy cow! I’m doing 2-3 times as many reviews as I should be reasonably expected to. And here I was thinking I had a reasonably good balance of accepting/rejecting requests. It also means that there must be lots of people out there who are not pulling their weight (I’m looking at you, Sol).

It would be nice if the standard citation reports added something like a scindex to the h-index and other standards. Not because I expect scientists to be rewarded for being good citizens, though that would be nice, or because it would expose the moochers, but because it would help us make more rational decisions about how much of this thankless work to take on. Or maybe my logic is completely off here. If so, let me know. I'll blame it on being tired from too much travel and doing too many reviews!