Thursday, May 26, 2011

What is the best way to measure academic outputs that aren't publications?

My institute is going through various reviews of staff performance and, frankly, I'm feeling rather vulnerable given my somewhat unorthodox (at least amongst my colleagues) approach to doing science. I spend way more time writing code, building databases and web sites, and blogging than writing papers and getting grants (although I have been known to do both).

So the issue becomes how to demonstrate that coding, building websites, and ranting on my blog is a worthwhile thing to do. Now, I'm happy that what I do has value, but my happiness isn't the issue. The issue is convincing people who want to see papers in high impact journals and bums on seats in labs that there are other ways to generate scientific output, and that this output can have value. I'm also concerned that a simplistic view of what constitutes valid outputs will stifle innovation, just at the time when traditional science publishing is undergoing a revolution.

So, I posted a question on Quora, "What is the best way to measure academic outputs that aren't publications?", where I wrote:
Usually we assess the quality of academic output using measures based on citations, either directly (how many papers have cited the paper?) or indirectly (is the paper published in a journal like Nature or Science that contains papers that on average get lots of citations, i.e. "impact factor"). But what of other outputs, such as web sites, databases, and software? These outputs often require considerable work, and can be widely used. What is the best way to measure those outputs?

There have been various approaches to measuring the impact of an article other than using citations, such as the number of article downloads, or the number of times an article has been bookmarked on a site such as Mendeley or CiteULike. But what of the coding, the database development, the web sites, and the blog posts? How can I show that these have value?

I guess there are two things here. One is the need to be able to compare across outputs, which is tricky (comparing citations across different disciplines is already hard). The other is the need to be able to compare within broadly similar outputs. Here are some quick thoughts:

Web sites
An obvious approach is to use Google Analytics to harvest information about page views and visitor numbers. The geographic origin of those visitors could be used to make a case for whether the research/data on that site is internationally relevant, although I think "internationally relevant" is a somewhat suspect notion. Most academic specialities are narrow, such that the person most interested in your research is likely living in a different country, hence almost by definition most research will be internationally "relevant".

The advantage of Google Analytics is that it is widely used, hence you could get comparative data and be able to show that your web site is more (or less) used than another site.

Code
The value of code is tricky to measure, but tools like ohloh provide estimates of the effort and expense required to generate code for a project. For example, for my bioGUID code repository (which includes code for bioGUID and BioStor, as well as some third party code) ohloh's estimated cost is 87 person-years and $US 4,784,203. OK, silly numbers, but at least I can compare these with other projects (Drupal, for example, represents 153 person-years and $US 8,438,417 of investment).
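As a sanity check on numbers like these, here is a minimal Python sketch. It assumes the estimate comes from something like the basic COCOMO model (person-months = 2.4 × KLOC^1.05) and a flat salary of $US 55,000 per person-year. Both are assumptions on my part, not something ohloh is quoted as saying here, and the function names are made up for illustration. The only real data are the two pairs of figures quoted above, which do at least imply a very similar cost per person-year.

```python
# Sketch of a COCOMO-style cost estimate. The model and salary are ASSUMPTIONS,
# not a confirmed description of how ohloh computes its figures.
COCOMO_A = 2.4             # assumed basic COCOMO coefficient ("organic" projects)
COCOMO_B = 1.05            # assumed basic COCOMO exponent ("organic" projects)
SALARY_PER_YEAR = 55_000   # assumed average salary, $US per person-year


def person_years(lines_of_code: int) -> float:
    """Estimated effort in person-years for a code base of the given size."""
    kloc = lines_of_code / 1000
    person_months = COCOMO_A * kloc ** COCOMO_B
    return person_months / 12


def estimated_cost(lines_of_code: int) -> float:
    """Estimated cost in $US, using the assumed flat salary."""
    return person_years(lines_of_code) * SALARY_PER_YEAR


def implied_salary(cost_usd: float, years: float) -> float:
    """Cost per person-year implied by a quoted (effort, cost) pair."""
    return cost_usd / years


# (person-years, $US) pairs as quoted in the post
QUOTED = {"bioGUID": (87, 4_784_203), "Drupal": (153, 8_438_417)}

if __name__ == "__main__":
    for name, (years, cost) in QUOTED.items():
        print(f"{name}: about ${implied_salary(cost, years):,.0f} per person-year")
```

The point is not that the dollar figure means much, only that the same formula is applied to every project, which is what makes the comparison possible.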

Comparing across output categories will be challenging, especially as there is no obvious equivalent for citation (one reason why, if you develop software or a web site, it makes good sense to write a paper describing it; that worked for me). But perhaps download or article access statistics could provide a way to say "my web site is worth x publications". Note also that I'm not arguing that any of these measures is actually a good thing, just that if I'm going to be measured, and I have some say in how I'm measured, I'd like to suggest something sensible that others might actually buy.
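To make the "worth x publications" idea concrete, here is one very rough way it could be computed, sketched in Python. Everything here is hypothetical: the function name, the choice of the median as a "typical" article, and the numbers in the usage example are all invented for illustration, and I'm not claiming this is a sound metric.

```python
import statistics


def publication_equivalents(site_views: int, article_views: list[int]) -> float:
    """Express a web site's usage as a multiple of a 'typical' article's usage.

    site_views    -- page views for the site over some period (hypothetical input)
    article_views -- access counts for a set of comparison articles over the
                     same period (hypothetical input)
    """
    if not article_views:
        raise ValueError("need at least one article to compare against")
    typical = statistics.median(article_views)  # median resists one blockbuster paper
    if typical <= 0:
        raise ValueError("typical article access count must be positive")
    return site_views / typical


if __name__ == "__main__":
    # Invented numbers, purely for illustration.
    x = publication_equivalents(12_000, [200, 400, 1_000])
    print(f"my web site is worth {x:g} publications")
```

The obvious weakness is that a page view and an article download are not the same kind of event, so any such ratio is only as convincing as the comparison set behind it.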

So, please feel free to comment either here or on Quora. I need to put together some notes to make the case that people like me aren't just sitting drinking coffee, playing loud music, and tweeting without, you know, actually making stuff.