Friday, May 08, 2015

Putting some bite into the Bouchout Declaration

There are no requirements for signing up. A signature is first and foremost a statement of support for open data . Each signatory can determine how best to make progress towards the goal. Some recommendations are included in the declaration. We hope that signatories will become early adopters of the open access approach, that they will promote change in their institutions, societies and journals, and will position themselves and their institutions as leaders. (from http://www.bouchoutdeclaration.org/faqs/)
M13QFZi4I've put off writing this post about the Bouchout Declaration for a number of reasons. I attended the meeting that launched the declaration last year, and from my perspective that was a frustrating meeting. Much talk about "Open Biodiversity Knowledge Management" with nobody seemingly willing or able to define it (see The vision thing - it's all about the links for some comments I made before attending the meeting), and as much as the signing of the Boechout Declaration provided good theatre, it struck me as essentially an empty gesture. Public pronouncements are all well and good, but are ultimately of little value unless backed up by action. We have institutions that have signed the declaration yet have much of their intellectual output locked behind paywalls (e.g., JSTOR Global Plants). So much for being open.

So, since Donat challenged me, here's what I'd like to see happen. I'd like to see metrics of "openness" that we can use to evaluate just how open the signatories actually are. These metrics could be viewed as ways to try and persuade institutions into sharing data and other information, as a league table we can use to apply pressure, or as a way to survey the field and see what the impediments are to being open (are they financial, legal, cultural, resource, etc.).

Below are some of the things I think we could "score" the openness of biodiversity institutions.

Is the collection digitised and in GBIF?

Simple criterion that is easy to measure. If an institution has specimens or other biological material, is data and or metadata on the collection freely available? What fraction of the collection has been digitised? How good is that digitsation (e.g., what fraction has been georeferenced?). We could define digitisation more broadly to include imaging and sequencing (both are methods of converting analogue specimens into digital objects).

Are the institutional publications digitised? Are they open access?

Some institutions have a history of digitising their in-house publications and making them freely available online (e.g., the AMNH), some even make them fully citable with CrossRef DOIs (e.g., the Australian Museum). But some institutions have, sadly, signed over their publications to commercial publishers or archives that charge for access (e.g., Kew's publications have been digitised by JSTOR, which limits their accessibility). As a foot note, I suspect that those institutions that lost confidence in their in-house publishing operations and outsourced them are the ones who have ended up loosing control of their intellectual output, some of which is now closed off (e.g., some of the NHM London's journals are now the property of Cambridge University Press). Those institutions that maintained a culture of in-house publishing are the ones at the vanguard of digitising and opening up those publications.

Does the institution take part on the Biodiversity Heritage Library?

There are at least two ways to participate in the Biodiversity Heritage Library (BHL), one is by becoming a member and start scanning books from institutional libraries. The other is by granting permission to BHL to scan institutional publications. BHL is often viewed as an archive of "old" literature, but in fact it has some very recent content. Some farsighted organisations have let BHL scan their journals, contributing to BHL becoming an indispensable resource for biodiversity research.

Do institution staff publish in open access journals?

A while ago I complained about how few new species descriptions were in open access journals (The top-ten new species described in 2010 and the failure of taxonomy to embrace Open Access publication). A measure of openness is whether an institution encourages its staff to publish their work in open access journals, and to make their data freely available as well. Some prefer to chase Nature and Science papers, but I'd like to think we could prioritise openness over journal impact factor.

These are just some of the more obvious things that could be used to measure openness. At the same time, it would be useful to develop ways to show the benefits of being open. For example, I've long argued that we could develop citation tracking for specimens. This gives researchers a means to track provenance of information (who said what about the identity of a specimen), and it also gives institutions a way to measure the impact of their collections. Doing this at scale is only going to be possible if collections are digitised, specimens have identifiers of some sort, and we can text mine the literature and associated data for those identifiers (in other words, the data and publications need to be open). So, perhaps on way to help make the case for being open is to develop metrics that are useful for the institutions themselves.

I guess I would have been much more enthusiastic about the Bouchout Declaration if these sort of things had been in place at the start. Anyone can sign a document. Ideas are cheap, execution is everything.