Thursday, May 29, 2008

When DOIs collide and then disappear: when is a unique, resolvable identifier a bad idea?


As much as I like the idea of a globally unique, resolvable identifier, my recent experience with JSTOR is making me wonder.

JSTOR has three identifiers for the articles it archives: DOIs, SICIs, and stable URLs (the latter being introduced with the new platform released April 4, 2008). Previously JSTOR would publish DOIs for many of its articles. However, not all of these work, and many are now embedded in the HTML (say, in Dublin Core meta elements) but not publicly displayed.

I suspect the issue is the moving wall:
Journals in JSTOR have "moving walls" that define the time lag between the most current issue published and the content available in JSTOR. The majority of journals in the archive have moving walls of between 3 and 5 years, but publishers may elect walls anywhere from zero to 10 years.
Now, imagine that a publisher has an article on its web site, complete with a DOI, and that article is then added to JSTOR but is still displayed on the publisher's site.


To make this concrete, consider the article by Baum et al. On the InformaWorld site this is displayed with doi:10.1080/106351598260879. The same article is also in JSTOR, with the URL http://www.jstor.org/pss/2585367. No DOI is displayed on the page, but if you look at the HTML source, you find:
<meta name="dc.Identifier" scheme="doi" content="10.2307/2585367">. The DOI prefix 10.2307 is used for all JSTOR DOIs, and some for Systematic Biology still work, e.g. 10.2307/2413524.
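
For anyone wanting to pull these hidden DOIs out of a page, here is a minimal sketch using only Python's standard library. It assumes the page marks DOIs exactly as in the meta element above (name "dc.Identifier", scheme "doi"); the class and function names are my own, and I have not checked how consistently JSTOR uses this markup.

```python
from html.parser import HTMLParser


class DoiMetaParser(HTMLParser):
    """Collect DOIs from <meta name="dc.Identifier" scheme="doi" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.dois = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attributes = {name: (value or "") for name, value in attrs}
        is_dc_identifier = attributes.get("name", "").lower() == "dc.identifier"
        is_doi_scheme = attributes.get("scheme", "").lower() == "doi"
        content = attributes.get("content", "").strip()
        if is_dc_identifier and is_doi_scheme and content:
            self.dois.append(content)


def extract_dois(html):
    """Return every DOI found in Dublin Core meta elements of an HTML string."""
    parser = DoiMetaParser()
    parser.feed(html)
    parser.close()
    return parser.dois


# The meta element quoted above, wrapped in a minimal page.
sample = '<html><head><meta name="dc.Identifier" scheme="doi" content="10.2307/2585367"></head></html>'
print(extract_dois(sample))
```

Note that this only recovers the identifier; it says nothing about whether the DOI still resolves.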

Now, what happens when the JSTOR moving wall overlaps with a publisher's material? What happens if a publisher digitises back issues, then assigns them DOIs? Do the JSTOR DOIs then die (as some of them seem to have already done)? And what happens to the poor sap like me, who has been linking to JSTOR DOIs in the naive belief that DOIs don't die?

Suddenly separating identity from resolution is starting to look very attractive...
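
To illustrate what I mean, here is a rough sketch in Python. The idea is that my own records hold a local identifier for the article, and the DOIs and URLs are merely a list of places to try. Everything here is hypothetical: the ArticleRecord class, the local identifier "article-0001", and the is_live check, which stands in for whatever test of resolvability one actually trusts (an HTTP request, say).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ArticleRecord:
    """An article known by a local identifier, plus locators that may come and go."""
    local_id: str  # hypothetical identifier that I control
    locators: List[str] = field(default_factory=list)


def first_live_locator(record: ArticleRecord,
                       is_live: Callable[[str], bool]) -> Optional[str]:
    """Return the first locator that the supplied check reports as working."""
    for url in record.locators:
        if is_live(url):
            return url
    return None  # every known locator has died; the identity itself survives


# The Baum et al. article, with the three locators mentioned in this post.
baum = ArticleRecord(
    local_id="article-0001",
    locators=[
        "https://doi.org/10.2307/2585367",
        "https://doi.org/10.1080/106351598260879",
        "http://www.jstor.org/pss/2585367",
    ],
)

# Pretend the JSTOR DOI has died; a real check would go out to the network.
dead = {"https://doi.org/10.2307/2585367"}
print(first_live_locator(baum, lambda url: url not in dead))
```

If a DOI dies, I lose one entry in a list rather than my only handle on the article.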
