Friday, September 30, 2011

Taylor and Francis Online breaks DOIs - lots of DOIs

DOIs are meant to be the gold standard in bibliographic identifiers for articles. They are not supposed to break. Yet some publishers seem to struggle to get them to work. In the past I've grumbled about BioOne, Wiley, and others as culprits with broken, duplicate, or disappearing DOIs.

Today's source of frustration is Taylor and Francis Online. T&F Online is powered by Atypon, which recently issued this glowing press release:

SANTA CLARA, Calif.—20 September 2011—Atypon®, a leading provider of software to the professional and scholarly publishing industry, today announced that its Literatum™ software is powering the new Taylor & Francis Online platform (www.TandFOnline.com). Taylor & Francis Online hosts 1.7 million articles.
...
"The performance of Taylor & Francis Online has been excellent," said Matthew Jay, Chief Technology Officer for the Taylor & Francis Group. "Atypon has proven that it can deliver on schedule and achieve tremendous scale. We're thrilled to expand the scope of our relationship to include new products and developments."

Great, except that lots of T&F DOIs are broken. I've come across two kinds of fail.

DOI resolves to a server that doesn't exist
The first is where a DOI resolves to a phantom web address. For example, the DOI doi:10.1080/00288300809509849 resolves to http://tandfprod.literatumonline.com/doi/abs/10.1080/00288300809509849. But the domain tandfprod.literatumonline.com doesn't exist, so the DOI is a dead end.

DOI doesn't resolve
Taylor and Francis have digitised the complete Annals and Magazine of Natural History, a massive journal comprising nearly 20,000 articles from 1841 to 1966. It published some seminal papers, including A. R. Wallace's "On the law which has regulated the introduction of new species" (doi:10.1080/037454809495509), which forced Darwin's hand (see the Wikipedia page for the successor journal, Journal of Natural History). Taylor and Francis are to be congratulated for putting such a great resource online.

Problem is, I've not found a single DOI for any article in Annals and Magazine of Natural History that actually works. If you try to resolve the DOI for Wallace's paper, doi:10.1080/037454809495509, you get the dreaded "Error - DOI not found" web page. So something like 20,000 DOIs simply don't work. The only way to make the DOI work is to append it to "http://www.tandfonline.com/doi/abs/", e.g. http://www.tandfonline.com/doi/abs/10.1080/037454809495509. This gets us to the article, but rather defeats the purpose of DOIs.
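
For anyone who needs the workaround in a script, here is a minimal sketch in Python. The function names are my own invention, and the only thing it relies on is the URL pattern described above (DOI appended to "http://www.tandfonline.com/doi/abs/"); I'm assuming that pattern holds for other T&F articles, which I haven't verified beyond the examples here.

```python
# Workaround sketch: build a direct T&F Online link from a DOI.
# Assumption: the "/doi/abs/" + DOI pattern from the post applies.
TANDF_ABS_PREFIX = "http://www.tandfonline.com/doi/abs/"


def normalise_doi(raw):
    """Strip whitespace and a leading 'doi:' label, leaving the bare DOI."""
    doi = raw.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[len("doi:"):].strip()
    return doi


def tandf_fallback_url(raw):
    """Return the publisher URL that bypasses the DOI resolver."""
    return TANDF_ABS_PREFIX + normalise_doi(raw)


print(tandf_fallback_url("doi:10.1080/037454809495509"))
```

Of course, hard-coding a publisher's URL scheme is exactly the kind of fragile link that DOIs were invented to avoid.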

Why?
Something is seriously wrong with CrossRef's quality control. It can't be too hard to screen all domains to see if they actually exist (this would catch the first error). It can't be too hard to take a random sample of DOIs and check that they work, or automatically check DOIs that are reported as missing. In the case of the Annals and Magazine of Natural History, the web page for the Wallace article states that it has been available online since 16 December 2009. That's a long time for a DOI to be dead.
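
To give a sense of how little is involved, here is a rough sketch of both checks in Python. I have no idea how CrossRef stores its data, so the input format (a dictionary mapping each DOI to its registered URL) and all the function names are assumptions of mine. The check that a DOI actually resolves is left as a function the caller supplies, since that depends on how you query the resolver.

```python
import random
import socket
from urllib.parse import urlparse


def host_exists(host):
    """Return True if the host name can be looked up in DNS."""
    try:
        socket.gethostbyname(host)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def dead_hosts(registered, exists=host_exists):
    """Group DOIs by the host of their registered URL and return
    only the hosts that are missing or fail the existence check."""
    by_host = {}
    for doi, url in registered.items():
        host = urlparse(url).hostname or ""
        by_host.setdefault(host, []).append(doi)
    return {h: dois for h, dois in by_host.items() if not h or not exists(h)}


def failing_sample(dois, k, resolves, seed=None):
    """Pick up to k DOIs at random and return those that fail
    the caller-supplied resolves(doi) check."""
    pool = list(dois)
    picked = random.Random(seed).sample(pool, min(k, len(pool)))
    return [doi for doi in picked if not resolves(doi)]
```

Because the check groups DOIs by host first, each domain is looked up once however many DOIs point at it, so one lookup of tandfprod.literatumonline.com would flag every DOI registered against that server.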

There is a wealth of great content that is being made hard to find by some pretty basic screw-ups. So CrossRef, Atypon, and Taylor and Francis, can we please sort this out?