I think these papers represent one view of the future of scientific publishing ("article 2.0"), and I'm flattered that Penev et al. cite my Elsevier challenge work (doi:10.1016/j.websem.2010.03.004, preprint at hdl:10101/npre.2009.3173.1) as one of the sources of inspiration (along with the landmark Shotton et al. "Adventures in Semantic Publishing: Exemplar Semantic Enhancements of a Research Article" doi:10.1371/journal.pcbi.1000361, which I've discussed previously). It is also good to see the TaxPub XML schema used by a publisher, and Scratchpads being a part of the process of publishing taxonomic information.
Deep linking
My initial impression is that there is huge potential here, although I think there is still lots to do. I'm not totally convinced that popups are the way to go (although I've dabbled with them as well), and we need to move beyond simply linking to other sites to a deeper form of integration. A Zookeys article might link to BHL via a taxonomic name, but how about deeper linking? For example, the paper by Brake and von Tschirnhaus (doi:10.3897/zookeys.50.505) contains the following citations:
Biró L (1899) Commensalismus bei Fliegen. Természetrajzi füzetek 22: 198–204.
Kertész K (1899) Verzeichnis einiger, von L. Biró in Neu-Guinea und am Malayischen Archipel gesammelten Dipteren. Természetrajzi füzetek 22: 173–19
Neither reference has any links in the HTML, so the user is under the impression that they aren't available online, but both references have been scanned by BHL. You can see full text for these articles in BioStor (references 52005 and 52004, respectively -- note that the pagination for Biró 1899 is given incorrectly in the paper). This is one area where BHL has a lot to offer publishers, and it would be great to see BHL provide the services publishers need to add these links to their articles.
This integration should go both ways. It's odd that the paper by Brake and von Tschirnhaus contains the LSID used by ZooBank for this paper (urn:lsid:zoobank.org:pub:DABB03F4-A128-43BB-990C-02F25D656B00, see the <self-uri> tag in the XML), but ZooBank doesn't know about the DOI for the paper, hence the ZooBank page for this article has no link to the article itself. It's time to join this stuff together.
What's next?
What I'd really like to see is article XML repurposed as, say, RDF, and used to populate a database so that we can query it. In this way we can start to atomise the article into useful parts, and recombine them in new and interesting ways. Might be something to play with over the summer.
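As a rough illustration of what "article XML repurposed as RDF" could look like at its smallest, here is a sketch in Python that pulls the DOI and the ZooBank LSID out of an article and writes them as N-Triples, which is exactly the link ZooBank is missing. The XML fragment is a simplified, hypothetical stand-in for the real Zookeys file, the function names are mine, and the choice of owl:sameAs to connect the two identifiers is an assumption rather than anything the publisher or ZooBank prescribes.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment in the style of NLM article XML.
# The real file carries far more metadata (and possibly namespaces).
SAMPLE_XML = """
<article>
  <front>
    <article-meta>
      <article-id pub-id-type="doi">10.3897/zookeys.50.505</article-id>
      <self-uri>urn:lsid:zoobank.org:pub:DABB03F4-A128-43BB-990C-02F25D656B00</self-uri>
    </article-meta>
  </front>
</article>
"""

# Assumption: owl:sameAs is used to say the DOI and the LSID name the same paper.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def article_identifiers(xml_text):
    """Return (doi, lsid) found in the article metadata; either may be None."""
    root = ET.fromstring(xml_text)
    doi = root.findtext(".//article-id[@pub-id-type='doi']")
    lsid = None
    for element in root.iter("self-uri"):
        text = (element.text or "").strip()
        if text.startswith("urn:lsid:"):
            lsid = text
            break
    return (doi.strip() if doi else None), lsid


def identifier_triples(xml_text):
    """Emit N-Triples lines linking the article's DOI and its LSID."""
    doi, lsid = article_identifiers(xml_text)
    if not (doi and lsid):
        return []
    doi_uri = "https://doi.org/" + doi
    # Stated in both directions so a query from either side finds the other.
    return [
        "<%s> <%s> <%s> ." % (doi_uri, OWL_SAME_AS, lsid),
        "<%s> <%s> <%s> ." % (lsid, OWL_SAME_AS, doi_uri),
    ]


if __name__ == "__main__":
    for line in identifier_triples(SAMPLE_XML):
        print(line)
```

Loading lines like these into a triple store would be the first step; taxon names, citations and specimens in the article could be added as further triples in the same way.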
On a practical level, I'm somewhat bemused by the variety of XML formats being used by open access publishers. PLoS use version 2.0 of the NLM Journal Archiving and Interchange Tag Suite, and I wrote an XSLT style sheet to transform PLoS articles for viewing on an iPad. TaxPub is based on version 3.0 of the NLM DTD, which breaks quite a bit of my code relating to citations, so I'll have to tweak this to get it to display Zookeys articles correctly. Handling TaxPub itself will also require some additional work. Then there are the BMC journals, which have their own flavour of XML (based on something called the "KETON DTD"). It's all a bit messy. But I guess it'd be no fun if it was too easy...
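One likely cause of the citation breakage (my assumption, not something checked against the actual style sheet) is that the element wrapping each reference changed name between versions: the 2.x tag sets use <citation> or <nlm-citation>, while 3.0 uses <element-citation> or <mixed-citation>. A minimal Python sketch of code that tolerates both might look like this. The two sample fragments are hand-written and hypothetical, and parse_refs is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Wrapper elements seen across NLM 2.x and 3.0 (assumed list, see text above).
CITATION_TAGS = ("citation", "nlm-citation", "element-citation", "mixed-citation")

# Hand-written, hypothetical fragments; real reference lists are richer.
NLM2_SAMPLE = """
<ref-list><ref id="B1"><citation citation-type="journal">
<article-title>Commensalismus bei Fliegen</article-title>
<source>Természetrajzi füzetek</source><year>1899</year><volume>22</volume>
</citation></ref></ref-list>
"""

NLM3_SAMPLE = """
<ref-list><ref id="B1"><mixed-citation publication-type="journal">
<article-title>Commensalismus bei Fliegen</article-title>.
<source>Természetrajzi füzetek</source> <year>1899</year> <volume>22</volume>
</mixed-citation></ref></ref-list>
"""


def parse_refs(xml_text):
    """Return one dict per <ref>, whichever citation wrapper the file uses."""
    root = ET.fromstring(xml_text)
    refs = []
    for ref in root.iter("ref"):
        # Pick the first child that is one of the known citation wrappers.
        cite = next((child for child in ref if child.tag in CITATION_TAGS), None)
        if cite is None:
            continue
        refs.append({
            "id": ref.get("id"),
            "title": cite.findtext(".//article-title"),
            "source": cite.findtext(".//source"),
            "year": cite.findtext(".//year"),
            "volume": cite.findtext(".//volume"),
        })
    return refs


if __name__ == "__main__":
    print(parse_refs(NLM2_SAMPLE))
    print(parse_refs(NLM3_SAMPLE))
```

The same idea carries over to XSLT by matching on a union of the element names rather than on <citation> alone.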