Wednesday, April 10, 2019

Ozymandias: A biodiversity knowledge graph published in PeerJ

My paper "Ozymandias: A biodiversity knowledge graph" has been published in PeerJ: https://doi.org/10.7717/peerj.6739
The paper describes my entry in GBIF's 2018 Ebbe Nielsen Challenge, which you can explore here. I tweeted about its publication yesterday, and got some interesting responses (and lots of retweets, thanks to everyone for those).

Carl Boettiger (@cboettig) asked where the triples were, as did Kingsley Uyi Idehen (@kidehen). Doh! This is one thing I should have done as part of the paper. I've now uploaded the triples to Zenodo; you can find them here: https://doi.org/10.5281/zenodo.2634326.
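As a rough idea of a first look at a dump of triples, here is a minimal Python sketch that tallies how often each predicate occurs. It assumes the triples are serialised as N-Triples (one triple per line), which may not match the actual files in the Zenodo record. The sample lines, the `example.org` identifiers, and the function name `count_predicates` are all hypothetical, and the regular expression only handles simple lines; it is not a full N-Triples parser.

```python
import re
from collections import Counter

# Hypothetical N-Triples lines for illustration; not taken from the Zenodo dump.
SAMPLE_LINES = [
    '<http://example.org/work/1> <http://schema.org/name> "A hypothetical paper" .',
    '<http://example.org/work/1> <http://schema.org/creator> <http://example.org/person/1> .',
    '<http://example.org/person/1> <http://schema.org/name> "A. Author" .',
    '# a comment line',
    '',
]

# Subject is an IRI or blank node, predicate is an IRI, object is whatever
# precedes the final " ." (a deliberately loose pattern, assumed adequate here).
TRIPLE = re.compile(r'^\s*(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$')

def count_predicates(lines):
    """Return a Counter mapping each predicate IRI to its number of triples."""
    counts = Counter()
    for line in lines:
        match = TRIPLE.match(line)
        if match:  # comments and blank lines do not match and are skipped
            counts[match.group(2)] += 1
    return counts

if __name__ == "__main__":
    for predicate, n in count_predicates(SAMPLE_LINES).most_common():
        print(n, predicate)
```

For a real file, an open file handle could be passed in place of `SAMPLE_LINES`, since the function only iterates over lines.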

Donat Agosti (@myrmoteras) complained that my knowledge graph ignored a lot of available information, which is true in the sense that I restricted it to a core of people, publications, taxa, and taxonomic names. The Plazi project that Donat champions extracts, where possible, lots of detail from individual publications, including figures, text blocks corresponding to taxonomic treatments, and in some cases geographic and specimen information. I have included some of this information in Ozymandias, specifically figures for papers where they are available. For example, Figure 10 from the paper "Australian Assassins, Part I: A review of the Assassin Spiders (Araneae, Archaeidae) of mid-eastern Australia":



This figure illustrates Austrarchaea nodosa (Forster, 1956), and Plazi has a treatment of that taxon: http://treatment.plazi.org/id/1072F469192A5BA015A1AA70A36E2C92. This treatment comprises a series of text blocks extracted from the paper, so there is not a great deal I can do with this unless I want to parse the text (e.g., for geographical coordinates and specimen codes). So yes, there is RDF (see http://treatment.plazi.org/GgServer/rdf/1072F469192A5BA015A1AA70A36E2C92) but it adds little to the existing knowledge graph.
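To give a sense of what parsing the text for geographical coordinates would involve, here is a minimal Python sketch that pulls degrees-minutes-seconds coordinates out of a block of text and converts them to decimal degrees. The sample sentence, the coordinate format, and the function names are assumptions made for illustration; real treatment text uses many different coordinate styles, so a pattern like this would only catch some of them.

```python
import re

# Hypothetical treatment text; not copied from a Plazi record.
SAMPLE = "Holotype male, hypothetical locality, 28°13'50\"S, 153°08'10\"E, rainforest."

# Degrees, minutes, seconds, then a hemisphere letter (assumed format).
DMS = re.compile(r"(\d{1,3})°(\d{1,2})'(\d{1,2}(?:\.\d+)?)\"\s*([NSEW])")

def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert one degrees-minutes-seconds reading to signed decimal degrees."""
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # Southern and western hemispheres are negative by convention.
    return -value if hemisphere in "SW" else value

def extract_coordinates(text):
    """Return every coordinate found in the text, in order, as decimal degrees."""
    return [round(dms_to_decimal(*m.groups()), 4) for m in DMS.finditer(text)]

if __name__ == "__main__":
    print(extract_coordinates(SAMPLE))
```

Specimen codes would need a separate pattern, and probably a list of known collection acronyms, because they have no single format.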
To be fair, some treatments in Plazi are a lot richer, for example http://tb.plazi.org/id/A94487F7E15AFFA5FF682EE9FEB45F2C which has references, geographical coordinates, and more. What would be useful is an easy way to explore Plazi, for example if the RDF were dumped into a triple store where we could query it in more detail. I hope to look into this in the coming weeks.