Open taxonomy
View more presentations from Roderic Page
All the presentations will be posted online, along with podcasts of the audio. Meantime, presentations by Dave Remsen and Chris Freeland are already online.
Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed.
ISSN 2051-8188. Written content on this site is licensed under a Creative Commons Attribution 4.0 International license.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX uniprot: <http://purl.uniprot.org/core/>
PREFIX tdwg_tn: <http://rs.tdwg.org/ontology/voc/TaxonName#>
PREFIX tdwg_co: <http://rs.tdwg.org/ontology/voc/Common#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?name ?status ?doi ?date ?thumbnail
WHERE {
?ncbi uniprot:scientificName ?name .
?ncbi rdfs:seeAlso ?dbpedia .
?dbpedia dbpedia-owl:conservationStatus ?status .
?ion tdwg_tn:nameComplete ?name .
?ion tdwg_co:publishedInCitation ?doi .
?doi dcterms:date ?date .
OPTIONAL
{
?dbpedia dbpedia-owl:thumbnail ?thumbnail
}
}
ORDER BY ASC(?status)
It's morning and the coffee hasn't quite kicked in yet, but reading through recent TDWG TAG posts, and mindful of the upcoming meeting in New Orleans (which sadly I won't be attending) I'm seeing a mismatch between the amount of effort being expended on discussions of vocabularies, ontologies, etc. and the concrete results we can point to.
Hence, a challenge:
"What new things have we learnt about biodiversity by converting biodiversity data into RDF?"
I'm not saying we can't learn new things, I'm simply asking what have we learnt so far?
Since around 2006 we have had literally millions of triples in the wild (uBio, ION, Index Fungorum, IPNI, Catalogue of Life, more recently Biodiversity Collections Index, Atlas of Living Australia, World Register of Marine Species, etc.), most of these using the same vocabulary. What new inferences have we made?
Let's make the challenge more concrete. Load all these data sources into a triple store (subchallenge - is this actually possible?). Perhaps add other RDF sources (DBpedia, Bio2RDF, CrossRef). What novel inferences can we make?
I may, of course, simply be in "grumpy old arse" mode, but we have millions of triples in the wild and nothing to show for it. I hope I'm not alone in wondering why...
<?xml version="1.0"?>
<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:tcommon="http://rs.tdwg.org/ontology/voc/Common#"
xmlns:toccurrence="http://rs.tdwg.org/ontology/voc/TaxonOccurrence#"
xmlns:uniprot="http://purl.uniprot.org/core/">
<uniprot:Molecule rdf:about="http://bio2rdf.org/genbank:EU566842">
<dcterms:created>2008-07-06</dcterms:created>
<dcterms:modified>2010-12-23</dcterms:modified>
<dcterms:title>EU566842</dcterms:title>
<dcterms:description>Xenopus borealis voucher MHNG:Herp:2644.64
cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial.</dcterms:description>
<dcterms:subject rdf:resource="http://purl.uniprot.org/taxonomy/8354"/>
<dcterms:relation rdf:parseType="Resource">
<rdf:type rdf:resource="http://rs.tdwg.org/ontology/voc/TaxonOccurrence#TaxonOccurrence"/>
<toccurrence:identifiedToString>Xenopus borealis</toccurrence:identifiedToString>
<toccurrence:decimalLatitude>0.66</toccurrence:decimalLatitude>
<geo:lat>0.66</geo:lat>
<toccurrence:decimalLongitude>37.5</toccurrence:decimalLongitude>
<geo:long>37.5</geo:long>
<toccurrence:verbatimCoordinates>0.66 N 37.5 E</toccurrence:verbatimCoordinates>
<toccurrence:country>Kenya</toccurrence:country>
<dcterms:identifier>MHNG:Herp:2644.64</dcterms:identifier>
</dcterms:relation>
</uniprot:Molecule>
</rdf:RDF>
Today, scholarly publisher sites receive over 2 billion visits per year from users who are unaffiliated with an institution yet convert less than 0.2% into a purchase or subscription. DeepDyve’s service is designed for these ‘unaffiliated users’ who need an easy and affordable access to authoritative information vital to their careers.
Lucas N. Joppa, David L. Roberts, Stuart L. Pimm The population ecology and social behaviour of taxonomists Trends in Ecology & Evolution doi:10.1016/j.tree.2011.07.010
Conventional wisdom is highly prejudiced. It suggests that taxonomists were a formerly more numerous people, are in 'crisis', are becoming endangered and are generally asocial. We consider these hypotheses and reject them to varying degrees.
Camilo Mora, Derek P. Tittensor, Sina Adl, Alastair G. B. Simpson, Boris Worm. How Many Species Are There on Earth and in the Ocean?. PLoS Biol 9(8): e1001127. doi:10.1371/journal.pbio.1001127
Mark J. Costello, Simon Wilson and Brett Houlding. Predicting total global species richness using rates of species description and estimates of taxonomic effort. Syst Biol (2011) doi:10.1093/sysbio/syr080
Their estimates of ~ 10,000 or so bacteria and archaea on the planet are so completely out of touch in my opinion that this calls into question the validity of their method for bacteria and archaea at all.