Thursday, March 24, 2011

TreeBASE meets NCBI, again

Déjà vu is a scary thing. Four years ago I released a mapping between names in TreeBASE and other databases called TBMap (described here: doi:10.1186/1471-2105-8-158). Today I find myself releasing yet another mapping, as part of my NCBI to Wikipedia project. By embedding the mapping in a wiki, it can be edited, so the kinds of problems I encountered with TbMap, recounted here, here, and here. The mapping in and of itself isn't terribly exciting, but it's the starting point for some things I want to do regarding how to visualise the data in TreeBASE.

Because TreeBASE 2 has issued new identifiers for its taxa (see TreeBASE II makes me pull my hair out), and now contains its own mapping to the NCBI taxonomy, as a first pass I've taken their mapping and added it to http://iphylo.org/linkout. I've also added some obvious mappings that TreeBASE has missed. There are a lot more taxa which could be added, but this is a start.

The TreeBASE taxa that have a mapping each get their own page with a URL of the form http://iphylo.org/linkout/<TreeBase taxon identifier>, e.g. http://iphylo.org/linkout/TB2:Tl257333. This page simply gives the name of the taxon in TreeBASE and the corresponding NCBI taxon id. It uses a Semantic Mediawiki template to generate a statement that the TreeBASE and and NCBI taxa are a "close match". If you go to the corresponding page in the wiki for the NCBI taxon (e.g., http://iphylo.org/linkout/Ncbi:448631) you will see any corresponding TreeBASE taxa listed there. If a mapping is erroneous, we simply need to edit the TreeBASE taxon page in the wiki to fix it. Nice and simple.

At the time of writing the initial mapping is still being loaded (this can take a while). I'll update this post when the uploading has finished.