Wednesday, August 20, 2008

NCBI visualisations I - Genbank Timemap

Time for some fun. In between some tedious text mining I've been meaning to explore some visualisations of NCBI. Here's the first, inspired by Jörn Clausen's wonderful Live Earthquake Mashup (thanks to Donat Agosti for telling me about this). What I've done is take all the frog sequences in Genbank that are georeferenced, add the date those Genbank records were created, generate a KML file, and use Nick Rabinowitz's timemap to plot the KML. The result is here:

By dragging the timeline you can see collections of sequences and where the frog samples came from. Clicking on a marker on the Google Map displays a link to the Genbank record. It's all pretty crude, but fun to play with. What I'm toying with is trying to do something like this for new taxa, i.e., a timemap showing where and when new species are described. Sort of a live biodiversity map like the earthquake mashup, albeit not quite so rapidly moving.
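
For anyone wanting to try something similar, the KML generation step can be sketched roughly as follows. This is a minimal Python illustration, not the script used for the map above. The record fields (`accession`, `lat`, `lon`, `created`), the function name `records_to_kml`, the placeholder accession, and the choice of the KML 2.2 namespace are all assumptions made for the example. Each sequence becomes a Placemark with a TimeStamp (so a timeline can position it) and a Point (so the map can).

```python
import xml.etree.ElementTree as ET

# Assumption: KML 2.2 namespace; the original map may have used another version.
KML_NS = "http://www.opengis.net/kml/2.2"


def _tag(name):
    """Return a namespace-qualified tag name for ElementTree."""
    return f"{{{KML_NS}}}{name}"


def records_to_kml(records):
    """Build a KML string with one dated Placemark per georeferenced record.

    `records` is a list of dicts with hypothetical keys:
    accession, lat, lon, created (an ISO date such as "2008-08-20").
    """
    ET.register_namespace("", KML_NS)
    kml = ET.Element(_tag("kml"))
    doc = ET.SubElement(kml, _tag("Document"))
    for rec in records:
        pm = ET.SubElement(doc, _tag("Placemark"))
        ET.SubElement(pm, _tag("name")).text = rec["accession"]
        # Link back to the sequence record so the marker has something to show.
        ET.SubElement(pm, _tag("description")).text = (
            "https://www.ncbi.nlm.nih.gov/nuccore/" + rec["accession"]
        )
        # The record creation date drives the position on the timeline.
        stamp = ET.SubElement(pm, _tag("TimeStamp"))
        ET.SubElement(stamp, _tag("when")).text = rec["created"]
        # KML coordinates are written longitude first, then latitude.
        point = ET.SubElement(pm, _tag("Point"))
        ET.SubElement(point, _tag("coordinates")).text = f"{rec['lon']},{rec['lat']}"
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Placeholder record, not a real Genbank accession.
    sample = [{"accession": "XX000001", "lat": -12.5, "lon": 130.8, "created": "2008-08-20"}]
    print(records_to_kml(sample))
```

The one detail that is easy to get wrong is the coordinate order: KML expects longitude before latitude, the reverse of how most people write it.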


Javier de la Torre said...

Nice. When you talked about a timemap showing where and when new species are described, what exactly do you mean? Is there any place where you can find where a new species has been described? Geolocating the institution? Or would you use the location of the type specimen?

I am working on something similar to your view but based on GBIF data and using Heat Maps. If you like this kind of visualization and want to get ideas, and you don't mind Flash, check

Rod Page said...

The plan would be to use any locations I can get hold of, such as latitudes and longitudes scraped from the publication text, and museum specimen records. It could be as coarse as country-level stuff (geocoding text). At this stage it's about giving people a snapshot of what's going on.

I'll take a look at SpatialKey.