Sunday, August 28, 2011

Tree of Life 0.1 - annotating the NCBI taxonomy

Last week I was at the NSF "Assembling, Visualising and Analysing the Tree of Life" Ideas Lab, run by KnowInnovation.com. It was an interesting experience, essentially a structured week of brainstorming ideas.

One thing I came away with is the feeling that our notions of the "tree of life" are fuzzy, contradictory, and often probably unobtainable. It's tempting to imagine all sorts of wonderful visualisations, and lose sight of building something that is useful. Perhaps it's time instead to think of "Tree of Life version 0.1".

Imagine taking the NCBI taxonomy as a starting point. Yes it's incomplete, and has almost no fossils, but it's freely available and linked to a lot of data. Let's use a Google Maps-like viewer along the lines I explored earlier this year.

Then add annotation "tracks" to the tips. As a first pass these could be taken from the NCBI LinkOut service, such as the NCBI-Wikipedia mapping http://iphylo.org/linkout.

[Image: Ncbi 1]

The NCBI tree is a classification rather than a phylogeny, so we could add greater phylogenetic content by linking to phylogenetic databases, such as TreeBASE and PhyLoTA. Imagine clicking on a node in the NCBI taxonomy and seeing a display of all the phylogenies centred on that node:

[Image: Ncbi 02]
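One way to read "centred on that node" is that the node is the lowest common ancestor of the taxa in a phylogeny. Here is a rough sketch of that idea in Python. The function names, the study names and the tiny child-to-parent fragment are all made up for illustration; the tax_ids are only meant to look like NCBI ones and should be checked against the real taxonomy dump before being trusted.

```python
def lineage(tax_id, parent):
    """Return the path from tax_id up to the root, inclusive."""
    path = [tax_id]
    # Assumption: the root is recorded as its own parent (or has no entry).
    while parent.get(tax_id) is not None and parent[tax_id] != tax_id:
        tax_id = parent[tax_id]
        path.append(tax_id)
    return path


def lowest_common_ancestor(tax_ids, parent):
    """Deepest node that lies on the lineage of every given tax_id."""
    paths = [lineage(t, parent) for t in tax_ids]
    if not paths:
        return None
    shared = set(paths[0]).intersection(*paths[1:])
    # Walking up one lineage, the first shared node is the deepest one.
    for node in paths[0]:
        if node in shared:
            return node
    return None


def trees_centred_on(node, trees, parent):
    """Names of the trees whose tips have `node` as their common ancestor."""
    return sorted(
        name for name, tips in trees.items()
        if lowest_common_ancestor(tips, parent) == node
    )


# Toy child -> parent fragment (illustrative, simplified).
parent = {
    1: 1,
    9604: 1,          # great apes
    207598: 9604,     # African apes
    9605: 207598, 9606: 9605,   # Homo, Homo sapiens
    9596: 207598, 9598: 9596,   # Pan, Pan troglodytes
    9599: 9604, 9601: 9599,     # Pongo, Pongo abelii
}

# Hypothetical phylogenies, each reduced to the tax_ids of its tips.
trees = {
    "study_A": {9606, 9598},
    "study_B": {9606, 9598, 9601},
    "study_C": {9606},
}

print(trees_centred_on(207598, trees, parent))
print(trees_centred_on(9604, trees, parent))
```

A real service would presumably also want trees centred on descendants of the clicked node, which is a small extension of the same lookup.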

Now we have a way to navigate a large tree, view annotations, and display phylogenetic trees. All of this could be done fairly easily. The key is to have services keyed by the NCBI tax_id used to identify nodes on the tree.
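To make "services keyed by the NCBI tax_id" a little more concrete, here is a minimal sketch of an annotation store where every track is just a lookup from tax_id to values. The `TrackRegistry` class, the track names and the annotation values are hypothetical, invented for this example; a real version would sit behind a web service rather than in memory.

```python
from collections import defaultdict


class TrackRegistry:
    """Hypothetical in-memory store of annotation tracks keyed by tax_id."""

    def __init__(self):
        # track name -> {tax_id -> list of annotation values}
        self._tracks = defaultdict(lambda: defaultdict(list))

    def add(self, track, tax_id, value):
        """Attach one annotation value to a node on the named track."""
        self._tracks[track][int(tax_id)].append(value)

    def lookup(self, tax_id):
        """Return every annotation for one node, grouped by track name."""
        tax_id = int(tax_id)
        return {
            name: list(by_id[tax_id])
            for name, by_id in self._tracks.items()
            if tax_id in by_id
        }


registry = TrackRegistry()
# Illustrative values only; 9606 is Homo sapiens, 9598 is Pan troglodytes.
registry.add("wikipedia", 9606, "Homo_sapiens")
registry.add("habitat", 9606, "terrestrial")
registry.add("wikipedia", 9598, "Common_chimpanzee")

print(registry.lookup(9606))
```

Because the only thing a track needs to know about the tree is the tax_id, anyone can add a new track without touching the viewer or the other tracks.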

Among the next steps would be to add additional "tracks", perhaps based on curated links analogous to the wiki-based NCBI-Wikipedia mapping. For example, very basic habitat data (marine or terrestrial) could be added, or geography, or host relationships (could be based in part on the data already in GenBank).

Given that the NCBI tree continues to grow, subsequent versions could be released as the tree changes. Or we could "fork" the NCBI tree and start to refine it based on phylogenetic information, and add taxa that aren't in the genome databases (these taxa will need consistent identifiers so we can map annotations on to them as well). Perhaps we could use something like Git to manage this tree, and to handle the necessary merging of updated versions of the NCBI tree. People could edit the tree, or indeed fork it and come up with their own.
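As a rough sketch of what that merging might involve, suppose each version of the tree is reduced to a child-to-parent map. A three-way merge then compares our fork and the new NCBI release against the version they both started from. This is not how Git itself stores or merges anything, just the same idea applied to a parent map; `merge_parents` and the toy identifiers below are made up for illustration and are not real tax_ids.

```python
def merge_parents(base, ours, theirs):
    """Three-way merge of child -> parent maps.

    Returns (merged, conflicts). A missing key means the taxon is absent
    from that version. Assumes all identifiers are integers.
    """
    merged, conflicts = {}, []
    for tax_id in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(tax_id), ours.get(tax_id), theirs.get(tax_id)
        if o == t:        # both sides agree (including both deleted)
            chosen = o
        elif o == b:      # only upstream changed this node
            chosen = t
        elif t == b:      # only our fork changed this node
            chosen = o
        else:             # both changed it, in different ways
            conflicts.append(tax_id)
            chosen = o    # keep our version, but flag it for review
        if chosen is not None:
            merged[tax_id] = chosen
    return merged, conflicts


# Toy identifiers, not real tax_ids.
base = {1: 1, 2: 1, 3: 2, 4: 2}
ours = {1: 1, 2: 1, 3: 2, 4: 3, 100: 3}    # moved 4 under 3, added taxon 100
theirs = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2}    # moved 3 under 1, added taxon 5

merged, conflicts = merge_parents(base, ours, theirs)
print(merged)
print(conflicts)
```

The sketch ignores the harder cases, such as a merge that leaves a taxon pointing at a parent that no longer exists, or NCBI merging two tax_ids into one; those would need their own rules.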

There are lots of ways to visualise trees (see TreeVis.net for some great examples), but what I'm after is a tool that is useful, that gives us a sense of what we know and what we don't. I suspect that one of the reasons we've struggled with visualising the tree of life is that there are lots of different notions about what it's for. In this case, I want a tool to navigate data about organisms, one that we can easily add annotations to.


Friday, August 26, 2011

I am not a number...I am an "ideator"

As part of the NSF "Assembling, Visualising and Analysing the Tree of Life" Ideas Lab that I took part in earlier this week, I had an assessment of my "problem solving style" carried out using a service called FourSight. I'm hugely sceptical of attempts to classify people (I'm unique, aren't I?), but I took the test and it turns out I am an "Ideator". FourSight's web site defines an Ideator as one who:

  • Likes to look at the big picture
  • Enjoys toying with ideas and possibilities
  • Likes to stretch his or her imagination
  • Enjoys thinking in more global and abstract terms
  • Takes an intuitive approach to innovation
  • May overlook details

Details schmetails, it's the big picture folks!

Ideators are:

  • Playful
  • Imaginative
  • Social
  • Adaptable
  • Flexible
  • Adventurous
  • Independent

Liking this. OK, how do you care for ideators? We need:

  • Room to be playful
  • Constant stimulation
  • Variety and change
  • The big picture

That's right, leave us alone to think our great thoughts. Result! Then there's this totally superfluous category "Ideators annoy others by...".

  • Drawing attention to themselves
  • Being impatient when others don’t get their ideas
  • Offering ideas that are too off-the-wall
  • Being too abstract
  • Not sticking to one idea

Utter, utter, nonsense. Look at my blog, it's full of ideas that have been developed fully... oh, wait. And, maybe the blog thing is a bit attention seeking, and I guess saying "it sucks" is a tad impatient, and saying to a crowd of taxonomists "haven't we basically found every species bigger than my coffee cup?" is a little off-the-wall.

Good job these psychometric thingies are clearly bogus.