The idea is to display a table in a fixed space. As you mouse over a cell, the contents of that cell and the relevant row and column labels become visible. This lets you get an overview of the full table while still being able to see individual items:
It's easier to show than explain. For example, take a look at The amphibian tree of life, or watch this short screencast:
There are some things to fix. Firstly, I group all sequences by NCBI taxon and gene "features". If there's more than one sequence for the same gene and taxon, I just show one of them (an obvious solution is to add a popup menu when there's more than one sequence). Secondly, the gene "names" are extracted from GenBank feature tables, and will include synonyms and duplicates (for example, a sequence may have a gene feature "RAG-1" and a CDS feature "recombination activating protein 1"). I've stored all of these because not every sequence is consistently labelled, so excluding one class of feature may lose all labels from a sequence. At some point it would be useful to cluster gene names (a task for another day).
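To make the grouping concrete, here is a minimal Python sketch of the idea, not the code behind the site. It assumes each sequence has already been reduced to a tuple of accession, NCBI taxon id, and the list of labels pulled from its feature table; the function names, the `SEQ…` accessions, and the taxon ids are all made up for illustration.

```python
from collections import defaultdict


def build_gene_matrix(records):
    """Group sequences into (taxon, gene label) cells.

    `records` is an iterable of (accession, taxon_id, labels) tuples.
    This input shape is an assumption made for the sketch.
    """
    cells = defaultdict(list)
    for accession, taxon_id, labels in records:
        # Keep every label (gene and CDS alike), but count a label only
        # once per sequence even if the feature table repeats it.
        for label in set(labels):
            cells[(taxon_id, label)].append(accession)
    taxa = sorted({taxon for taxon, _ in cells})
    genes = sorted({gene for _, gene in cells})
    return taxa, genes, dict(cells)


def representative(cells, taxon_id, gene):
    """Return the single accession to display in a cell, or None if empty."""
    accessions = cells.get((taxon_id, gene))
    return accessions[0] if accessions else None


def completeness(taxa, genes, cells):
    """Fraction of taxon x gene cells that hold at least one sequence."""
    total = len(taxa) * len(genes)
    return len(cells) / total if total else 0.0


# Hypothetical data: two sequences for the same taxon and gene, and one
# sequence that carries both a gene label and a CDS label.
records = [
    ("SEQ001", 1001, ["RAG-1", "recombination activating protein 1"]),
    ("SEQ002", 1001, ["RAG-1"]),
    ("SEQ003", 1002, ["12S"]),
]
taxa, genes, cells = build_gene_matrix(records)
```

Because synonyms are not clustered, "RAG-1" and "recombination activating protein 1" end up as separate columns, which is exactly the duplication described above. A popup for cells with several sequences would simply list `cells[(taxon_id, gene)]` instead of taking the first entry.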
I like the new gene matrix feature much better than the list. You can see at a glance the level of matrix completion.
Whilst browsing, I came across a funny little typo. In a study of "Evolutionary history of Lake Tanganyika's scale-eating cichlid fishes" (http://iphylo.org/~rpage/challenge/www/uri/3de601628f7a05eeafd47b8adc06de63) there is a sequence of an uncultured bacterium from the feces of an elderly human (AY920092). This is most likely a typo somewhere in the article, where they meant AY930092, which is a sequence from a cichlid (Cheilochromis euchilus).