Thursday, August 02, 2007

Viewing very large trees


One of the striking pictures in Tamara Munzner et al.'s paper "TreeJuxtaposer: Scalable Tree Comparison using Focus+Context with Guaranteed Visibility" (doi:10.1145/882262.882291, also available here) is that of a biologist struggling to visualise a large phylogeny. The figure caption states that:
Biologists faced with inadequate tools for comparing large trees have fallen back on paper, tape, and highlighter pens.

I've been struggling with this problem in the context of displaying trees on a web page (see an earlier post). Viewing large trees has received a lot of attention, and there are some fun tools such as Tamara Munzner's TreeJuxtaposer and Mike Sanderson's Paloverde (doi:10.1093/bioinformatics/btl044), which was used to create the cover for the October 2006 issue of Systematic Biology. And let's not forget Google Earth.



The problem with standalone tools like these is that they are just that - standalone. They are meant to support interactive visualisation in an application, not viewing a tree on a web page. This is a particular problem facing TreeBASE. A user wanting to view, say, the marsupial supertree published by Cardillo et al. (doi:10.1017/S0952836904005539, TreeBASE study S1035) is greeted by the message:
This tree is too large to be seen using the usual GUI. We recommend that you view the tree using the java applet ATV or the program TreeView (see below). Alternatively, you can download the data matrix and view the tree(s) in MacClade, PAUP, or any other nexus-compatible software.
and the tree is displayed as a Newick text string:
(((((((((((Abacetus, ((((Agonum, Glyptolenus), Europhilus, Tanystoma, Platynus), ((Morion, Moriosomus), Stenocrepis)), (((Licinus, Zargus, Badister), (Panagaeus, Tefflus)), Melanchiton)), ((Amara, Zabrus), (Harpalus, Dicheirotrichus, Parophonus, Trichocellus, Ophonus, Trichotichnus, Diachromus, Pseudoophonus, Stenolophus, Notobia, Bradycellus, Nesacinopus, Anisodactylus, Acupalpus, Acinopus, Xestonotus)), ((Anthia, Thermophilum), ((Corsyra, Discoptera), Graphipterus)), (((Apenes, (Chlaenius, Callistus)), Oodes), ((Calophaena, ((Ctenodactyla, Leptotrachelus), Galerita)), (Pseudaptinus, Zuphius))), (((Calleida, Hyboptera), Lebia), Cymindis, Demetrias, Dromius, Lionychus, Microlestes, Syntomus), ((Calybe, Lachnophorus), Odacantha), (Catapiesis, (Desera, Drypta)), Cnemalobus, (Coelostomus, (Eripus, Pelecium)), ...


Not the most compelling visualisation. What I hope to do in this and following posts is describe my own efforts to come to grips with this problem.
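To get from a Newick string like the one above to something that can be drawn, the first step is to parse it. The following is only a minimal sketch, not TreeBASE's or any existing tool's parser: the function names `parse_newick` and `leaf_labels` are hypothetical, and it assumes plain unquoted labels with optional branch lengths (no quoted names or comments). It recurses once per level of nesting, so a very deep tree could hit Python's recursion limit.

```python
def parse_newick(text):
    """Parse a Newick string into nested (label, children) tuples.

    Assumes unquoted labels; branch lengths after ':' are skipped.
    """
    pos = 0

    def skip_ws():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def read_token():
        # Read up to the next structural character and trim whitespace.
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in "(),:;":
            pos += 1
        return text[start:pos].strip()

    def read_node():
        nonlocal pos
        skip_ws()
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            while True:
                children.append(read_node())
                skip_ws()
                if pos >= len(text):
                    raise ValueError("unbalanced parentheses")
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # end of this clade
                    break
                else:
                    raise ValueError(f"unexpected {text[pos]!r} at {pos}")
        label = read_token()  # empty for unnamed internal nodes
        if pos < len(text) and text[pos] == ":":
            pos += 1
            read_token()  # skip the branch length
        return (label, children)

    return read_node()


def leaf_labels(node):
    """Return the leaf labels of a parsed tree in left-to-right order."""
    label, children = node
    if not children:
        return [label]
    labels = []
    for child in children:
        labels.extend(leaf_labels(child))
    return labels
```

For example, `leaf_labels(parse_newick("((Morion, Moriosomus), Stenocrepis);"))` gives the three genus names in order. Counting the leaves this way is a quick check on how big a tree is before deciding how to draw it.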

Requirements
To put the problem into perspective, what I'm looking for is a simple way to draw large trees for display in a web browser. This places severe limits on the kind of interactivity that is possible (unless we go down the route of Java applets, which I will avoid like the plague). This rules out, for example, trying to emulate TreeJuxtaposer's functionality. Initially I started looking at SVG, which renders graphics nicely, supports interaction, and being essentially an XML file, is easy to manipulate (for an example see my earlier post on SVG maps). However, SVG is not well supported in all browsers (Firefox does pretty well, most other browsers are variable). All browsers, however, support bitmap graphics (GIF, PNG, JPEG, etc.). When drawing complex things like trees, bitmaps have some advantages, especially with regard to labelling. Small bitmap fonts tend to be more legible than anti-aliased fonts at the same size (see the article at MiniFonts for background).
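To make the labelling constraint concrete, here is a rough sketch of the arithmetic for a fixed-size bitmap: if each leaf gets one row and a label needs a certain number of pixels of height, only so many labels fit without overlapping. The function name `label_stride` and the pixel figures in the example are assumptions for illustration, not measurements from any real viewer.

```python
def label_stride(n_leaves, image_height_px, font_height_px):
    """Return k such that labelling every k-th leaf avoids overlapping labels.

    Assumes leaves are spread evenly down the full height of the image.
    """
    if n_leaves <= 0 or image_height_px <= 0 or font_height_px <= 0:
        raise ValueError("all arguments must be positive")
    # The most labels that fit when stacked without overlapping.
    max_labels = max(1, image_height_px // font_height_px)
    # Ceiling division: how many leaves each drawn label has to stand for.
    return -(-n_leaves // max_labels)
```

With an 800-pixel-high image and an 8-pixel bitmap font (both hypothetical figures), `label_stride(100, 800, 8)` is 1, so every leaf can be labelled, whereas `label_stride(1000, 800, 8)` is 10, so only every tenth leaf could carry a label.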

Animation
Comments so far on this post have focussed on animation (e.g., using Flash). Here is a video of TreeJuxtaposer taken from Tamara Munzner's web site.

For me the most interesting features of TreeJuxtaposer are that the entire tree is always visible (thus retaining context, unlike pan and zoom), and the user can select bits to view by drawing a rectangle on the screen. The processing to compute the transformations needed for large trees is fairly heavy duty, although newer algorithms have reduced this somewhat (see here).

14 comments:

David Shorthouse said...

This may be out in left field, but have a peek at the kind of thing genealogists have been doing with Flash & AJAX: http://www.geni.com

David Shorthouse said...

Rod: I assume you are also aware of iTOL: http://itol.embl.de/index.shtml where you can upload a Newick tree & immediately view it in Flash. Now, if they could offer a download of the resultant Flash file for embedding on a web page, this would be a cool way to do it.

Roderic Page said...

David, I hadn't seen Geni, but had seen iTOL. I'm not convinced that a "pan and zoom" approach is the best way forward. It's easy to get lost, and there are issues of how well it scales. For example, I loaded a big bird tree into iTOL and it took a long time to render, and the leaf labels were invisible. I want quick rendering (suitable for viewing lots of trees in a database), and I think the rule has to be "don't draw labels unless they are legible". This is one lesson I draw from TreeJuxtaposer, which displays only a subset of the leaf labels at any one time, but you can always read them.

Mike Keesey said...

Whatever the exact implementation, Flash (or Flex) seems like the best technology to use for something like this. It does vector and bitmap, it has many tools for interactivity, and the browser plug-in is the most widely-spread one in existence.

I'm doing a project in Flex which requires drawing trees. So far the data in my test files has all fit fairly well, but soon I am going to run into this same problem and will need to come up with a creative solution....

Roderic Page said...

My own suspicion is that Flash/Flex, etc. may be over-engineering things. To view and interact with really big trees effectively is possibly beyond what these tools can offer. If you read the papers related to TreeJuxtaposer (and its descendants), there are some sophisticated algorithms involved, some of which depend on making full use of the capabilities of recent graphics cards. This is possibly beyond the capabilities of browser plugins, but not standalone programs.

Of course, I'd really love a Flash (or whatever) tool to display and interact with trees, but for now I'm looking for a simple technique that scales well (i.e., speedily displays trees with 1000's of nodes).

Mike Keesey said...

It seems to me that two closely-related limits need to be kept in mind: screen resolution and human perception. Most screens are going to be physically incapable of showing 1000s of nodes, and, even if they were, how many people could take in that amount of information, anyway? Whatever the solution is, it's not going to be so much about displaying all the information at once as allowing clever ways to focus on areas of the trees.

Flash is not capable of taking advantage of graphics cards, but the new virtual machine in Flash Player 9 is pretty fast--fast enough to do halfway decent 3D animation. Perhaps a good solution would involve Papervision3D, an open-source 3D library for Flash. (Check out the "Rhythm of lines" entry on that blog for a beautiful example of what can be done with it.)

Roderic Page said...

Yes, this is the challenge -- small screen, lots of nodes, limited ability to absorb all the information.

Flash clearly has a lot of potential, but I'm not sure how much of the problem is one of animation. However, it would be fun to see whether at least some of TreeJuxtaposer's functionality could be emulated in Flash.

Pierre Lindenbaum said...

Rod,
I don't know much about phylogeny, but wouldn't it be useful to use a hyperbolic tree? Just like http://198.202.68.14/tools/HyperTree.html

Also possibly of interest: my bookmarks about graphs on Connotea

Pierre

Roderic Page said...

Hyperbolic trees are fun, and I've used them before, as have a few others in this field -- see Tim Hughes' pages on the Walrus 3D hyperbolic browser.

However, they can be distracting to navigate around, and in my experience they work best for trees with high degree (that is, each node has lots of descendants). Hence, they work well for trees such as the hierarchy of files and folders on a computer, or taxonomic classification, but are not great for binary trees (which is what most phylogenies aspire to be).

Lastly, for at least one 2D case the technology is protected by a US Patent held, I believe, by Inxight. See US Patent 6901555, for example. Hence, one needs to be very careful using this approach in the US.

Anonymous said...

Hmm.. is that Daniel Huson?

Roderic Page said...

No, Inxight is a Xerox spin-off. Daniel has his own tree viewing software called Dendroscope. It is a Java program and supports viewing large trees. It also includes a rather nice "fish-eye" magnifier to display more detail on selected parts (very like the Dock magnification feature in Mac OS X, one of the few widely used fish-eye effects).

Roderic Page said...

Doh! Simon, did you mean to ask whether Daniel Huson is the mysterious individual in the picture? Not sure, but it looks like him.

Anonymous said...

Don't know if you've seen TaxonTree, our contribution to the large tree visualization problem. You can see one of our implementations at our Lep ATOL site, LepTree.net.

It does provide integrated searching and browsing of a large tree on the web. We haven't worked on it recently but have plans for improvements this fall.

Obviously we disagree with TreeJuxtaposer's philosophy about labels, and the current layout works much better with classifications than with phylogenies.

But there it is anyway.

Roderic Page said...

Cyndy, I'd seen TaxonTree before. As you say, it works better with classifications (as do most large tree viewers that originate in the computer science literature). I'm also not sure what you mean by "browsing of a large tree on the web". It's a Java program that the user has to download and run separately (albeit via Web Start). I guess I'm currently hoping we can avoid going down this route.