iPhylo: Transitive reduction

Quick note to self, having stumbled on the Wikipedia page on transitive reduction. Given a graph like this:

the transitive reduction is:

Note that the original graph has an edge a -> d, but this is absent after the reduction because we can get from a to d via b (or c).

What's the point? Well, it occurs to me that a quick way of harvesting information about existing taxonomies (e.g., if we want to assemble an all embracing classification of life to hep navigate a database) is to make use of the titles of taxonomic papers, e.g., the title Platyprotus phyllosoma, gen. nov., sp. nov., from Enderby Land, Antarctica, an unusual munnopsidid without natatory pereopods (Crustacea: Isopoda: Asellota) gives us:

Crustacea -> Isopoda -> Asellota ->Platyprotus -> phyllosoma

From the paper, and or other sources we can get paths such as Asellota -> Munnopsididae -> Platyprotus and Isopoda -> Munnopsididae. Imagine that we have a set of these paths, and want to assemble a classification (for example, we want to grow the Species 2000 classification, which lacks this isopod). Here's the graph:

This clearly gives us information on the classification of the isopod, but it's not a hierarchy. The transitive reduction, however, is:

It would be fun to explore using this technique to mine taxonomic papers and automate the extraction of classifications, as well as names.

iPhylo

Wednesday, November 28, 2007

Transitive reduction

No comments:

Post a Comment