Mauro Cavalcanti has released e-Species, "a taxonomically intelligent biodiversity search engine" written in Python that mimics much of the functionality of iSpecies. The project is open source, with a SourceForge page, although no files seem to be available yet. This is the second iSpecies clone I've seen, David Shorthouse having written a clone that uses only JSON.
One thing that distinguishes e-Species is its use of Catalogue of Life web services to provide some information on the name. However, it doesn't look like e-Species makes use of synonyms in its searches (i.e., what many refer to as "taxonomic intelligence"). Searching on two alternative names for the sperm whale (Physeter catodon and P. macrocephalus) yields different results, unless the underlying source (such as NCBI) knows that these names are synonyms. Presumably, a taxonomically intelligent search would be able to merge results from searches using different names, and present those together.
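As a rough illustration of what merging across synonyms might look like, here is a minimal Python sketch. It is not how e-Species or iSpecies works: the `SYNONYMS` table, the `fake_search` function, and the result identifiers are all hypothetical stand-ins, and a real synonym list would presumably come from a name service such as the Catalogue of Life.

```python
from typing import Callable, Dict, List

# Hypothetical synonym table (assumption: a real one would be fetched
# from a taxonomic name service rather than hard-coded).
SYNONYMS: Dict[str, List[str]] = {
    "Physeter catodon": ["Physeter macrocephalus"],
    "Physeter macrocephalus": ["Physeter catodon"],
}


def expand_name(name: str, synonyms: Dict[str, List[str]]) -> List[str]:
    """Return the query name followed by its known synonyms, without repeats."""
    names = [name]
    for other in synonyms.get(name, []):
        if other not in names:
            names.append(other)
    return names


def merged_search(
    name: str,
    search: Callable[[str], List[str]],
    synonyms: Dict[str, List[str]],
) -> List[str]:
    """Run `search` once per name and pool the results, dropping duplicates."""
    seen = set()
    merged = []
    for query in expand_name(name, synonyms):
        for item in search(query):
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


def fake_search(query: str) -> List[str]:
    """Stand-in for a real source; the identifiers are invented."""
    hits = {
        "Physeter catodon": ["pub:1", "img:7"],
        "Physeter macrocephalus": ["img:7", "pub:2"],
    }
    return hits.get(query, [])


results = merged_search("Physeter catodon", fake_search, SYNONYMS)
```

With this toy data, searching on either name returns the same set of items, which is the behaviour a taxonomically intelligent search would want. The order still depends on which name was queried first, which is where the ranking question comes in.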
Merging results requires some thought as to how to merge lists from different sources (e.g., merging lists of publications and images). This has been the subject of much study in the context of merging results from different search engines. Some starting points are:
- Rank Aggregation Methods for the Web
- Aggregation of partial rankings, p-ratings and top-m lists
- Tadpole: A Meta search engine
The last link is a student project and is a Microsoft Word document, which I've uploaded to Scribd and embedded below.
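To make the list-merging problem concrete, here is a minimal sketch of one simple rank aggregation scheme, a Borda-style count, in Python. This is just one of many possible approaches and is not necessarily the one any of the references above would recommend. The function name `borda_merge`, the scoring of partial lists (an item absent from a list gets no points from it), and the tie-breaking rule are my own assumptions.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def borda_merge(rankings: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Merge ranked lists with a Borda-style positional score.

    In a list of length n, the top item earns n points, the next n - 1,
    and so on down to 1. Items missing from a list earn nothing from it
    (an assumption; other conventions for partial lists exist).
    Ties are broken by the order in which items were first encountered.
    """
    scores: Dict[Hashable, int] = defaultdict(int)
    first_seen: Dict[Hashable, int] = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - position
            first_seen.setdefault(item, len(first_seen))
    return sorted(scores, key=lambda item: (-scores[item], first_seen[item]))


# Three hypothetical sources ranking the same pool of results.
merged = borda_merge([
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c"],
])
print(merged)  # ['b', 'a', 'c', 'd']
```

Here "b" wins because it is ranked highly by all three sources, while "d" comes last because only one source returns it, and at the bottom of its list. Merging different kinds of result (publications and images, say) would need more thought than this, since their rankings are not directly comparable.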
Dear Rod,
Thanks for your comments on e-Species. Surely, it is just a starting point as a true "intelligent" taxonomic search engine requires much more.
With best regards,
I just made the Python source code for the e-Species engine available for download from the e-Species website itself, as SourceForge is taking too much time to publish the uploaded file and I have time to figure out what is happening.