Some thoughts on the first release of the
Encyclopedia of Life. I am being deliberately critical. This is a high-profile project with tens of millions of dollars in funding and lots of people involved, and it is accompanied by some of the most overblown hype in organismal biology. In a sense I think EOL has set itself up by overpromising and underdelivering.
Before continuing, I should point out that I am involved in EOL in an advisory capacity, but not in actually making anything. Some of the tools I've blogged about have made their way into EOL, such as
Pygmybrowse and reference parsing (see David Shorthouse's
excellent work on this).
Lack of content
I think the first release of EOL should have provided, at a minimum, as much information as I can get from
iSpecies and
Wikipedia. Other projects, such as
Freebase, have pre-populated their databases with content from Wikipedia and other sources. Why didn't EOL? If the argument is that they want authenticated content, then this doesn't wash. Their authenticated content is minimal, and waiting for authentication will, in my view, cripple EOL.
Exemplars are incomplete
The first release contains 25 exemplars. Pages for these taxa
...show the kind of rich environment, with extensive information, to which all the species pages will eventually grow. The information on the exemplar pages has been authenticated (endorsed) by the scientists whose names are listed on these pages.
Well, I hope this isn't the standard EOL aspires to. The pages are incomplete and not interlinked. One of the 25 chosen exemplars is
Anolis carolinensis. EOL lists its distribution as:
Widely-distributed throughout the southeastern United States: North Carolina to Key West, Florida, and west to southest Oklahoma and central Texas.
However, the GBIF map EOL displays shows lots of dots in Hawaii:

The EOL account is silent on this interesting distribution pattern. It will come as no surprise that the
Wikipedia account of the same species tells us that it has been introduced into Hawaii. Wikipedia 1, EOL 0.
Links
If two pages talk about species that are ecologically associated, then surely those pages should be linked? Among the exemplars is
Pissodes strobi, the white pine weevil. In the EOL account, among the hosts listed is
Pinus strobus, another exemplar taxon. The accounts of these two taxa
are not linked. No hyperlink, nothing. The reader has no idea that there is an exemplar account for
Pinus strobus. Furthermore, when reading the account for
Pinus strobus there is no indication that it is host to the white pine weevil.
Surely the point of having all this information in one place is so that it can be linked together?
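A rough sketch of how such cross-links could be generated, assuming nothing about how EOL actually stores its pages. The `pages` dictionary and its account text are made up for illustration, and `find_mentions` and `backlinks` are hypothetical helper names. The idea is simply to scan each account for the names of other taxa that have accounts, then invert the result so the host page also knows which pages mention it.

```python
import re

# Hypothetical page store: taxon name -> account text (illustrative text only).
pages = {
    "Pissodes strobi": "Hosts include Pinus strobus and other pines.",
    "Pinus strobus": "Eastern white pine, a large conifer.",
}

def find_mentions(pages):
    """Return {page name: [names of other pages mentioned in its text]}."""
    mentions = {}
    for name, text in pages.items():
        found = []
        for other in pages:
            # Whole-word match on the full scientific name.
            pattern = r"\b" + re.escape(other) + r"\b"
            if other != name and re.search(pattern, text):
                found.append(other)
        mentions[name] = found
    return mentions

def backlinks(mentions):
    """Invert the mentions map: {page name: [pages that mention it]}."""
    result = {name: [] for name in mentions}
    for source, targets in mentions.items():
        for target in targets:
            result[target].append(source)
    return result

mentions = find_mentions(pages)
incoming = backlinks(mentions)
print(mentions["Pissodes strobi"])  # outgoing links from the weevil page
print(incoming["Pinus strobus"])    # pages that should be linked from the pine page
```

Exact string matching like this would miss synonyms, abbreviations such as "P. strobus", and common names, so a real system would need a name-resolution step in front of it.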
BHL
EOL also exposes some limitations of the
Biodiversity Heritage Library. Consider the exemplar page for
Pinus strobus L. The "L." indicates that this species was described by Linnaeus. Among the many references listed by BHL, none are by Linnaeus. What gives?
Well, the
IPNI record reveals that this species was described on p. 1001 of
Species Plantarum. BHL has digitised
Species Plantarum, and
page 1001 has
Pinus strobus:

Now, BHL relies on uBio's tools to extract names, and Linnaeus didn't make this easy (the specific epithet
strobus is in the right hand margin, separate from
Pinus), but one would have thought that for the exemplar taxa an effort would have been made to link Linnaean names to BHL content -- what better place to showcase the link between a name and its publication? It's quite easy to do, given that
IPNI has page numbers for plant names. Just map page numbers to BHL URLs, and you're done.
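The mapping might look something like the sketch below. Everything in it is an assumption made for illustration: the record layout, the `bhl_pages` index, the function name `link_names_to_bhl`, and the placeholder URL are all hypothetical, and neither IPNI's nor BHL's real data formats are shown. Only the name, the publication and the page number come from the example above.

```python
# Hypothetical index: (publication title, printed page number) -> BHL page URL.
# The URL is a placeholder, not a real BHL address.
bhl_pages = {
    ("Species Plantarum", 1001): "https://example.org/bhl/page/PLACEHOLDER",
}

# Hypothetical IPNI-style records: name plus place of publication.
ipni_records = [
    {"name": "Pinus strobus", "publication": "Species Plantarum", "page": 1001},
    {"name": "Morus nigra", "publication": "Unindexed work", "page": 1},
]

def link_names_to_bhl(records, page_index):
    """Return {name: URL} for every record whose page is in the index."""
    links = {}
    for rec in records:
        key = (rec["publication"], rec["page"])
        url = page_index.get(key)
        if url is not None:  # skip names whose publication page is not indexed
            links[rec["name"]] = url
    return links

links = link_names_to_bhl(ipni_records, bhl_pages)
print(links)
```

The lookup itself is trivial. The work is in building the index, since printed page numbers have to be reconciled with scanned page identifiers, and publication titles are rarely abbreviated the same way in two databases.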
Inconsistency
Going down the taxonomic hierarchy, weird things happen. When viewing the plant genus
Morus I can see a picture of
Morus nigra (presumably this is "authenticated" content). If I drill down to the species
Morus nigra, I'm told there is no authenticated content for this species. Either the image is
Morus nigra or it isn't. If it is, why not show it, if it isn't, why claim that it is?
Logos
Way too much space is devoted to logos of various contributors, BHL being the worst offender (it doesn't help that the BHL content is incomplete, lacking links for Linnaean names).
I don't care about logos. Contributors may care about getting their logos displayed, but users couldn't care less. They get in the way. On some pages, there's more screen space devoted to logos than information (e.g., the page for
Apomys datae). This is, frankly, ridiculous, and reflects a warped set of priorities.
What's worse, all these logos are associated with links that take people
away from EOL. Hence EOL becomes little more than a collection of web links to other sites.
Search
The search is based on the
Catalogue of Life, and inherits the same problems. For example, if I search for "Morus" I get a list in alphabetical order of taxonomic names that contain the string "morus". The two names that are
an exact match occur as items three and four on the list -- they should be first and second.
It gets worse if I search on "Tyrannosaurus rex". EOL doesn't do dinosaurs, and so doesn't contain anything on
T. rex, but the search results tell me that
The following 116 search results contain 'Tyrannosaurus rex'. Nope, none of them do.
The search engine is poorly done: it fails to rank results sensibly, incorrectly reports what it finds, and has no support for spelling mistakes.
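None of this is hard to do better. Here is a minimal sketch, using only Python's standard library, of a search that puts exact matches first, reports only the hits it actually has, and offers suggestions for misspellings. The list of names is made up for illustration (apart from Morus and Morus nigra) and `search` and `suggest` are hypothetical function names. This is not how EOL or the Catalogue of Life implement their search.

```python
import difflib

# Illustrative name list; "Amorus" and "Gomphomorus" are invented examples
# of names that merely contain the string "morus".
names = ["Amorus", "Gomphomorus", "Morus", "Morus nigra"]

def search(query, names):
    """Substring search ranked as: exact match, then prefix, then the rest."""
    q = query.strip().lower()
    hits = [n for n in names if q in n.lower()]

    def rank(name):
        lowered = name.lower()
        if lowered == q:
            return (0, lowered)
        if lowered.startswith(q):
            return (1, lowered)
        return (2, lowered)

    return sorted(hits, key=rank)

def suggest(query, names, cutoff=0.8):
    """Offer up to three close spellings when the query looks like a typo."""
    by_lower = {n.lower(): n for n in names}
    close = difflib.get_close_matches(
        query.strip().lower(), list(by_lower), n=3, cutoff=cutoff
    )
    return [by_lower[c] for c in close]

results = search("Morus", names)
print(results)                               # exact match is listed first
print(len(search("Tyrannosaurus rex", names)))  # honest count when nothing matches
print(suggest("Morrus", names))              # candidates for a misspelt query
```

The ranking is a simple three-tier sort rather than a relevance score, and `difflib` compares the query against every name, so it would not scale to millions of names without an index. It does show that the behaviour I'm complaining about is a design choice, not a hard problem.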
Authenticated content
This is probably the thing that, if left as it is, will strangle EOL. The insistence on "authenticated (endorsed)" content places a severe brake on what EOL can offer.
It's a web site
EOL's web site has no mechanism for people to extract data (e.g., RSS feeds, microformats, links to RDF, etc.). It's intended to be read by humans, not machines. This greatly diminishes its utility.
So, I've got that off my chest. The first release was always going to be a disappointment, especially given the hype. What frustrates me, however, is just how far the first release is from what it could have been.
The real question is whether the issues I've raised are things that are easy to fix given time, or whether they reflect underlying problems with the way the project is conceived.