Tuesday, February 26, 2008

Encyclopedia of Life - first impressions

Some thoughts on the first release of the Encyclopedia of Life. I am being deliberately critical. This is a high-profile project with tens of millions of dollars in funding and lots of people involved, and it is accompanied by some of the most overblown hype in organismal biology. In a sense I think EOL has set itself up by overpromising and underdelivering.

Before continuing, I should point out that I am involved in EOL in an advisory capacity, but not in actually making anything. Some of the tools I've blogged about have made their way into EOL, such as Pygmybrowse and reference parsing (see David Shorthouse's excellent work on this).

Lack of content
I think the first release of EOL should have provided, at a minimum, as much information as I can get from iSpecies and Wikipedia. Other projects, such as Freebase, have pre-populated their databases with content from Wikipedia and other sources. Why didn't EOL? If the argument is that they want authenticated content, then this doesn't wash. Their authenticated content is minimal, and waiting for authentication will, in my view, cripple EOL.

Exemplars are incomplete
The first release contains 25 exemplars. Pages for these taxa
"...show the kind of rich environment, with extensive information, to which all the species pages will eventually grow. The information on the exemplar pages has been authenticated (endorsed) by the scientists whose names are listed on these pages."
Well, I hope this isn't the standard EOL aspires to. The pages are incomplete and not interlinked. One of the 25 chosen exemplars is Anolis carolinensis. EOL lists its distribution as:
"Widely-distributed throughout the southeastern United States: North Carolina to Key West, Florida, and west to southeast Oklahoma and central Texas."

However, the GBIF map EOL displays shows lots of dots in Hawaii:


The EOL account is silent on this interesting distribution pattern. It will come as no surprise that the Wikipedia account of the same species tells us that it has been introduced into Hawaii. Wikipedia 1, EOL 0.

Links

If two pages talk about species that are ecologically associated, then surely those pages should be linked? Among the exemplars is Pissodes strobi, the white pine weevil. In the EOL account, among the hosts listed is Pinus strobus, another exemplar taxon. The accounts of these two taxa are not linked. No hyperlink, nothing. The reader has no idea that there is an exemplar account for Pinus strobus. Furthermore, when reading the account for Pinus strobus there is no indication that it is host to the white pine weevil.
Surely the point of having all this information in one place is so that it can be linked together?
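
To make the point concrete, here is a minimal sketch of the kind of cross-linking I have in mind. It assumes a simple index of page names to URLs (the index, the URLs, and the function names are all made up for illustration; I don't know how EOL stores its pages). One function turns mentions of known taxa into hyperlinks, the other works out which pages mention a given taxon, so that the Pinus strobus page could list the weevil.

```python
import re

# Hypothetical index of exemplar pages: scientific name -> page URL.
# The URLs are placeholders, not real EOL addresses.
EXEMPLAR_PAGES = {
    "Pinus strobus": "/pages/pinus-strobus",
    "Pissodes strobi": "/pages/pissodes-strobi",
}


def link_taxon_names(text, current_taxon, pages=EXEMPLAR_PAGES):
    """Wrap each known taxon name found in `text` in a hyperlink.

    The page's own taxon is skipped so that a page does not link to itself.
    """
    names = [name for name in pages if name != current_taxon]
    if not names:
        return text
    # Try longer names first so a long name is not shadowed by a shorter one.
    names.sort(key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")

    def make_link(match):
        name = match.group(1)
        return '<a href="%s">%s</a>' % (pages[name], name)

    return pattern.sub(make_link, text)


def reverse_associations(accounts, pages=EXEMPLAR_PAGES):
    """Map each taxon to the other taxa whose accounts mention it."""
    mentioned_by = {name: [] for name in pages}
    for taxon, text in accounts.items():
        for name in pages:
            if name != taxon and name in text:
                mentioned_by[name].append(taxon)
    return mentioned_by
```

Plain string matching like this is crude (it knows nothing about synonyms or abbreviated names such as "P. strobus"), but even this much would have connected the two exemplar pages in both directions.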

BHL
EOL also exposes some limitations of the Biodiversity Heritage Library. Consider the exemplar page for Pinus strobus L. The "L." indicates that this species was described by Linnaeus. Among the many references listed by BHL, none are by Linnaeus. What gives?

Well, the IPNI record reveals that this species was described on p. 1001 of Species Plantarum. BHL has digitised Species Plantarum, and page 1001 has Pinus strobus:



Now, BHL relies on uBio's tools to extract names, and Linnaeus didn't make this easy (the specific epithet strobus is in the right hand margin, separate from Pinus), but one would have thought that for the exemplar taxa an effort would have been made to link Linnaean names to BHL content -- what better place to showcase the link between a name and its publication? It's quite easy to do, given that IPNI has page numbers for plant names. Just map page numbers to BHL URLs, and you're done.
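
As a rough sketch of that mapping: assume we have a citation string in a "title volume: page" style and a lookup table from (title, printed page number) to a BHL page identifier. The citation format, the lookup table, the page identifier, and the URL pattern below are all placeholders I've invented for illustration, not real IPNI or BHL data or services.

```python
import re

# Hypothetical lookup: (abbreviated title, printed page) -> BHL page identifier.
# The identifier is a made-up placeholder, not a real BHL page id.
BHL_PAGE_IDS = {
    ("Sp. Pl.", 1001): 123456,
}

# Placeholder URL pattern; the real BHL URL scheme may differ.
PAGE_URL_TEMPLATE = "https://example.org/bhl/page/{page_id}"

# Matches strings such as "Sp. Pl. 2: 1001" or "Sp. Pl. 1001".
CITATION_RE = re.compile(
    r"^(?P<title>.+?)\s+(?:(?P<volume>\d+)\s*:\s*)?(?P<page>\d+)\.?$"
)


def parse_citation(citation):
    """Split a citation into (title, page number), or None if it can't be parsed."""
    match = CITATION_RE.match(citation.strip())
    if match is None:
        return None
    return match.group("title"), int(match.group("page"))


def bhl_url_for(citation, page_ids=BHL_PAGE_IDS):
    """Return a page URL for the citation, or None if no page is known."""
    parsed = parse_citation(citation)
    if parsed is None:
        return None
    page_id = page_ids.get(parsed)
    if page_id is None:
        return None
    return PAGE_URL_TEMPLATE.format(page_id=page_id)
```

The hard part in practice is building the lookup table, since printed page numbers have to be matched to scanned page images, but for 25 exemplar taxa that could even be done by hand.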

Inconsistency
Going down the taxonomic hierarchy weird things happen. When viewing the plant genus Morus I can see a picture of Morus nigra (presumably this is "authenticated" content). If I drill down to the species Morus nigra, I'm told there is no authenticated content for this species. Either the image is Morus nigra or it isn't. If it is, why not show it? If it isn't, why claim that it is?



Logos

Way too much space is devoted to logos of various contributors, BHL being the worst offender (it doesn't help that the BHL content is incomplete, lacking links for Linnaean names). I don't care about logos. Contributors may care about getting their logos displayed, but users couldn't care less. They get in the way. On some pages, there's more screen space devoted to logos than information (e.g., the page for Apomys datae). This is, frankly, ridiculous, and reflects a warped set of priorities.

What's worse, all these logos are associated with links that take people away from EOL. Hence EOL becomes little more than a collection of web links to other sites.

Search
The search is based on the Catalogue of Life, and inherits the same problems. For example, if I search for "Morus" I get a list in alphabetical order of taxonomic names that contain the string "morus". The two names that are an exact match occur as items three and four on the list -- they should be first and second.

It gets worse if I search on "Tyrannosaurus rex". EOL doesn't do dinosaurs, and so doesn't contain anything on T. rex, but the search results tell me that "The following 116 search results contain 'Tyrannosaurus rex'". Nope, none of them do.

The search engine is poorly done: it fails to rank results sensibly, incorrectly reports what it does find, and has no support for spelling mistakes.
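
None of this is hard to do better. Here is a minimal sketch, assuming the names are simply held in a list (the function name and the cutoff value are my own choices, and this has nothing to do with how EOL or the Catalogue of Life actually implement search). Exact matches come first, then names starting with the query, then other names containing it. If nothing matches, it offers near-miss spellings using Python's standard difflib module, rather than claiming hits that don't exist.

```python
import difflib


def search_names(query, names, max_suggestions=3):
    """Return (hits, suggestions) for a case-insensitive name search.

    hits are ranked: exact matches, then prefix matches, then other
    substring matches, each group in alphabetical order. suggestions is
    only filled in when there are no hits at all.
    """
    q = query.strip().lower()
    exact, prefix, substring = [], [], []
    for name in sorted(names):
        lowered = name.lower()
        if lowered == q:
            exact.append(name)
        elif lowered.startswith(q):
            prefix.append(name)
        elif q in lowered:
            substring.append(name)
    hits = exact + prefix + substring

    suggestions = []
    if not hits:
        # Fall back to approximate matching to catch spelling mistakes.
        lookup = {name.lower(): name for name in names}
        close = difflib.get_close_matches(
            q, list(lookup), n=max_suggestions, cutoff=0.8
        )
        suggestions = [lookup[c] for c in close]
    return hits, suggestions
```

The number of results reported to the user should then simply be len(hits), so a search for a name that isn't in the database reports zero results.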

Authenticated content
This is probably the thing that, if left as it is, will strangle EOL. The insistence on "authenticated (endorsed)" content places a severe brake on what EOL can offer.

It's a web site
EOL's web site has no mechanism for people to extract data (e.g., RSS feeds, microformats, links to RDF, etc.). It's intended to be read by humans, not machines. This greatly diminishes its utility.
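
Even something as simple as an RSS feed of recently changed pages would be a start. As a sketch of how little is needed, the following builds a minimal RSS 2.0 document with Python's standard library. The feed title, URLs, and item fields are placeholders I've made up; this is not an existing EOL service.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, items):
    """Build a minimal RSS 2.0 feed and return it as a string.

    `items` is a list of dicts, each with "title", "link" and
    "description" keys.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    for tag, value in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = value
    for item in items:
        node = ET.SubElement(channel, "item")
        for tag in ("title", "link", "description"):
            ET.SubElement(node, tag).text = item[tag]
    return ET.tostring(rss, encoding="unicode")


# Example usage with placeholder data.
feed = build_rss(
    "Taxon page updates",
    "http://example.org/",
    "Recently changed taxon pages",
    [
        {
            "title": "Pinus strobus",
            "link": "http://example.org/pages/pinus-strobus",
            "description": "Account updated",
        }
    ],
)
```

A feed like this lets other sites and tools track changes without scraping HTML, which is the sort of reuse that makes a site useful to machines as well as people.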

So, I've got that off my chest. The first release was always going to be a disappointment, especially given the hype. What frustrates me, however, is just how far the first release is from what it could have been.

The real question is whether the issues I've raised are things that are easy to fix given time, or whether they reflect underlying problems with the way the project is conceived.

32 comments:

David Shorthouse said...

These are but a small portion of the issues now on the table and we were/are fully aware of all of them. Some we could have fixed, some we spotted and did fix, but other functions were needlessly crippled because we had to perform major fixes. I won't bore with the details. However, there are some shining lights, which have largely gone undiscussed or unnoticed.

First, all front-end materials are built off RESTful web services. Granted, these web services may only be of value for reproducing an EOL page (and there will be a review to assess that), but this nonetheless affords us an opportunity to open the doors more widely than they are now. Hats off to Patrick Leary.

Second, we forged an agreement with CrossRef so we will be assigning DOIs to pages. The exact mechanism needs more thought (attribution - how with multiple authors & a ton of AJAX?).

Indeed the release was rushed. Because it was released as is, warts and all, we are open to plenty of criticism. That's the only way we can now proceed. If we flew through this release without so much as a whimper, it would surely have been a disappointment. Not having sufficient time to test loads on the servers was by far the biggest upset for us.

We are also tracking all comments in our forum, all responses sent via email on the servers, and comments in our blog. These are all being triaged and will be part of the critical post-launch review.

By the way, there's no reason to nod my work on the ref parser. I merely made a pretty front-end to your deeper work on this ;)~ Did ya catch my mildly informative and apologetic screencasts...that is, when the site is up?

Roderic Page said...

The fact that EOL buckled under the (enormous) load suggests that something like Akamai might be needed.

Regarding DOIs and attribution, I'm not convinced that having "authors" makes sense. Many pages will be automatically generated from remote sources. In this sense, the "author" is EOL. By all means have a means to list who did what, but I wonder whether the model of having an author makes sense. It would be a shame if the combination of authorship and "authentication/endorsement" got in the way of things.

Anonymous said...

the plant genus Morus

Yay, another intercode homonym. Hooray.

Ian said...

I just wanted to let you know that I linked to this post from Berry Go Round, a new plant-focussed blog carnival.

Chris Freeland said...

Rod - Any idea where BHL can grab names from Syst. Nat.? We can get Sp. Pl. names from Tropicos & IPNI, but at a loss for where to get names from Syst. Nat. without manual entry.

Roderic Page said...

Chris, as far as I know there isn't an equivalent list. For generic and subgeneric names I guess you could harvest uBio's Nomenclator Zoologicus data, which includes page-level citations (e.g., the record for Pediculus humanus).

For species names, unless somebody has made a universal list, I guess you're stuck with what taxon-specific nomenclators can provide.

Donat Agosti said...

Chris

Ask Rich Pyle at ZooBank, who has the entire 10th edition of Syst. Nat. in a database.

Then you might get in touch with the animalbase.de team, who are also trying to extract names from legacy publications, working their way up from 1758, and who just got a new grant to continue their scanning operations.

The third person you might want to ask about this is Dave Remsen at GBIF who, to the best of my knowledge, has been talking to the animalbase.de organizers.
