Monday, February 04, 2008

Incomplete citation and ranking

Came across the paper "Using incomplete citation data for MEDLINE results ranking" (pmid:16779053, full text available in PMC). The authors applied PageRank (the algorithm Google uses to rank search results) to papers in MEDLINE and found that PageRank is robust to information loss. In other words, even if a citation database is incomplete, PageRank will still do a good job of ranking results. This is encouraging, as I'm keen to use this approach to rank both papers and other objects (e.g., sequences and specimens), and will almost certainly never have a complete citation list.
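
To get a feel for the idea, here is a minimal sketch of PageRank on a toy citation graph, run once on the full graph and once after randomly discarding some citations. This is not the paper's code or data: the paper identifiers, the function names `pagerank` and `drop_citations`, the damping factor of 0.85, and the 30% loss rate are all assumptions chosen for illustration.

```python
import random


def pagerank(citations, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict {paper: [papers it cites]}."""
    nodes = set(citations)
    for cited in citations.values():
        nodes.update(cited)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by papers that cite nothing is spread evenly over all papers.
        dangling = sum(rank[node] for node in nodes if not citations.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        # Each paper passes an equal share of its rank to every paper it cites.
        for citing, cited_list in citations.items():
            if cited_list:
                share = damping * rank[citing] / len(cited_list)
                for cited in cited_list:
                    new_rank[cited] += share
        rank = new_rank
    return rank


def drop_citations(citations, fraction, seed=0):
    """Simulate an incomplete database by discarding each citation with the given probability."""
    rng = random.Random(seed)
    return {
        paper: [c for c in cited if rng.random() >= fraction]
        for paper, cited in citations.items()
    }


# Hypothetical toy data: five papers, with C and D cited most often.
toy = {
    "A": ["C"],
    "B": ["C"],
    "C": ["D"],
    "D": [],
    "E": ["C", "D"],
}

full = pagerank(toy)
partial = pagerank(drop_citations(toy, fraction=0.3))

# Print each paper with its score on the full and the incomplete graph.
for paper in sorted(full, key=full.get, reverse=True):
    print(paper, round(full[paper], 3), round(partial[paper], 3))
```

Comparing the two columns of scores (or the order they imply) gives a rough sense of how much the ranking shifts when citations go missing. A graph this small proves nothing about MEDLINE, but the same comparison could be run on a real citation list.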

