Wednesday, May 31, 2023

Ten years and a million links

As trailed on a Twitter thread last week, I’ve been working on a manuscript describing efforts to map taxonomic names to their original descriptions in the taxonomic literature.

The preprint is on bioRxiv (doi:10.1101/2023.05.29.542697).

A major gap in the biodiversity knowledge graph is a connection between taxonomic names and the taxonomic literature. While both names and publications often have persistent identifiers (PIDs), such as Life Science Identifiers (LSIDs) or Digital Object Identifiers (DOIs), LSIDs for names are rarely linked to DOIs for publications. This article describes efforts to make those connections across three large taxonomic databases: Index Fungorum, International Plant Names Index (IPNI), and the Index of Organism Names (ION). Over a million names have been matched to DOIs or other persistent identifiers for taxonomic publications. This represents approximately 36% of names for which publication data is available. The mappings between LSIDs and publication PIDs are made available through ChecklistBank. Applications of this mapping are discussed, including a web app to locate the citation of a taxonomic name, and a knowledge graph that uses data on researchers’ ORCID iDs to connect taxonomic names and publications to authors of those names.
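
To make the idea of an LSID-to-PID mapping concrete, here is a minimal sketch in Python. It is not the code behind the preprint. The identifiers are made up for illustration, and the function names (build_index, lookup, coverage) are hypothetical. It only shows the shape of the data: each name LSID is paired with a publication PID when one has been found, and coverage is the fraction of names that have one.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical rows: (name LSID, publication PID or None).
# These identifiers are invented for illustration and are not real records.
ROWS: List[Tuple[str, Optional[str]]] = [
    ("urn:lsid:example.org:names:1", "10.1234/example.1"),
    ("urn:lsid:example.org:names:2", None),  # no publication matched yet
    ("urn:lsid:example.org:names:3", "10.1234/example.2"),
]


def build_index(rows: List[Tuple[str, Optional[str]]]) -> Dict[str, str]:
    """Map each name LSID to its publication PID, skipping unmatched names."""
    return {lsid: pid for lsid, pid in rows if pid is not None}


def lookup(index: Dict[str, str], lsid: str) -> Optional[str]:
    """Return the publication PID for a name LSID, or None if unmatched."""
    return index.get(lsid)


def coverage(rows: List[Tuple[str, Optional[str]]]) -> float:
    """Fraction of names that have a publication PID."""
    if not rows:
        return 0.0
    matched = sum(1 for _, pid in rows if pid is not None)
    return matched / len(rows)


index = build_index(ROWS)
print(lookup(index, "urn:lsid:example.org:names:1"))
print(f"coverage: {coverage(ROWS):.0%}")
```

A plain dictionary is enough here because each name has at most one original description. A real mapping would also need to handle PIDs that are not DOIs.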

Much of the work has been linking taxa to names, an area that still has huge gaps. There are also interesting differences in coverage between plants, animals, and fungi (see the preprint for details).

There is also a simple app to demonstrate these links; see https://species-cite.herokuapp.com.
