Tuesday, October 15, 2013

What can the Global Biodiversity Information Facility (GBIF) do for you?

I've recently been appointed Chair of the Science Committee of the Global Biodiversity Information Facility (GBIF) http://www.gbif.org [1]. The committee is a small group of people with a range of backgrounds, and one of our roles is to advise GBIF on scientific matters (e.g., what kinds of data should GBIF collect, and what kinds of scientific questions should GBIF help answer?).

There have been formal surveys (see the papers in the journal "Biodiversity Informatics" https://journals.ku.edu/index.php/jbi/issue/view/370/showToc ), meetings, and a "vision" statement (the "Global Biodiversity Informatics Outlook", http://www.biodiversityinformatics.org/ ). But there's always the chance that these fora may miss some points of view, so I'm keen to get feedback on what GBIF could do to better help people tackle the scientific questions they are interested in.

For example, is there some fundamental limitation of GBIF that prevents it from being useful to you? Is there some feature, data type, geographic coverage, etc. that, if addressed, would make it more useful? Is there a role that GBIF should take on but hasn't? A useful analogy might be the central role GenBank plays in genomics: as a place to archive your data (sequences), a repository of other people's data that you can access, and a research tool (e.g., BLAST searches to locate similar sequences). Is that the sort of thing you'd want from GBIF, or is it something entirely different?

I'd welcome any comments, suggestions, views, etc. Feel free to add them as comments to this blog, or email me (rdmpage at gmail.com).

I should stress that this is simply me trying to calibrate my perception of GBIF's role with what others think. Also note that if you have specific comments on things such as the GBIF web site, please use the feedback tab on the site (that way they will reach the people who can do something about them).

[1] For those unfamiliar with GBIF, its mission "is to make the world's biodiversity data freely and openly available via the Internet". At present the bulk of the data are occurrence records of organisms (mostly multicellular eukaryotes, i.e., animals, plants and fungi) based on either museum collections or observations of living organisms. You can get an idea of the kind of science that uses GBIF-hosted data from this list of papers on Mendeley http://www.mendeley.com/groups/1068301/gbif-public-library/


Based on the responses so far, I'll compile a list of suggestions and themes below.


  • Have the ability to annotate records (e.g., flag errors) and some mechanism where those annotations get incorporated into GBIF and/or primary data providers.

Dashboard/gap analysis

  • For any search provide information on how complete and/or representative the data is likely to be (for example, are vertebrates over-represented, what is the extent of sampling in this area, etc.).

Geographic coverage

  • Fill big gaps in coverage (e.g., Russia, China, much of the tropics).


  • Link GBIF occurrence records to sequences in GenBank


  • Who identified specimen?
  • Details on georeferencing (esp. if not GPS)

Data types

  • DNA sequences
  • abundance

Data sources

  • GenBank
  • Literature records (e.g., data mining published papers)
    MEIER, R., & DIKOW, T. (2004). Significance of Specimen Databases from Taxonomic Revisions for Estimating and Mapping the Global Species Diversity of Invertebrates and Repatriating Reliable Specimen Data. Conservation Biology, 18(2), 478–488. doi:10.1111/j.1523-1739.2004.00233.x
  • "Gray" literature, e.g. field books, reports


  • Lack of stable identifiers for occurrences
  • Contributors of specimen data not (yet) in an institution have to mint their own identifiers, with no way of linking those to any future identifier minted by the institution that will eventually house their collection


  • Being able to refine taxon search by geographic region
  • Search on any Darwin Core field
  • Wild card search
  • Support for GIS data formats
  • Search using arbitrary bounding polygons (e.g., draw a shape on a map)
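
To make the last suggestion concrete, here is a minimal sketch of what filtering occurrence records by a user-drawn polygon could involve. It is not how GBIF does (or would do) this: the record layout and the `filter_occurrences` helper are hypothetical, only the field names `decimalLongitude` and `decimalLatitude` come from Darwin Core, and the test treats coordinates as points on a flat plane, so it ignores polygons that cross the antimeridian or enclose a pole.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in decimal degrees


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: cast a ray east from the point and count edge crossings.

    An odd number of crossings means the point lies inside the polygon.
    Planar approximation only (an assumption made for this sketch).
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # Longitude at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def filter_occurrences(records: List[Dict], polygon: List[Point]) -> List[Dict]:
    """Keep only the (hypothetical) records whose coordinates fall inside the polygon."""
    return [
        r for r in records
        if point_in_polygon((r["decimalLongitude"], r["decimalLatitude"]), polygon)
    ]
```

Given a list of record dictionaries and the vertices of a shape drawn on a map, `filter_occurrences(records, polygon)` returns the subset of records inside the shape. A production service would more likely push this test into a spatial index or database rather than scanning records one by one.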


Tuesday, October 08, 2013

Which taxonomic journals should be digitised next?

One reason I was able to build BioNames is that a significant fraction of the taxonomic literature for animals is now online, thanks to the efforts of the Biodiversity Heritage Library, digital archives, commercial publishers, and individual institutions and scientific societies. However, there are still big gaps in literature availability. To get a sense of these gaps I've constructed a table listing all the journals in BioNames that have an ISSN, ordered by the number of articles in BioNames (i.e., mostly articles that publish new names). The full table is here; I've reproduced part of it below (limited to those journals with at least 500 articles in BioNames). If you click on the ISSN in the table you can go to the corresponding page in BioNames to get full details of what BioNames currently knows about that journal.

The journals in red are the ones with the worst online presence (see complete key below). Note that BioNames is still a work in progress so there will be some journals that are online but I've simply not had a chance to add them to BioNames. With that in mind, there are some striking gaps in the digital availability of taxonomic publications. Several Russian journals (collectively publishing thousands of articles) are not online (the story here is somewhat complicated because some Russian journals also have English-language translations available but these are mostly recent articles). A number of large entomological journals are not available (perhaps not surprising given that most described animal taxa are insects).

We can think of this as a "league table" of literature availability. My hope is that digitising projects such as the Biodiversity Heritage Library will look at this and use it to help prioritise which journals to scan. In particular, if a journal is not pre-1923 (and so not already out of US copyright), I hope BHL will contact the journal's publisher and see if they would be willing to add their journal to those (such as Proceedings of the Biological Society of Washington) that have opened up their complete back catalogue to being scanned by BHL.

I also hope that scientific societies or organisations that publish journals in the "red" or "orange" zones will consider digitising their journals and making their contents accessible to the wider community. We are reaching the point where, if knowledge is not online, it effectively doesn't exist.

> 90%: Almost all are available
< 90%: Most are available
< 50%: Limited availability
< 10%: Mostly inaccessible
ISSN (click for details) | Journal | Articles | Digitised | % digitised
0374-5481 | The Annals and magazine of natural history | 4463 | 3502 | 78
1000-0739 | 880-01 Dong wu fen lei xue bao. Acta zootaxonomica Sinica | 3403 | 2450 | 72
0006-324X | Proceedings of the Biological Society of Washington | 3384 | 3263 | 96
0022-3360 | Journal of paleontology | 3373 | 3121 | 93
0037-928X | Bulletin de la Société entomologique de France | 3012 | 244 | 8
0013-8797 | Proceedings of the Entomological Society of Washington | 2972 | 2805 | 94
0044-5134 | Zoologicheskiĭ zhurnal | 2812 | 16 | 1
0044-5231 | Zoologischer Anzeiger | 2761 | 594 | 22
0022-3395 | The Journal of parasitology | 2353 | 2222 | 94
0008-347X | The Canadian entomologist | 2260 | 2059 | 91
0003-0082 | American Museum novitates | 1942 | 1814 | 93
0035-418X | Revue suisse de zoologie | 1851 | 1581 | 85
0022-2933 | Journal of natural history | 1848 | 1823 | 99
0367-1445 | Entomologicheskoe obozrenie | 1803 | 3 | 0
0096-3801 | Proceedings of the United States National Museum | 1722 | 1365 | 79
0013-872X | Entomological news | 1691 | 1619 | 96
0370-2774 | Proceedings of the Zoological Society of London | 1580 | 1008 | 64
1000-7482 | 880-01 Kun chong fen lei xue bao = Entomotaxonomia | 1518 | 1127 | 74
0037-9271 | Annales de la Société entomologique de France | 1497 | 757 | 51
0031-031X | 880-01 Paleontologicheskiĭ zhurnal | 1472 | 31 | 2
0013-8746 | Annals of the Entomological Society of America | 1441 | 1383 | 96
0035-1814 | Revue de zoologie et de botanique africaines | 1400 | 47 | 3
0031-0603 | The Pan-Pacific entomologist | 1389 | 56 | 4
0323-6145 | Berliner entomologische Zeitschrift / herausgegeben von dem Entomologischen Vereine in Berlin | 1342 | 710 | 53
1148-8425 | Bulletin du Muséum National d'Histoire Naturelle réunion mensuelle des naturalistes du Muséum | 1303 | 506 | 39
0013-8908 | The Entomologist's monthly magazine | 1268 | 6 | 0
0001-6616 | 880-03 Gu sheng wu xue bao = Acta palaeontologica Sinica | 1127 | 0 | 0
0165-5752 | Systematic parasitology | 1082 | 1028 | 95
0454-6296 | 880-01 Kun chong xue bao = Acta entomologica Sinica / Zhongguo kun chong xue hui bian ji | 1054 | 902 | 86
0024-0672 | Zoologische mededeelingen / uitgegeven vanwege 's Rijksmuseum van Natuurlijke Historie te Leiden | 1039 | 997 | 96
0370-047X | Proceedings of the Linnean Society of New South Wales | 1038 | 742 | 71
0030-5316 | Oriental insects | 1035 | 916 | 89
0028-7199 | Journal of the New York Entomological Society | 1013 | 860 | 85
0521-4726 | Annales historico-naturales Musei Nationalis Hungarici = Természettudományi Múzeum évkönyve | 1007 | 886 | 88
0070-7279 | Reichenbachia / Staatliches Museum für Tierkunde in Dresden | 951 | 2 | 0
0022-8567 | Journal of the Kansas Entomological Society | 945 | 906 | 96
0373-3491 | Bollettino della Società entomologica italiana | 940 | 14 | 1
0037-2102 | Senckenbergiana biologica | 939 | 11 | 1
0002-8320 | Transactions of the American Entomological Society | 923 | 796 | 86
0374-9797 | Nouvelle revue d'entomologie | 923 | 1 | 0
0034-7108 | Revista Brasileira de biologia | 916 | 6 | 1
0007-1595 | Bulletin of the British Ornithologists' Club | 911 | 459 | 50
0013-8843 | Entomologische Zeitschrift | 881 | 4 | 0
0253-116X | Linzer biologische Beiträge / Oberösterreiches Landesmuseum | 876 | 503 | 57
0272-4634 | Journal of vertebrate paleontology | 869 | 864 | 99
1217-8837 | Acta zoologica Academiae Scientiarum Hungaricae | 868 | 134 | 15
0085-5626 | Revista brasileira de entomologia | 863 | 260 | 30
0365-4389 | Annali del Museo civico di storia naturale "Giacomo Doria." | 855 | 503 | 59
0097-3157 | Proceedings of the Academy of Natural Sciences of Philadelphia | 848 | 500 | 59
0010-065X | The Coleopterists' bulletin | 831 | 804 | 97
0024-4082 | Zoological journal of the Linnean Society | 823 | 821 | 100
0008-4301 | Canadian journal of zoology | 817 | 803 | 98
0028-1344 | The Nautilus | 814 | 501 | 62
0040-7496 | Tijdschrift voor entomologie | 804 | 580 | 72
0375-0434 | Proceedings of the Royal Entomological Society of London. Series B, Taxonomy | 796 | 783 | 98
0164-7954 | International journal of acarology | 787 | 786 | 100
0003-0090 | Bulletin of the American Museum of Natural History | 776 | 488 | 63
0037-962X | Bulletin de la Société zoologique de France | 765 | 228 | 30
0181-0863 | Revue française d'entomologie | 765 | 6 | 1
1562-0891 | Wiener Entomologische Zeitung | 752 | 573 | 76
1000-3118 | 880-01 Gu ji zhui dong wu xue bao | 743 | 4 | 1
0003-0023 | Transactions of the American Microscopical Society | 731 | 728 | 100
0075-6547 | Koleopterologische Rundschau / herausgegeben von der Zoologisch-Botanischen Gesellschaft gemeinsam mit der Forstlichen Bundesversuchsanstalt | 706 | 339 | 48
0286-9810 | 880-01 The entomological review of Japan = Konchūgaku hyōron | 704 | 98 | 14
0042-3580 | Venus : Japanese journal of malacology = Kairuigaku zasshi | 687 | 531 | 77
0067-1975 | Records of the Australian Museum | 679 | 629 | 93
0006-6982 | The Journal of the Bombay Natural History Society | 677 | 81 | 12
0320-9180 | Zoosystematica rossica | 676 | 6 | 1
0084-5604 | Vestnik zoologii / Akademii︠a︡ nauk Ukrainskoĭ SSR, Institut zoologii | 672 | 37 | 6
0043-0439 | Journal of the Washington Academy of Sciences | 664 | 603 | 91
0003-4541 | Annales zoologici / Polska Akademia Nauk, Instytut Zoologiczny | 661 | 336 | 51
0004-2110 | Arkiv för zoologi / utgivet af K. Svenska vetenskaps-akademien | 658 | 59 | 9
0035-8894 | Transactions of the Royal Entomological Society of London | 655 | 495 | 76
0915-5805 | Japanese journal of entomology | 645 | 620 | 96
0013-8878 | The Entomologist | 645 | 14 | 2
0007-4853 | Bulletin of entomological research | 633 | 611 | 97
0375-099X | Records of the Indian Museum a journal of Indian zoology ed. by the Director, Zoological Survey of India | 630 | 213 | 34
1326-6756 | Australian journal of entomology | 629 | 629 | 100
0013-8770 | 880-02 Konchū = Kontyū | 625 | 616 | 99
0217-2445 | The Raffles bulletin of zoology | 622 | 571 | 92
0372-1426 | Transactions of the Royal Society of South Australia, Incorporated | 622 | 450 | 72
0079-8835 | Memoirs of the Queensland Museum | 620 | 373 | 60
0003-4150 | Annales de parasitologie humaine et comparée | 612 | 355 | 58
0018-0130 | Proceedings of the Helminthological Society of Washington | 604 | 588 | 97
0015-4040 | The Florida entomologist | 602 | 601 | 100
0077-7749 | Neues Jahrbuch für Geologie und Paläontologie. Abhandlungen | 602 | 146 | 24
1066-5234 | The journal of eukaryotic microbiology | 601 | 572 | 95
0031-0220 | Paläontologische Zeitschrift | 601 | 58 | 10
0567-7920 | Acta palaeontologica Polonica | 599 | 578 | 96
0032-3780 | Polskie pismo entomologiczne. Bulletin entomologique de Pologne | 590 | 28 | 5
0027-4100 | Bulletin of the Museum of Comparative Zoology at Harvard College | 581 | 444 | 76
0042-3211 | The Veliger | 578 | 274 | 47
0181-0626 | Bulletin du Muséum national d'histoire naturelle. Section A, Zoologie, biologie et écologie animales | 574 | 564 | 98
0068-547X | Proceedings of the California Academy of Sciences | 573 | 261 | 46
0035-6387 | Rivista di parassitologia | 566 | 2 | 0
0003-5092 | Annotationes zoologicae Japonenses / auspiciis Societatis Zoologicae Tokyonensis seriatim editae = Nihon dōbutsugaku ihō | 562 | 545 | 97
0036-7575 | Mitteilungen der Schweizerischen entomologischen Gesellschaft = Bulletin de la Société entomologique suisse | 562 | 3 | 1
0251-074X | Revue de zoologie africaine | 560 | 18 | 3
0373-9465 | Folia entomologica Hungarica = Rovartani közlemények | 555 | 6 | 1
0206-0477 | 880-01 Trudy Zoologicheskogo instituta = Travaux de l'Institut zoologique de l'Académie des sciences de l'URSS / Akademii︠a︡ nauk Soi︠u︡za Sovetskikh Sot︠s︡ialisticheskikh Respublik | 554 | 2 | 0
1445-5226 | Invertebrate systematics | 550 | 550 | 100
0307-6970 | Systematic entomology | 537 | 526 | 98
0020-1804 | Insecta matsumurana | 536 | 514 | 96
0278-0372 | Journal of crustacean biology : a quarterly of the Crustacean Society for the publication of research on any aspect of the biology of crustacea | 531 | 531 | 100
0165-0424 | Aquatic insects | 525 | 525 | 100
1051-8932 | Bulletin of the Brooklyn Entomological Society | 523 | 3 | 1
0013-8711 | Entomologica scandinavica | 522 | 513 | 98
0013-8789 | Journal of the Entomological Society of Southern Africa | 515 | 392 | 76
0323-7087 | Zoologische Jahrbücher. Abteilung für Systematik, Geographie und Biologie der Tiere | 513 | 176 | 34
0007-4977 | Bulletin of marine science | 510 | 397 | 78
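
For anyone wanting to reproduce the last column and the key, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and two details are assumptions rather than anything stated above: that the percentage is rounded to the nearest whole number before it is compared with the key, and that a journal sitting exactly on 90% falls in the "Most are available" band (the key only says "> 90%" and "< 90%").

```python
def percent_digitised(articles: int, digitised: int) -> int:
    """Percentage of a journal's articles in BioNames that are digitised, as a whole number."""
    if articles <= 0:
        raise ValueError("articles must be a positive count")
    return round(100 * digitised / articles)


def availability_band(percent: int) -> str:
    """Map a percentage onto the four bands of the key.

    Boundary handling is an assumption: the key uses strict inequalities,
    so 10 and 50 go to the higher band and exactly 90 is treated as "Most".
    """
    if percent > 90:
        return "Almost all are available"
    if percent >= 50:
        return "Most are available"
    if percent >= 10:
        return "Limited availability"
    return "Mostly inaccessible"


# Example using the first row of the table (4463 articles, 3502 digitised)
first_row = percent_digitised(4463, 3502)
print(first_row, availability_band(first_row))
```

Sorting the journals by article count and then looking at the lowest bands gives the same prioritisation the table is meant to suggest.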