iPhylo: Entity Explosion: bringing Wikidata to every website

Roderic D. M. Page

Tuesday, August 25, 2020

A week ago Toby Hudson (@tobyhudson) released a very cool Chrome (and now Firefox) extension called Entity Explosion. If you install the extension, you get a little button you can press to find out what Wikidata knows about the entity on the web page you are looking at. The extension works on web sites that have URLs that match identifiers in Wikidata. For example, here it is showing some details for an article in BioStor (https://biostor.org/reference/261148). The extension "knows" that this article is about the Polish arachnologist Wojciech Staręga.

But this is a tame example. See what fun Dario Taraborelli (@ReaderMeter) is having with Toby's extension:

"What every language and every other site calls this entity".

This browser extension (available for Chrome and Firefox) is one of the best user-facing @Wikidata applications I've seen so far: ID / authority file mappings at the tip of your fingers. Absolutely brilliant. https://t.co/1zXPDM48aw
— Dario Taraborelli (@ReaderMeter) August 24, 2020

There are some limitations. For instance, it requires that the web site URL matches the identifier, or more precisely the formatter URL for that identifier. In the case of BioStor the formatter URL is https://biostor.org/reference/$1 where $1 is the BioStor identifier stored by Wikidata (e.g., 261148). So, if you visit https://biostor.org/reference/261148 the extension works as advertised.
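To make the matching concrete, here is a minimal sketch of how a page URL could be tested against a formatter URL. This is not Entity Explosion's actual code; the function names `formatter_to_regex` and `extract_identifier` are made up for illustration, and the only assumption is that the formatter URL contains a single `$1` placeholder.

```python
import re


def formatter_to_regex(formatter_url: str) -> re.Pattern:
    """Turn a formatter URL such as 'https://biostor.org/reference/$1'
    into a regex that captures whatever stands in for $1."""
    prefix, placeholder, suffix = formatter_url.partition("$1")
    if not placeholder:
        raise ValueError("formatter URL must contain the $1 placeholder")
    # Escape the literal parts so characters like '.' and '?' match themselves.
    return re.compile("^" + re.escape(prefix) + "(.+?)" + re.escape(suffix) + "$")


def extract_identifier(page_url: str, formatter_url: str):
    """Return the identifier embedded in page_url, or None if the URL
    does not have the shape of the formatter URL."""
    match = formatter_to_regex(formatter_url).match(page_url)
    return match.group(1) if match else None


BIOSTOR_FORMATTER = "https://biostor.org/reference/$1"

# The BioStor page matches its formatter URL, so the identifier is recovered.
print(extract_identifier("https://biostor.org/reference/261148", BIOSTOR_FORMATTER))
# 261148

# A publisher landing page does not look like a doi.org URL, so nothing matches.
print(extract_identifier(
    "https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0133602",
    "https://doi.org/$1",
))
# None
```

The second call shows why a purely URL-based approach struggles once an identifier redirects to a different site.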

However, identifiers that are redirects to other web sites, such as DOIs, aren't so lucky. A Wikidata item with a DOI (such as 10.1371/JOURNAL.PONE.0133602) corresponds to the URL https://doi.org/10.1371/JOURNAL.PONE.0133602, but if you click on that URL you eventually get taken to https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0133602, which isn't the original DOI URL (incidentally, this is exactly how DOIs are supposed to work).

So, it would be nice if Entity Explosion would also read the HTML for the web page and attempt to extract the DOI from that page (for notes on this see https://www.wikidata.org/wiki/Wikidata_talk:Entity_Explosion#Retrieve_identifiers_from_HTML_<meta>_tag), which would mean it works on even more web sites for academic articles.
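As a rough sketch of what reading the DOI from the page might look like, the snippet below scans a page's `<meta>` tags using only the Python standard library. Several things here are assumptions rather than anything the extension does: the tag names `citation_doi` and `dc.identifier` are common conventions on publisher pages but are not guaranteed to be present, the sample HTML is invented for the example (it is not the real PLOS markup), and the DOI is upper-cased only because that is how it appears in the Wikidata example above.

```python
from html.parser import HTMLParser

# Assumed meta tag names; real pages may use others, or none at all.
DOI_META_NAMES = {"citation_doi", "dc.identifier"}


def normalise_doi(raw: str):
    """Strip common prefixes and upper-case the DOI so it can be compared
    with the form used in the Wikidata example."""
    value = raw.strip()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if value.lower().startswith(prefix):
            value = value[len(prefix):]
            break
    return value.upper() if value.startswith("10.") else None


class DoiMetaParser(HTMLParser):
    """Remembers the first DOI found in a <meta name=... content=...> tag."""

    def __init__(self):
        super().__init__()
        self.doi = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.doi is not None:
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in DOI_META_NAMES:
            self.doi = normalise_doi(attributes.get("content") or "")


def extract_doi(html: str):
    parser = DoiMetaParser()
    parser.feed(html)
    return parser.doi


# Invented sample page, for illustration only.
SAMPLE_HTML = """
<html><head>
<title>Example article page</title>
<meta name="citation_doi" content="10.1371/journal.pone.0133602" />
</head><body></body></html>
"""

print(extract_doi(SAMPLE_HTML))
# 10.1371/JOURNAL.PONE.0133602
```

With the DOI in hand, the lookup against Wikidata no longer depends on the URL in the address bar.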

Meantime, if you use Chrome or Firefox as your browser, grab a copy and discover just how much information Wikidata has to offer.