Friday, August 27, 2021

JSON-LD in the wild: examples of how structured data is represented on the web

I've created a GitHub repository to keep track of examples of JSON-LD that I've seen in active use, for example embedded in web sites or accessed through an API. The repository is https://github.com/rdmpage/wild-json-ld. The list is by no means exhaustive; I hope to add more examples as I come across them.

One reason for doing this is to learn what others are doing. For example, after looking at SciGraph's JSON-LD I now see how an ordered list can be modelled in RDF so that the list of authors in a JSON-LD document for, say, a scientific paper is in the correct order. By default, the values of a property in RDF are unordered, so if you run a SPARQL query to get the authors of a paper, the order in which the authors are returned will be arbitrary. There are various ways to tackle this. In my Ozymandias knowledge graph I used "roles" to represent order (see Figure 2 in the Ozymandias paper). I then used properties of the role to order the list of authors.

Another approach is to use RDF lists (rdf:List; see "RDF lists and SPARQL" and "Is it possible to get the position of an element in an RDF Collection in SPARQL?" for an introduction to lists). SciGraph uses this approach. The value for schema:author is not an author but a blank node (bnode), and this bnode has two predicates, rdf:first and rdf:rest. rdf:first points to an author, and rdf:rest points to another bnode, which has the same pair of predicates. This pattern repeats until we encounter a value of rdf:nil for rdf:rest.
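To make the rdf:first / rdf:rest pattern concrete, here is a minimal Python sketch that walks such a chain and recovers the authors in order. The triples, the "ex:" identifiers, and the helper names are all invented for illustration and are not taken from SciGraph. The triples are deliberately stored in an unordered set, so the order comes only from following the chain.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST = RDF + "first"
RDF_REST = RDF + "rest"
RDF_NIL = RDF + "nil"
SCHEMA_AUTHOR = "http://schema.org/author"

# Hypothetical triples (subject, predicate, object). A set has no order,
# which mirrors the fact that an RDF graph is an unordered set of triples.
triples = {
    ("ex:paper", SCHEMA_AUTHOR, "_:b0"),
    ("_:b0", RDF_FIRST, "ex:author-a"),
    ("_:b0", RDF_REST, "_:b1"),
    ("_:b1", RDF_FIRST, "ex:author-b"),
    ("_:b1", RDF_REST, "_:b2"),
    ("_:b2", RDF_FIRST, "ex:author-c"),
    ("_:b2", RDF_REST, RDF_NIL),
}


def objects(subject, predicate):
    """Return all objects of triples matching the subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def walk_list(head):
    """Follow rdf:rest from the head bnode, collecting each rdf:first value."""
    items = []
    node = head
    while node != RDF_NIL:
        items.append(objects(node, RDF_FIRST)[0])
        node = objects(node, RDF_REST)[0]
    return items


# The value of schema:author is the head bnode of the list, not an author.
head = objects("ex:paper", SCHEMA_AUTHOR)[0]
authors = walk_list(head)
print(authors)  # ['ex:author-a', 'ex:author-b', 'ex:author-c']
```

The sketch assumes a well-formed list in which every bnode has exactly one rdf:first and one rdf:rest; real data may need checks for missing or repeated predicates.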

This introduces some complexity, but the benefit is that the JSON-LD version of the RDF will have the authors in the correct order, and hence any client that is using JSON will be able to treat the array of authors as ordered. Without some means of ordering, the client could not make this assumption, and the first author in the list might not actually be the first author of the paper.
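As a rough illustration of what this means for a client, the Python sketch below reads a small JSON-LD document in which the context declares author as a list container ("@container": "@list"), which is the JSON-LD way of expressing an RDF list. The document, the paper title, and the author names are made up for this example and are not SciGraph's actual output.

```python
import json

# Hypothetical JSON-LD document. The "@container": "@list" declaration tells
# a JSON-LD processor that the author array is an ordered RDF list.
source = """
{
  "@context": {
    "@vocab": "http://schema.org/",
    "author": {"@id": "http://schema.org/author", "@container": "@list"}
  },
  "@type": "ScholarlyArticle",
  "name": "An example paper",
  "author": [
    {"@type": "Person", "name": "First Author"},
    {"@type": "Person", "name": "Second Author"},
    {"@type": "Person", "name": "Third Author"}
  ]
}
"""

doc = json.loads(source)

# A client that treats the document as plain JSON can rely on array order,
# because the list declaration guarantees that the order is meaningful.
names = [person["name"] for person in doc["author"]]
first_author = names[0]
print(first_author)  # First Author
```

Without the list declaration the same array would describe an unordered set of authors, and a processor would be free to return them in any order after a round trip through RDF.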