Wednesday, August 26, 2020

Personal knowledge graphs: Obsidian, Roam, Wikidata, and Xanadu

I stumbled across this tweet yesterday (no doubt when I should have been doing other things), and disappeared down a rabbit hole. Emerging, I think the trip was worth it.


Markdown wikis

Among the tools listed by @zackfan01 were Obsidian and Roam, neither of which I heard of before. Both Obsidian and Roam are pitched as "note-taking" apps, they are essentially personal wikis where you write text in Markdown and use [[some text goes here]] to create links to other pages (very like a Wiki). Both highlight backlinks, that is, clearly displaying "what links here" on each page, making it easy to navigate around the graph you are creating by linking pages. Users of Obsidian share these graphs in Discord, rather like something from Martin MacInnes' novel "Gathering Evidence". Personal wikis have been around for a long time, but these apps are elegantly designed and seem fun to use. Looking at these apps I'm reminded of my earlier post Notes on collections, knowledge graphs, and Semantic Web browsers where I moaned about the lack of personal knowledge graphs that supported inference from linked data. I'm also reminded of the Blue Planet II, the BBC, and the Semantic Web: a tale of lessons forgotten and opportunities lost where I constructed an interactive tool to navigate BBC data on species and their ecology (you can see this live at, and the fun to be had from simply being able to navigate around a rich set of links. I imagine these Markdown-based wikis could be a great way further explore these ideas.


Personal and global knowledge graphs

Then I began thinking about what if the [[page links]] in these personal knowledge graphs were not just some text but, say, a Wikidata identifier (of the form "Qxxxxx")? Imagine that if you were writing notes on say, a species, you could insert the Wikidata Qid and you would get a pre-populated template that comes with some facts from Wikidata, and you could then use that as a starting point (see for example Toby Hudson Entity Explosion I discussed earlier). Knowing that more and more scholarly papers are being added to Wikidata, this means you could also add bibliographic citations as Qids, fetching all the necessary bibliographic information on the fly from Wikidata. So your personal knowledge graph intersects with the global graph.



Now, I've not used Roam, but anyone who has is likely to balk at my characterisation of it as "just" a Markdown wiki, because there's more going on here. The Roam white paper talks about making inferences from the text using reasoning or belief networks, although these features don't seem to have had much uptake. But what really struck me as I explored Roam was the notion of not just linking to pages using the [[ ]] syntax, but also linking to parts of pages (blocks) using (( )). In the demo of Roam there are various essays, such as Paul Graham's The Refragmentation, and each paragraph is an addressable block that can be cited independently of the entire essay. Likewise, you can see what pages cite that block.

 Now in a sense these are just like fragment identifiers that we can use to link to parts of a web page, but there's something more here because these fragments are not just locations in a bigger document, they are the components of the document.


This strikes me as rather like Ted Nelson's vision of Xanadu, where you could cite any text at any level of granularity, and that text would be incorporated into the document you were creating via transclusion (i.e., you don't include a copy of the text, you include the actual text). In the context of Roam, this means you have the entire text you want to cite included in the system, so you can then show chunks of it and build up a network of ideas around each chunk. This also means that the text being worked on becomes part of the system, rather than remaining isolated, say, as a PDF or other representation. This also got me thinking about the Plazi project, where taxonomic papers are being broken into component chunks (e.g., figures, taxonomic descriptions, etc.) and these are then being stored in various places and reassembled - rather like Frankenstein's monster - in new ways, for example in GBIF (e.g., or Species-ID (see doi:10.3897/zookeys.90.1369 ). One thing I've always found a little jarring about this approach is that you lose the context of the work that each component was taken from. Yes, you can find a link to the original work and go there, but what if you could seamlessly click on the paragraph or figure see them as part of the original article? Imagine we had all the taxonomic literature available in this way, so that we can cite any chunk, remix it (which is a key part of floras and other taxonomic monographs), but still retain the original context?


To come back full circle, in some ways tools like Obsidian and Roam are old hat, we've had wikis for a while, the idea of loading texts into wikis is old (e.g., Wikisource), backlinks are nothing new, etc. But there's something about seeing clean, elegant interpretations of these ideas, free of syntax junk, and accompanied by clear visions of how software can help us think. I'm not sure I will use either app, but they have given me a lot of food for thought.