Saturday, May 06, 2006

Updating ants

A triple store for ants is all very well, but it contains just the information available when the triple store was created. What about updating it? What about doing this automatically? Here are some ideas:

Connotea provides semantically rich RSS feeds. We could subscribe to the feed for a tag (such as Formicidae) and extract recent posts, either by using an HTTP conditional GET to fetch the feed only when it has changed, or by parsing the feed and using XPath to extract references more recent than a given date. Connotea makes extensive use of RDF in its RSS feeds, so it's easy to dump this into the triple store.
uBio's taxonomically intelligent RSS feed reader could be used to monitor publications on ants (e.g., Formicidae). uBio uses RSS 2.0, which doesn't include RDF (see the Wikipedia entry for RSS). One option would be to parse the RSS and see what we can extract from the links (e.g., whether they contain DOIs, are Ingenta feeds, etc.). If there are DOIs we could use CrossRef's OpenURL lookup. Alternatively, we could use the Connotea Web API: upload the URLs, get Connotea to see what it can do with them, and then make use of its RSS feed. This also makes the information available to everybody for tagging.
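As a rough sketch of the two parsing ideas above (picking out feed items newer than a given date, and spotting DOIs in item links), something like the following Python might do. It assumes the feed is RSS 1.0 (RDF) with a dc:date element on each item; I haven't checked this against the actual Connotea feed layout, and the function names and the DOI pattern are my own illustrative choices, not part of any Connotea or uBio API.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from urllib.parse import unquote

# Namespaces for RSS 1.0 (RDF) feeds with Dublin Core dates (assumed layout).
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Loose, illustrative DOI pattern: "10.", a registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>?#]+')


def extract_doi(url):
    """Return the first DOI-like string found in a URL, or None."""
    match = DOI_PATTERN.search(unquote(url))
    return match.group(0) if match else None


def items_newer_than(feed_xml, cutoff):
    """Return feed items whose dc:date is later than cutoff (an aware datetime)."""
    root = ET.fromstring(feed_xml)
    recent = []
    # In RSS 1.0 the items are direct children of the rdf:RDF root element.
    for item in root.findall("rss:item", NS):
        date_text = item.findtext("dc:date", default="", namespaces=NS)
        if not date_text:
            continue  # skip items without a date
        stamp = datetime.fromisoformat(date_text.strip().replace("Z", "+00:00"))
        if stamp.tzinfo is None:
            # Assumption: treat dates without a timezone as UTC.
            stamp = stamp.replace(tzinfo=timezone.utc)
        if stamp > cutoff:
            link = item.findtext("rss:link", default="", namespaces=NS)
            recent.append({
                "title": item.findtext("rss:title", default="", namespaces=NS),
                "link": link,
                "date": stamp,
                "doi": extract_doi(link),
            })
    return recent
```

Calling items_newer_than(feed_text, cutoff) with the date of the last update as the cutoff would give the new references, and any item with a non-empty "doi" could then be passed to a DOI lookup such as CrossRef's.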

We could also track new sequences in GenBank (to do).

1 comment:

Donat Agosti said...

Updating the ant information is a straightforward idea, and we should actually make this happen.
With Norm, we are planning to (re)activate an alert system for new taxa we enter into the Hymenoptera Name Server. We also plan to make all the relevant literature accessible through an option to retrieve it in EndNote format. This also includes the idea of adding all our references to Connotea.
So the question for Norm would be whether he could automate the feed into Connotea.