A decade ago (OMG, that can't be right, an actual decade ago) I created "iSpecies", a simple little tool to mash up data from GBIF, NCBI, Yahoo, Wikipedia, and Google Scholar into a search engine for species. It was written in PHP, relied on some degree of *cough* web scraping to get its data, and was a bit of a toy (although that didn't stop me complaining that it could do more than EOL at the time). Eventually I got sick of dealing with Google Scholar constantly changing its HTML and blocking IP addresses to stop people harvesting data (I once managed to get my entire campus blocked), and with services disappearing, such as Yahoo's image search, so I eventually pulled the plug on it.
Fun as this was, there's a bigger problem with iSpecies, and that's that it is a "mashup". I'm simply grabbing data from different sources and redisplaying it. What I really want is what has been described as a "meshup" (awful term, don't use it), that is, I want to combine the data so that it is more than the sum of its parts. For example, some of the data could be cross-linked (especially if we add a few more sources and drill down a bit). Some of the papers discovered by CrossRef may include original descriptions, or may be the source of some of the points plotted on the GBIF map. Some may include the phylogenies used to build the Open Tree of Life tree. In order to build a data meshup instead of a web mashup we need to operate at the level of data rather than just human-readable web pages. That is the next thing I'd like to work on, and in many ways it shouldn't be a big leap. The new iSpecies was fairly easy to create because we now have a bunch of web services that all speak JSON. It's a small step from JSON to JSON-LD (especially if the JSON-LD is constructed with reuse in mind). So while it's nice to see iSpecies back, there's a much more interesting next step to think about.
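To give a sense of how small that step is, here's a minimal sketch (the occurrence record is made up for illustration): a plain JSON record becomes JSON-LD simply by adding an `@context` that maps each key onto a shared vocabulary, here Darwin Core term URIs, so that two services using the same keys become cross-linkable data rather than just similar-looking text.

```python
import json

# A plain JSON record, as a typical biodiversity web service might return it.
# (Hypothetical example data, not output from a real service.)
record = {
    "scientificName": "Apus apus",
    "decimalLatitude": 55.95,
    "decimalLongitude": -3.19,
}

# Turning it into JSON-LD: add an @context mapping each key to a term in a
# shared vocabulary (Darwin Core). The data itself is unchanged, but the
# keys now have globally unique, machine-resolvable meanings.
record_ld = {
    "@context": {
        "scientificName": "http://rs.tdwg.org/dwc/terms/scientificName",
        "decimalLatitude": "http://rs.tdwg.org/dwc/terms/decimalLatitude",
        "decimalLongitude": "http://rs.tdwg.org/dwc/terms/decimalLongitude",
    },
    **record,
}

print(json.dumps(record_ld, indent=2))
```

The point is that "constructed with reuse in mind" mostly means agreeing on the context: once different sources annotate their JSON with the same vocabulary, combining them is a data operation, not a screen-scraping one.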