Science Publishing II: RSS & XML


For the technically-minded, here’s an article (via Lockergnome) on The Role of RSS in Science Publishing: Syndication and Annotation on the Web, by Hammond, Hannay, and Lund of the Nature Publishing Group:

“RSS is one of a new breed of technologies that is contributing to the ever-expanding dominance of the Web as the pre-eminent, global information medium. It is intimately connected with—though not bound to—social environments such as blogs and wikis, annotation tools such as [1], Flickr [2] and Furl [3], and more recent hybrid utilities such as JotSpot [4], which are reshaping and redefining our view of the Web that has been built up and sustained over the last 10 years and more [n1]. Indeed, Tim Berners-Lee’s original conception of the Web [5] was much more of a shared collaboratory than the flat, read-only kaleidoscope…”

RSS, which stands for, among other things, “Really Simple Syndication”, is a file format based on XML, for quickly promulgating various sorts of summary information through the web; it’s used mostly for news headlines and blogs, although it’s already being used to list the latest preprints at the astrophysics archive server. Perhaps more importantly, it’s also being used to foster discussion of these articles at sites like CosmoCoffee (RSS here) and Physics Comments (RSS here).
You can see RSS examples for my blog from the links at the top of this page. (Due to a depressing controversy over the different RSS formats, I can’t provide just a single definitive link to a “definition” of RSS, but check the appropriate footnotes in the article above, or Wikipedia.)
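
To make “a file format based on XML” a little more concrete, here is a minimal sketch in Python of what reading a feed involves. The feed text is made up for illustration (the titles and example.org URLs are not real), and it assumes the RSS 2.0 layout of a channel containing items; other RSS flavours arrange things differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written RSS 2.0 feed; titles and URLs are invented.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example preprint feed</title>
    <link>http://example.org/</link>
    <description>Latest preprints (illustrative only)</description>
    <item>
      <title>First example paper</title>
      <link>http://example.org/abs/0001</link>
      <description>Summary of the first paper.</description>
    </item>
    <item>
      <title>Second example paper</title>
      <link>http://example.org/abs/0002</link>
      <description>Summary of the second paper.</description>
    </item>
  </channel>
</rss>"""


def read_headlines(feed_text):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    root = ET.fromstring(feed_text)
    headlines = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        headlines.append((title, link))
    return headlines


if __name__ == "__main__":
    for title, link in read_headlines(SAMPLE_FEED):
        print(f"{title}: {link}")
```

A real reader would also fetch the feed over the network and cope with the competing RSS versions; this sketch only shows the summary-per-item structure.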

(To read RSS, you need a standalone program, although current versions of the Firefox browser and Thunderbird mail reader have rudimentary — very rudimentary — capabilities; the next generation of Apple’s Safari browser is meant to be RSS-aware, too. Right now, I use the great NetNewsWire on my Mac; I haven’t found a really good RSS reader for my Linux machine at work — does anyone know of one?)

In astrophysics, we are very slowly moving toward various XML formats for data interchange, since XML can easily hold not only the so-called “raw data” (such as an astronomical image) but also the accompanying (and even more so-called) “metadata”: information about the raw data (such as where in the sky the image lies, when it was taken, on what telescope, etc.). In particular, VOTable will be the underlying format for Virtual Observatories such as AstroGrid, which I briefly discussed here. ESA’s Planck Surveyor satellite, in which I’m also involved, will also likely use some sort of XML underneath.
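
As a rough illustration of data and metadata travelling together in one XML document, here is a small Python sketch. The element names (`observation`, `metadata`, `data`) and all the values are invented for this example; this is not the VOTable schema, just the general idea.

```python
import xml.etree.ElementTree as ET


def build_observation(pixels, ra_deg, dec_deg, date_obs, telescope):
    """Bundle a tiny image and its metadata into one XML element.

    The tag names used here are hypothetical, not part of any standard.
    """
    obs = ET.Element("observation")

    # Metadata: where the image lies, when it was taken, on what telescope.
    meta = ET.SubElement(obs, "metadata")
    ET.SubElement(meta, "ra", unit="deg").text = str(ra_deg)
    ET.SubElement(meta, "dec", unit="deg").text = str(dec_deg)
    ET.SubElement(meta, "date_obs").text = date_obs
    ET.SubElement(meta, "telescope").text = telescope

    # "Raw data": pixel values flattened row by row into whitespace-separated text.
    data = ET.SubElement(
        obs, "data", rows=str(len(pixels)), cols=str(len(pixels[0]))
    )
    data.text = " ".join(str(value) for row in pixels for value in row)
    return obs


if __name__ == "__main__":
    obs = build_observation(
        [[1, 2], [3, 4]], 83.63, 22.01, "2005-01-15T21:30:00", "Example Telescope"
    )
    print(ET.tostring(obs, encoding="unicode"))
```

Anyone receiving the file can read the `metadata` element without touching the pixel values, which is much of the appeal for data interchange. Real images are far too large to store as text like this, so a practical format would more likely point to or embed binary data.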

The article discusses how RSS is currently being used in science publishing (although it emphasizes publishing via already-extant paper journals rather than services like the archive), in particular at Nature, where the authors are employed, what sort of protocols for metadata may be needed, and other scientific uses of RSS such as data exchange and even podcasting. (I think the article gets many aspects of the RSS version history incorrect; I also feel it’s worth explicitly mentioning the name of alpha-blogger Dave Winer, who more or less invented RSS and much of the blogging infrastructure we know today.)

On its own, all this is just boring techie jargon. However, when combined with the Science Commons ideas from the last post, we begin to get a full model for disseminating scientific information: data and publications freely exchanged on the web, with open standards so authors and publishers don’t have to continually re-invent the appropriate wheels, with appropriate metadata so other scientists know what they’re getting, and know how to properly reference it.



One response to “Science Publishing II: RSS & XML”

  1. Alejandro Rivero

    Another possibility of RSS is synchronisation of sites. I think that the next step for physcomments (still a newborn, founded 1 Dec 2004) is to act as a hub for other “Cafe” sites. The “cafe managers” could actively fine-tune discussions and develop their sites independently and in a decentralised way, while the hubs could provide a condensed view for people who do not like to navigate. RSS seems an obvious synchronisation tool (although at small scale, email commands are adequate too).
    Another utility of RSS is that it is XML. Thus it fits well in XML->TeX converters, and “book versions” of a site can be more easily dumped via the RSS feeds. I will try to elaborate on it this year.
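
One way the XML->TeX idea could look in practice is sketched below in Python. It assumes an RSS 2.0 feed and turns each item into a LaTeX section; the sample feed, the function names, and the choice of one `\section` per item are all illustrative assumptions, not a description of any existing converter.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed; the titles and descriptions are invented.
CAFE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example cafe</title>
    <item>
      <title>Cafe talk: 100% of R&amp;D</title>
      <description>A first discussion thread.</description>
    </item>
    <item>
      <title>Second thread</title>
      <description>Notes on a_b and other symbols.</description>
    </item>
  </channel>
</rss>"""

# Characters that have special meaning in LaTeX and their escaped forms.
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Escape LaTeX special characters one at a time to avoid double escaping."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


def rss_to_latex(feed_text):
    """Return a LaTeX document with one section per RSS item."""
    root = ET.fromstring(feed_text)
    lines = [r"\documentclass{article}", r"\begin{document}"]
    for item in root.iter("item"):
        title = escape_latex(item.findtext("title", default=""))
        body = escape_latex(item.findtext("description", default=""))
        lines.append(r"\section{" + title + "}")
        lines.append(body)
    lines.append(r"\end{document}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(rss_to_latex(CAFE_FEED))
```

Feed descriptions often contain HTML markup, which this sketch would pass through as escaped text; a fuller “book version” would need to translate that markup as well.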