Formats for my Papers

The papers I’ve written which have been published in peer-reviewed journals or conference proceedings, or included in other types of publications, have been listed on my papers page on the UKOLN Web site since my first papers were published in 1999. More recently I have also made use of OPUS, the University of Bath’s institutional repository.

Wherever possible I have tried to provide access to the paper itself. But what formats should I provide? The papers are initially written using MS Word and a PDF version is submitted to the publishers. I normally try to provide access to both formats, and also create an HTML version of the paper. The MS Word version is the master source, and so is the richest format; the PDF version provides the ‘electronic paper’, which preserves the page fidelity; and the HTML version is the most open and reusable format. So all three formats have their uses.

But none of these formats are particularly ‘embeddable’. And even the HTML format is normally trapped within the host Web site. The HTML file also contains navigational elements in addition to the contents of the paper.

Shouldn’t the full contents of papers be provided in an RSS format, allowing the content to be easily embedded elsewhere?  And wouldn’t use of RSS enable the content to be reused in interesting ways?

Creating an RSS Format for a Paper

As an experiment I have created an RSS file for my paper on “Deployment Of Quality Assurance Procedures For Digital Library Programmes” which I wrote with Alan Dawson and Andrew Williamson for the IADIS 2003 conference.

As well as the MS Word and PDF formats of the paper I had also created an HTML version. The process for creating the RSS file was to copy and paste the contents of the HTML file (omitting the navigational elements of the page) into a WordPress blog. I then opened the WordPress RSS view of that page and copied the resulting RSS file to the UKOLN Web site.
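I produced the RSS file by hand via WordPress, but the same result could be scripted. The following is only a minimal sketch of that idea, not the process I actually used: it assumes the body of the paper is already available as an HTML string with the navigation removed, and it places that HTML in a content:encoded element, which is one common convention for carrying full content in RSS 2.0. The function name paper_to_rss and the channel description text are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS "content" module, used for full-text items.
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"
ET.register_namespace("content", CONTENT_NS)


def paper_to_rss(title, link, body_html):
    """Return an RSS 2.0 document (as a string) with one item holding the paper.

    Hypothetical helper: the arguments are supplied by the caller.
    """
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    # title, link and description are the required channel elements in RSS 2.0.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Full text of the paper"

    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # ElementTree escapes the markup, so the HTML travels as text in the element.
    ET.SubElement(item, f"{{{CONTENT_NS}}}encoded").text = body_html
    return ET.tostring(rss, encoding="unicode")
```

Calling paper_to_rss with the paper’s title, the address of the HTML version and the body markup returns a string which could be saved as a .xml file alongside the other formats.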

Using the RSS Format

[Image: Display of RSS view of paper in Netvibes]

My first test was to add the RSS version of the paper to Netvibes. As you can see the Netvibes RSS viewer successfully rendered the page.

It should be noted, however, that internal anchors (i.e. links to the references) did not link to the references within the RSS view, but back to the original paper.
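I have not investigated why this happens. One plausible explanation, and it is only an assumption, is that the reference links are fragment-only links (such as #ref-1) which the reader resolves against the address of the original paper rather than against its own view. The sketch below shows how such a link resolves under standard URL rules; the URL and the fragment name are made up for illustration.

```python
from urllib.parse import urljoin


def resolve_link(base_url, href):
    """Resolve a link found in the feed content against a base address."""
    return urljoin(base_url, href)


# Hypothetical address of the original HTML paper and a hypothetical anchor.
original = "http://example.org/papers/paper.html"
print(resolve_link(original, "#ref-1"))
# prints http://example.org/papers/paper.html#ref-1
```

If this is what is happening, the links are behaving correctly; they simply point at the copy of the references in the original HTML page.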

I also tried FeedBucket, another Web-based RSS reader. In this case, as can be seen, the tool only displayed the first 500 characters or so of the paper. This seems to be a feature of a number of RSS tools which only provide a summary of the initial content of an RSS feed, with a link being provided for the full content.
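I do not know how FeedBucket produces its summary, but the behaviour is easy to imitate. The following sketch, which is my own guess at the general approach rather than a description of any particular tool, strips the markup from the content and keeps roughly the first 500 characters. The function name summarise and the default limit are assumptions.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def summarise(body_html, limit=500):
    """Return the plain text of body_html, cut to at most `limit` characters."""
    parser = _TextExtractor()
    parser.feed(body_html)
    parser.close()
    # Collapse runs of whitespace so the character count is meaningful.
    text = " ".join("".join(parser.parts).split())
    if len(text) <= limit:
        return text
    return text[:limit].rstrip() + "..."
```

A reader working this way would show the truncated text followed by a link to the full item, which matches what I saw.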

[Image: Wordle View of Paper]

Since the content of the paper is available without the navigational elements and other possibly distracting content which may be provided on an HTML page, it is possible to analyse the contents of the paper. For this I used Wordle – if you wish you can view the Wordle cloud for the paper.
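Wordle does the counting and the drawing for you, but the underlying analysis is simple once the content is free of page furniture. As a rough sketch only (this is not how Wordle itself works, and the stop word list here is a small made-up sample), the most frequent words in the paper could be found like this:

```python
import re
from collections import Counter

# A small illustrative stop word list; a real one would be much longer.
STOP_WORDS = {"the", "and", "for", "that", "with", "this", "are", "was", "has"}


def word_frequencies(body_html, top=10):
    """Return the `top` most common words in an HTML fragment."""
    # Crude tag removal; good enough for a sketch, not for arbitrary HTML.
    text = re.sub(r"<[^>]+>", " ", body_html)
    words = re.findall(r"[a-z]+", text.lower())
    # Ignore stop words and very short words such as "of" or "to".
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(top)
```

The function returns a list of (word, count) pairs, which is the kind of data a word cloud is drawn from.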

Should We Be Doing This?

Should we be providing access to papers in a mature and widely used format which allows the content to be reused in other environments using a wide range of readily available technologies?  And which also allows the content to be processed and analysed using simple-to-use tools such as Yahoo Pipes?

I think we should. But perhaps publishers will think differently, as they are more likely to seek to maintain tight control over papers if the copyright has been assigned to them. But is this necessarily the case? My most recent paper, “Developing Countries; Developing Experiences: Approaches to Accessibility for the Real World”, will be presented at the W4A 2010 conference on 26-27 April 2010. We have recently completed the copyright form and I’ve noticed the following information on the author rights:

The right to post author-prepared versions of the Work covered by the ACM copyright in a personal collection on their own home page, on a publicly accessible server of their employer and in a repository legally mandated by the agency funding the research on which the Work is based. Such posting is limited to noncommercial access and personal use by others, and must include the following notice both embedded within the full text file and in the accompanying citation display as well:
“© ACM, (YEAR). This is the author’s version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution …

Hmm. So can I make the paper available in an RSS format as long as I include the ACM copyright statement?