In a post entitled Wanted For The ODI!, which I published yesterday, I described the Open Data Institute’s (ODI) Community Engagement Manager post.

Tom Heath, the Head of Research at the ODI, explained how he wanted potential applicants for the post to “demonstrate your ability to understand, reach and engage an audience” in order to support “collaborative projects [which] will bring together teams of researchers and companies from across Europe to explore the latest challenges in the field of open data and create technology platforms to help policy makers, developers and startup companies understand the open data landscape and build new applications/businesses”.

But what is open data, and why is there such interest in it? There is a need, I feel, to be able to answer these questions for those who are not currently engaged in work involving the use of open data.

A definition of the term ‘open data’ is available from Wikipedia: “Open data is the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control”. This is a useful definition as Wikipedia is a popular reference source for people looking to find definitions of new concepts – indeed there have been 32,739 views of this article in the last 90 days.

But although this definition states that “certain data should be freely available to everyone to use and republish as they wish”, it does not explain why data should be freely available. In the area of open source software, Richard Stallman has argued that software should be “free as in speech” rather than “free as in beer”. I don’t agree with this view; rather I feel that open source software can provide business benefits by enabling others to view, use and adapt software.

I take the same view for open data. In the case of data provided by, analysed by and commented on by researchers, there can be benefits in making the data open so that other researchers can validate the data and verify the analyses made of it.

But is this also the case for institutional data? And what barriers might institutions put in place which restrict the ability of others to “use and republish [data] as they wish, without restrictions from copyright, patents or other mechanisms of control”? A significant barrier will be concerns that the provision of open data will result in the loss of revenue streams for the institution. Often such issues are raised within the context of commercial organisations which may make money from data, such as publishers who license researchers’ data, usage data, etc. But it would be a mistake to regard such barriers as being imposed only by the commercial sector. Back in December 2010, in a post entitled “Impact, Openness and Libraries”, I described how:

SCONUL [the UK academic library organisation] has been collecting and publishing statistics from university libraries for over twelve years, with the aim of providing sound information on which policy decisions can be based.

I went on to point out that:

The SCONUL data is not publicly available. It seems that the SCONUL Annual Library Statistics is published yearly – and copies cost £80.

and added that:

Perhaps more importantly in today’s climes, the closed nature of the report and the underlying data (which is closed by its price, closed by being available only to member organisations and closed by being available in PDF format) is how perceptions of secrecy goes against expectations that public sector organisation should be open and transparent.

One approach to obtaining access to such closed data is to submit a Freedom of Information (FOI) request. Shortly after I published my blog post, and following discussions at the ILI conference, Tony Hirst submitted an FOI request:

Please could you supply me with a copy of the annual statistical report made to SCONUL from the University of Bath Library for the period 2008-9

which provided access to the SCONUL data for one institution although, being in PDF format, it was not well-suited for further analysis.

This example illustrates, I feel, some of the difficulties which will need to be addressed in enhancing the availability of open data in the public sector. There are technical challenges (the formats used, the metadata which describes the data sources and the workflow processes for providing access to the data); resourcing issues (who pays for the additional work needed?); skills issues (do organisations have the technical expertise and systems needed to provide open data?) and business model issues (will there be sufficient interest by others in consuming open data to justify the costs?). But there is also a need to consider some of the underlying political issues regarding the growth in interest in open content. In 2005 Bill Gates described free culture advocates as a “modern-day sort of communists”. But in today’s political and economic environment, might not the pressures on public sector bodies to provide open data about their activities be regarded as a neo-conservative plot aimed at the privatisation of the public sector, by providing opportunities for the commercial sector to exploit business intelligence? And are we seeing examples of this in the moves from open educational resources to MOOCs, in which learning analytics seem to be becoming a valuable digital commodity?

I’d welcome responses to these concerns!