The Importance of Data

This year has been a great year for sport, especially in London. But as well as the success of the London Olympics and the Paralympics, we have also seen a growth in interest in data, which has gone beyond ‘data scientists’ and is now of mainstream interest.

We saw early examples of general interest in data when the MPs’ expenses scandal surfaced in the Daily Telegraph back in 2009. However, the availability of the expenses data on the Guardian platform gave new life to this story and widened interest, particularly amongst developers with an interest in politics. We saw an example of this in Tony Hirst’s series of posts in which, as summarised in the post My Guardian OpenPlatform API’n’Data Hacks’n’Mashups Roundup, he provided a number of visualisations of expenses claims.

The MPs’ expenses story raised interest in data journalism, and it is interesting to note that the Data driven journalism entry in Wikipedia was created as recently as 4 October 2010. However, this year’s summer of sport seems to have generated interest in data from the general public, beyond those who read the broadsheets.

According to a post on the Twitter blog published on 10 August 2012, there were “more than 150 million Tweets about the Olympics over the past 16 days”. The popularity of Twitter during the Olympic Games provided much content which could be analysed (there were over 80,000 tweets per minute when Usain Bolt won the 100m). But beyond Twitter there was also interest in analysis of data associated with the athletes’ performance and their achievements, as recorded by the medals they won.

In the higher education sector there has been an awareness of the importance of analysis of data for some time. Back in December 2011, in a post on “My Predictions for 2012”, I highlighted the following as an area of increasing relevance for the sector:

Learning and Knowledge Analytics ….

The ubiquity of mobile devices coupled with greater use of social applications as part of a developing culture of open practices will lead to an awareness of the importance of learning and knowledge analytics. Just as in the sporting arena we have seen huge developments in using analytic tools to understand and maximise sporting performances, we will see similar approaches being taken to understand and maximise intellectual performance, in both teaching and learning and research areas.

We have seen a number of examples of development work in the area of learning analytics taking place this year. As can be seen from this list, staff at CETIS have been active in sharing their thoughts on developments in the area of learning analytics. Of particular interest were Sheila MacNeill’s post in which she asked Learning Analytics, where do you stand? (which generated a lively discussion); Making Sense of “Analytics” (which linked to a document on “Making Sense of Analytics: a framework for thinking about analytics”); Sheila’s 5 things from LAK12 (in which she highlighted five areas that resonated with her over the 4 days of the LAK12 conference); and her self-explanatory list of Some useful resources around learning analytics.

The Importance of a Data-driven Infrastructure

But beyond the uses which can be made of data, there will also be a need for institutions to address the issue of how they manage such data. The approaches needed in Preparing for Data-driven Infrastructure have been summarised in a JISC Observatory TechWatch report of the same name.

The background to the report is the need for institutions to manage their data more effectively and provide greater transparency for institutional business processes. The data involved ranges from institutional data, such as that being provided in Key Information Sets (KIS), and the detailed reporting required for the Research Excellence Framework (REF) through to Learning Analytics as described above.

The report highlights approaches which institutions can take in responding to these strategic drivers, including the need for greater transparency in business processes, in order to adopt a more data-centric approach. The report includes a description of data-centric architectures and an overview of tools and technologies including APIs, Linked Data and NoSQL, together with a review of architectural approaches which institutions will need to consider.

The report, which is available under a Creative Commons Attribution (CC-BY) licence, was commissioned by the JISC Observatory team and written by Max Hammond, a consultant who has worked widely across the higher education and research sectors.

We welcome feedback on the report, which can be provided on the JISC Observatory Web site.
