A Challenge To Linked Data Developers

Back in November, following the discussions about Linked Data at the CETIS 2009 Conference, I wondered whether it was Time To Experiment With DBpedia?

The following month I attended the Online Information 2009 conference. As I described in a post on the Highlights of Online Information 2009: Semantic Web and Social Web, it was clear to me that “#semanticweb was the highlight & relevant for early mainstream”. A blog post which provided the LIS Research Coalition “review” of Online 2009 was in agreement: “sessions on the semantic web gave the impression that those in library and information science related roles are now beginning to consider the exploitation of data to data links”.

However, a concern I raised with Ian Davis, CTO of Talis UK, following his keynote talk on “The Reality of Linked Data” was the danger of overhyping expectations. I feel this is very relevant in light of the perceived failure of the Semantic Web to live up to the potential promised by evangelists in the early years of the last decade. Has, for example, the “new form of Web content that is meaningful to computers will unleash a revolution of new possibilities” described in the Semantic Web article published in Scientific American in May 2001 (and also available from Ryerson.ca) arrived? I think not.

There is a danger, I fear, that the renewed enthusiasm felt by increasing numbers of developers will not be shared by managers and policy makers – leading to interesting pilots and prototypes which do not necessarily become deployed in a mainstream service environment.

A suggestion I made to a number of Linked Data experts at the Online Information 2009 conference was to demonstrate the value of Linked Data not by providing examples in niche subject areas (e.g. chemistry) but by taking an example which everyone can understand.

In my post Time To Experiment With DBpedia? I used the DBpedia Faceted Browser to search for information about UK Universities. In the example I searched for UK Universities which were founded in 1966. But this didn’t demonstrate how Linked Data can be used to join information from sources which have different underlying structures.

My challenge to Linked Data developers is to make use of the data stored in DBpedia (which is harvested from Wikipedia) to answer the query “Which town or city in the UK has the highest proportion of students?”. This would involve processing the set of UK Universities, finding all Universities from the same town or city, recording the total number of students and then, from the town/city entries in DBpedia, finding the total population in order to identify the town or city with the largest proportion of students.
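The processing described above can be sketched in a few lines of Python, setting aside the harder question of how the data is obtained from DBpedia in the first place. This is a minimal sketch rather than a solution: the function names, the record layout and every figure in it are made up for illustration, and it assumes each University has already been matched to a single town or city.

```python
from collections import defaultdict


def student_proportions(universities, populations):
    """Return {town: total students / population} for towns with a known population."""
    # Sum the student numbers of all Universities located in the same town.
    students_by_town = defaultdict(int)
    for uni in universities:
        students_by_town[uni["town"]] += uni["students"]

    proportions = {}
    for town, students in students_by_town.items():
        population = populations.get(town)
        if population:  # skip towns whose population is missing or zero
            proportions[town] = students / population
    return proportions


def highest_proportion(universities, populations):
    """Return (town, proportion) for the town with the largest share of students."""
    proportions = student_proportions(universities, populations)
    if not proportions:
        return None
    town = max(proportions, key=proportions.get)
    return town, proportions[town]


# Hypothetical placeholder records; these are NOT real DBpedia figures.
universities = [
    {"name": "University A", "town": "Town X", "students": 10000},
    {"name": "University B", "town": "Town X", "students": 5000},
    {"name": "University C", "town": "Town Y", "students": 20000},
]
populations = {"Town X": 100000, "Town Y": 400000}

result = highest_proportion(universities, populations)
```

With these placeholder records the two Universities in Town X are combined before dividing by the town's population. The real difficulty lies in populating the two inputs from data whose structure varies from one entry to the next.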

I’m not too concerned about some of the edge cases (e.g. the differences between the City of London and Greater London or the Universities with campuses in several locations). Rather I want to know:

  • Can Linked Data solve this problem (from a theoretical perspective)?
  • Is DBpedia able to solve this problem (from a theoretical perspective)?
  • How difficult is it to solve the problem (is it a trivial one-line SPARQL query or would it require several months of work)?

Any takers? And note the answer must be provided using DBpedia – asking your friends on Twitter is cheating!

1 Comment

  1. If you want authentic student population stats, HESA is probably the best place to go, e.g. http://www.hesa.ac.uk/index.php?option=com_datatables&Itemid=121&task=show_category&catdex=3#institution

    The data is in XLS spreadsheets, which means it’s difficult to interrogate directly. There’s also the issue of doing a trivial mapping from the institution name to the location, although of course this will be misleading for your percentage stats if the reported student numbers represent total enrollment across multiple campuses in different towns/cities.



