How Well-Read Are Technical Wikipedia Articles?

In a recent post on Having An Impact Through Wikipedia I suggested that it would be useful if JISC-funded synthesis reports, for example reports on emerging new standards, used Wikipedia as a means of enhancing access to such work. In the post I pointed out that “I can’t find usage statistics for the page [but] I suspect that the article [on Amplified Conference which I created] will have been read by more people than have read my various peer-reviewed papers, blog posts, etc.” In response to a request for examples of tools which provide usage statistics for Wikipedia articles, Martin Greaney suggested that “It’s quite basic, but the tool at http://stats.grok.se/ might give you enough of an idea of the traffic to certain articles in Wikipedia”.

As Lorcan Dempsey suggested in a tweet, “The Wikipedia article traffic stats site mentioned in your comments is quite interesting. wonder how reliable is”. I agree and thought I would explore what the statistics tell us about Wikipedia entries for a number of areas related to Web, metadata and related standards of interest to the JISC development community.

My survey was carried out on 6 July 2010. The following table provides a link to the relevant Wikipedia article, the date the article was created (with a link to the original page for the article), my comments on the article and the usage statistics for October 2009 and June 2010 (two dates chosen to observe any significant variations).

| Page | Created | Summary (subjective comments) | Stats: Oct 2009 | Stats: Jun 2010 |
|---|---|---|---|---|
| Linked Data | May 2007 | Multiple concerns have been identified with this article. | 5,423 | 8,102 |
| HTML | Jul 2001 | Appears to be a well-written and comprehensive article. Includes info box so factual information is available in DBPedia. | 147,357 | 143,386 |
| XML | Sep 2001 | Appears to be a well-written and comprehensive article. Includes info box so factual information is available in DBPedia. | 159,749 | 126,599 |
| XSLT | Feb/Jun 2002 | Appears to be a very thorough and comprehensive article. Includes info box so factual information is available in DBPedia. | 7,160 | 18,938 |
| RSS | Sep 2002 | Appears to be a well-written and comprehensive article. Includes a very brief info box so factual information is available in DBPedia. | 7,160 (gaps) | 18,938 |
| AJAX (programming) | Mar 2005 | Appears factually correct. | 98,629 | 90,300 |
| SAML (Security Assertion Markup Language) | Sep 2004 | Appears to be a very thorough and comprehensive article. | 4,421 | 2,942 |
| Z39.50 | Oct 2004 | Brief article which has been flagged as in need of improvements. | 3,960 | 2,592 |
| Search/Retrieve Web Service | Feb 2004 | Very little information provided. | 506 | 462 |
| Dublin Core | Oct 2001 | Appears factually correct though citations need improving. | 7,013 | 7,501 |
| METS (Metadata Encoding and Transmission Standard) | Sep 2006 | Appears factually correct though citations need improving. | 7,236 | 4,573 |
| MODS (Metadata Object Description Schema) | Aug 2006 | Appears to be a reasonable although succinct summary. | 546 | 583 |

Extrapolating from the usage statistics for the two dates it would seem that popular articles such as HTML and XML have an annual number of views of around 1,750,000 and 1,720,000 respectively, whilst an article on a less well-known standard such as METS has an annual number of views of around 70,000. It is perhaps surprising, in light of the high viewing figures for METS, that the annual viewing figure for MODS is only around 6,700. Perhaps this is due to the name clash between the METS acronym and the Mets name used to refer to the New York Mets. However there isn’t, as far as I am aware, such scope for confusion with names such as HTML, XML, SAML, etc.
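For anyone wishing to check the arithmetic, here is a minimal sketch of the extrapolation. It assumes the annual figure is simply the average of the two sampled months multiplied by twelve; that method is my assumption, although it does reproduce the rounded figures quoted in this post. The names MONTHLY_VIEWS and estimate_annual_views are illustrative only.

```python
# Monthly page views (Oct 2009, Jun 2010) copied from the table in this post.
MONTHLY_VIEWS = {
    "HTML": (147_357, 143_386),
    "XML": (159_749, 126_599),
    "Dublin Core": (7_013, 7_501),
    "METS": (7_236, 4_573),
    "MODS": (546, 583),
}


def estimate_annual_views(monthly_counts):
    """Average the sampled months and scale the result to twelve months.

    Assumption: the sampled months are representative of the whole year.
    """
    return round(sum(monthly_counts) / len(monthly_counts) * 12)


annual = {
    page: estimate_annual_views(counts)
    for page, counts in MONTHLY_VIEWS.items()
}

for page, views in annual.items():
    print(f"{page}: about {views:,} views per year")
```

With only two monthly samples per article this is a rough estimate at best: seasonal variation, or a one-off spike in either month, would skew the result.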

What conclusion might we draw from such statistics? I would suggest that if I had an interest in ensuring that users had a good understanding of what Dublin Core is about and had access to the key sources of information then contributing to the Dublin Core Wikipedia page would be a good way of achieving that goal – after all the estimated viewing figures of around 87,000 surely can’t be ignored.

Now Matt Jukes pointed out the potential difficulties of getting content into Wikipedia. But that is a question of ‘How do we go about contributing to Wikipedia?’ rather than ‘Should we?’

Can we accept that the answer to the second question should be ‘Yes’ so that we can explore ways of addressing the first question?

5 Comments

  1. stats.grok.se should be accurate down to single hits – it gets its data directly from the Squid front-end servers.

  2. Re: “I would suggest that if I had an interest in ensuring that users had a good understanding of what Dublin Core is about and had access to the key sources of information then contributing to the Dublin Core Wikipedia page would be a good way of achieving that goal – after all the estimated viewing figures of around 87,000 surely can’t be ignored.”

    I totally agree.

    Re: your comment about the Dublin Core Wikipedia entry, “Appears factually correct though citations need improving” – actually, that content is somewhat contentious in terms of what it emphasises as being important about DC.

    • Thanks for the response.

      I labelled the column containing my comments “Summary (subjective comments)”. I should probably have emphasised that I wasn’t making any statement concerning the accuracy of any of the articles. If there are disagreements about the accuracy of the content or concerns over the emphasis or omitted information then all the more reason why there is a need to engage in discussions on the ongoing maintenance of the information, I feel.

  3. wow..nice info Brian..
    i just know that survey..
    Thanks

  4. “I would suggest that if I had an interest in ensuring that users had a good understanding of what Dublin Core is about and had access to the key sources of information then contributing to the Dublin Core Wikipedia page would be a good way of achieving that goal – after all the estimated viewing figures of around 87,000 surely can’t be ignored.”

    Having just spent the last half hour trying to navigate around the Dublin Core official website to find the desired information, I also wouldn’t underestimate the impact that a site redesign would have on achieving similar goals. Or have we gotten to the point that we’d *prefer* people go to Wikipedia than institutional sites b/c Wikipedia is navigationally superior? It takes at least two clicks from the DC home page to find a list of the DC element set, whereas that list is just a slight scroll down on the Wikipedia page. So perhaps we should also be examining why Wikipedia is so easy to use and incorporate those principles into the official websites.


Trackbacks/Pingbacks

  1. (pluri)TAL / ILPGA [U. Paris 3] - [...] Report).The Semantic Web, Linked and Open Data: A Briefing Paper.Un post de Brian Kelly sur l’utilisation de Wikipedia comme …
  2. How Can We Assess the Impact and ROI of Contributions to Wikipedia? « UK Web Focus - [...] useful if JISC-funded project work used Wikipedia as a means of disseminating their knowledge and went on to provide …
  3. DBPedia and the Relationships Between Technical Articles « UK Web Focus - [...] recently wrote a blog post in which I asked How Well-Read Are Technical Wikipedia Articles? The statistics I quoted seemed …
  4. The Blog as a Narrative or the Post as a Self-Contained Item « UK Web Focus - [...] Assess the Impact and ROI of Contributions to Wikipedia? (published on 27 September 2010) and How Well-Read Are Technical Wikipedia Articles? (published in 8 July 2010). Since …
  5. Supporting Use of Wikimedia Across the UK Higher Education Sector « UK Web Focus - […] A benchmarking post which, in July 2010, provided some answers to the question How Well-Read Are Technical Wikipedia Articles? […]
  6. How Are You Using Wikipedia? « UK Web Focus - […] and encouraged use of Wikipedia (in blog posts such as Having An Impact Through Wikipedia, How Well-Read Are Technical Wikipedia Articles? and How Can We …
