Wikipedia as the Front Matter to all Research
A session at the recent Wikimania conference provided an opportunity for discussion on the topic “The fount of all knowledge – wikipedia as the front matter to all research”. The abstract describes how:
This discussion focuses on how Wikipedia could become the entry or discovery point to all significant research for the general public, and for scholars who are working just outside of the topic of interest. For most people, even researchers from closely related areas, summaries and explanations of a piece of research can be a crucial means both to discover and to begin to get into a new piece of research.
Currently overviews of research topics are supported through two mechanisms: reviews and “front matter” content. A review is a systematic summary of a field, written by an expert. These go out of date quickly, particularly in rapidly moving areas of research. Front matter is “News and Views” pieces, often found at the “front” of scientific journals that explain newly published research and put it in context. This often includes a discussion of explaining how the research is an important advance and its broader societal implications.
Both of these functions could easily be provided in a more up to date and scalable manner by tapping into a global community of experts. Wikipedia articles are often the top web search result for initial queries in many research areas and these articles are a major source of traffic for scientific journals. As the first port of call for many users of research and a significant discovery route the potential for Wikipedia as a form of dynamic, expertly curated “front matter” for the whole research literature is substantial. This facilitated discussion session will focus on how this role could be enhanced, what is currently missing and what risks exist in taking this route.
Reading this I wondered about the extent to which Wikipedia articles currently link to papers hosted in institutional repositories.
In order to explore this question I used Wikipedia’s External links search tool to count the number of links from Wikipedia pages to the institutional repositories provided by the Russell Group universities.
The survey was carried out on 28 August 2014 using this service. Note that the current findings can be obtained by following the link in the final column.
|Institutional Repository Details||Nos. of links
Institution: University of Birmingham
Repository used: eprint Repository (http://eprints.bham.ac.uk/)
Institution: University of Bristol
Repository used: ROSE (http://rose.bris.ac.uk/)
Institution: University of Cambridge
Repository used: Dspace @ Cambridge (http://www.dspace.cam.ac.uk/)
Institution: Cardiff University
Repository used: ORCA (http://orca.cardiff.ac.uk/)
Institution: University of Durham
Repository used: DRO (http://dro.dur.ac.uk/)
Institution: University of Edinburgh
Repository used: ERA (http://www.era.lib.ed.ac.uk/)
Institution: University of Exeter
Repository used: ERIC (https://eric.exeter.ac.uk/repository/)
Institution: University of Glasgow
Repository used: Enlighten (http://eprints.gla.ac.uk/)
Institution: Imperial College
Repository used: Spiral (http://spiral.imperial.ac.uk/)
Institution: King’s College London
Repository used: King’s Research Portal (https://kclpure.kcl.ac.uk/portal/)
Institution: University of Leeds
Repository used: White Rose Research Online (http://eprints.whiterose.ac.uk/)
Institution: University of Liverpool
Repository used: University of Liverpool Research Archive (http://research-archive.liv.ac.uk/)
Institution: London School of Economics
Repository used: LSE Research Online (http://eprints.lse.ac.uk/)
Institution: University of Manchester
Repository used: eScholar (https://www.escholar.manchester.ac.uk/)
Institution: Newcastle University
Repository used: Newcastle Eprints (http://eprint.ncl.ac.uk/)
Institution: University of Nottingham
Repository used: Nottingham Eprints (http://eprints.nottingham.ac.uk/)
Institution: University of Oxford
Repository used: ORA (http://ora.ouls.ox.ac.uk/)
Institution: Queen Mary, University of London
Repository used: QMRO (https://qmro.qmul.ac.uk/)
Institution: Queen’s University Belfast
Repository used: QUB Research Portal (http://pure.qub.ac.uk/portal/)
Institution: University of Sheffield
Repository used: The University of Sheffield uses the White Rose repository which is also used by Leeds and York. See the Leeds entry for the statistics.
Institution: University of Southampton
Repository used: eprints.soton (http://eprints.soton.ac.uk/)
Institution: University College London
Repository used: UCL Discovery (http://discovery.ucl.ac.uk/)
Institution: University of Warwick
Repository used: WRAP (http://wrap.warwick.ac.uk/)
Institution: University of York
Repository used: The University of York uses the White Rose repository which is also used by Leeds and Sheffield. See the Leeds entry for the statistics.
- The URLs of the repositories are taken from the OpenDOAR service.
- Since the universities of Leeds, Sheffield and York share a repository, the figures are provided in the entry for Leeds.
- A number of institutions appear to host more than one research repository. In such cases the repository which appears to be the main research repository for the institution is used.
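The counts in the table were gathered by hand with the External links search tool. As a rough sketch of how a similar count might be scripted, the Python below builds a query against the MediaWiki API's `exturlusage` list, which to the best of my understanding exposes the same link data. This is not the method used for the survey: the endpoint, the parameter choices and the helper names `build_exturlusage_url` and `count_links` are assumptions made for illustration, and they should be checked against the API documentation before use. The code only constructs the request and tallies responses that have already been fetched; it does not contact Wikipedia itself.

```python
from urllib.parse import urlencode

# Assumption: the English Wikipedia API endpoint.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_exturlusage_url(domain, namespace=None, limit=500, protocol="http"):
    """Build an API URL listing pages that link to `domain` (hypothetical helper)."""
    params = {
        "action": "query",
        "list": "exturlusage",
        "euquery": domain,        # e.g. "eprints.gla.ac.uk"
        "euprotocol": protocol,   # protocol of the linked URLs
        "eulimit": limit,         # results per request
        "euprop": "title|url",    # return the linking page and the target URL
        "format": "json",
    }
    if namespace is not None:
        # Restrict results to one namespace (0 is the article namespace).
        params["eunamespace"] = namespace
    return API_ENDPOINT + "?" + urlencode(params)


def count_links(responses):
    """Count link records across a sequence of already-decoded JSON responses."""
    return sum(
        len(response.get("query", {}).get("exturlusage", []))
        for response in responses
    )
```

Large repositories are likely to need several requests, so a real script would also have to follow the API's continuation values and pass each decoded response to `count_links`.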
The Survey Methodology
It should be noted that this initial survey does not pretend to provide an answer to the question “How many research papers hosted by institutional repositories provided by Russell Group universities are cited in Wikipedia articles?” Rather the survey reflects the use of this blog as an ‘open notebook’ in which the initial steps in gathering evidence are documented openly in order to solicit feedback on the methodology. This post also documents flaws and limitations in the methodology so that others who may wish to use similar approaches are aware of them. Possible ways in which such limitations can be addressed are given and feedback is welcomed.
In particular it should be noted that the search engine used in the survey covers all public pages on the Wikipedia web site and not just Wikipedia articles. It includes Talk pages and user profile pages.
In addition the repository web sites include a variety of resources and not just research papers; for example it was observed that some user profile pages for researchers provide links to their profile on their institutional repository.
It was also noticed that some of the files linked to from Wikipedia were listed in the search results as PDFs. Since it seems likely that PDFs hosted on institutional repositories and referenced on Wikipedia will be research papers, a more accurate reflection of the number of repository-hosted research papers which are cited in Wikipedia may be obtained by filtering the findings to include only PDF results.
In addition, if the findings from the search tool were restricted to Wikipedia articles only (omitting Talk pages, user profile pages, etc.) we should get a better understanding of the extent to which Wikipedia is being used as the “front matter” to research hosted in Russell Group university institutional repositories.
If any Wikipedia developers would be interested in taking up this challenge, this could help to provide a more meaningful benchmark which could be useful in monitoring trends.
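As one possible starting point, the two refinements suggested above (keeping only links from Wikipedia articles, and keeping only links to PDFs) could be sketched in Python as follows. The record layout, with an `ns` namespace number and a `url` field, is an assumption about how the link data might be represented, and the function names are hypothetical. Treating a URL whose path ends in `.pdf` as a research paper is only a heuristic, as noted above.

```python
from urllib.parse import urlsplit

# In MediaWiki, namespace 0 is the main (article) namespace;
# Talk pages are namespace 1 and user pages are namespace 2.
ARTICLE_NAMESPACE = 0


def is_pdf(url):
    """Heuristic: treat a URL as a PDF if its path ends in '.pdf'."""
    return urlsplit(url).path.lower().endswith(".pdf")


def filter_article_pdf_links(records):
    """Keep only records that come from articles and point at PDF files."""
    return [
        record
        for record in records
        if record.get("ns") == ARTICLE_NAMESPACE and is_pdf(record.get("url", ""))
    ]


# Illustrative records only; these are not survey data.
sample = [
    {"ns": 0, "title": "Example article", "url": "http://repo.example.ac.uk/1/paper.pdf"},
    {"ns": 1, "title": "Talk:Example article", "url": "http://repo.example.ac.uk/1/paper.pdf"},
    {"ns": 0, "title": "Another article", "url": "http://repo.example.ac.uk/profile/42"},
]
article_pdf_links = filter_article_pdf_links(sample)
```

Only the first sample record passes both tests: the second comes from a Talk page and the third links to a profile page rather than a PDF.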
Policy Implications of Encouraging Wikipedia to Act as the Front Matter to Research
There are risks when gathering such data that observers with vested interests will seek to make too much of the findings if they suggest a league table, particularly if there seem to be runaway leaders.
However, as can be seen from the accompanying pie chart, in this case no single institutional repository has more than 17% of the total number of links (and remember that these figures are flawed for the reasons summarised above).
There will, however, be interesting policy implications if universities agree with the suggestion that Wikipedia can act as “the front matter to all research”, especially if links from Wikipedia to the institution’s repository result in increased traffic to the repository. Another way of characterising the proposal would be to suggest that Wikipedia can act as “the marketing tool to an institution’s research outputs”.
Earlier today I came across an article entitled “So who’s editing the SNHU Wikipedia page?” which described how analysis of editing patterns and deviations from the norm may be indicative of inappropriate Wikipedia editing strategies, such as paid-for updates to institutional Wikipedia articles.
If we wish to see Wikipedia acting as the front matter to research provided by the university sector, should we be seeking to develop a similar statement on how we will do this whilst ensuring that we act in accordance with Wikipedia’s policies and guidelines? Of course the challenge would then be to identify what the appropriate best practices should be.