In a recent post on “Surveying Russell Group University Use of Google Scholar Citations” I used the Google Scholar Citations search facility to audit the number of researchers in the twenty Russell Group Universities who have claimed profiles on the service.

Looking at my own host institution, which is a member of the 1994 Group, at the time of writing there are 33 profiles for a search for the University of Bath but only 23 profiles for a search for the “University of Bath”. The findings therefore differ depending on the search syntax, such as whether the search term is enclosed in quotes or not.

There is therefore a need to be explicit about the way in which the searches are constructed in order to ensure that findings are reproducible. In previous surveys I have tried to document the survey methodology in the text of the blog posts, but it has occurred to me that the specific details may be overlooked. I therefore feel that further surveys should include explicit details of the survey paradata, a term which is defined in Wikipedia as “data about the process by which the survey data were collected”.

The blog posts I have published have, wherever possible, provided live links to the services used to gather the data. Such links may contain parameters which differ depending on factors such as the browser environment you are using. The hyperlink used for the search described above, for example, is:

As described in the Google documentation:

The hl parameter specifies the interface language (host language) of your user interface. To improve the performance and the quality of your search results, you are strongly encouraged to set this parameter explicitly.
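The effect of fixing the hl parameter, and of quoting the search term, can be sketched in a few lines of Python. This is only an illustration: the endpoint `https://example.org/search` and the query parameter name `q` are placeholders I have made up, not the real service URL; only `hl` comes from the documentation quoted above.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and query parameter name ("q"), for illustration only.
BASE_URL = "https://example.org/search"


def build_search_url(term, quoted=False, hl="en"):
    """Return a search URL with the interface language (hl) set explicitly."""
    # Wrapping the term in double quotes asks for an exact-phrase search.
    query = f'"{term}"' if quoted else term
    return f"{BASE_URL}?{urlencode({'q': query, 'hl': hl})}"


unquoted_url = build_search_url("University of Bath")
quoted_url = build_search_url("University of Bath", quoted=True)

print(unquoted_url)  # https://example.org/search?q=University+of+Bath&hl=en
print(quoted_url)    # https://example.org/search?q=%22University+of+Bath%22&hl=en
```

Recording the full URL produced in this way, rather than just the search term, makes it clear whether quotes were used and which interface language was requested.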

The search described above is a simple query. However, the Google search box in my browser produces the following URL as a result of a search for google scholar citations:

In order to ensure that a rich description of the survey environment is available, my intention is that surveys published in future will contain survey paradata details along the lines illustrated in the following table, which describes the survey published in the recent post.

| Details | Description | Data | Note |
|---|---|---|---|
| Search term | The official name of the host institution. | Column 1 | Name is not included in quotes. |
| Date | The date of the survey. | 24 November 2011 | If the survey is carried out across several days, this should be documented. |
| Search service | Google Scholar Citations service. | | If, for example, a UK version of the service is released, this should be documented. |
| Browser environment | Name & version of browser and platform. | Safari v 5.5.1 running on an Apple Macintosh | Include details of browser plugins if this is felt to be relevant. |
| Language | The default language (English) is used. | EN | |
| Search options | Search options selected. | Used the “Search Authors” option. | If additional search options are available they should be documented. |
| Location | Details of where the survey was carried out. | Search carried out in Bath, UK. | |
| User account | Information on whether surveyor was logged in. | Search carried out whilst logged in to Google. | |

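
The paradata in the table could also be published in a machine-readable form alongside the survey results. The following is a minimal sketch, assuming a Python dataclass serialised to JSON; the class name `SurveyParadata` and its field names are my own invention, chosen to mirror the rows of the table, and are not part of any existing standard.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SurveyParadata:
    """Hypothetical record mirroring the rows of the paradata table."""
    search_term: str
    date: str
    search_service: str
    browser_environment: str
    language: str
    search_options: str
    location: str
    user_account: str


# Values taken from the table describing the recent survey.
paradata = SurveyParadata(
    search_term="Official name of the host institution, not in quotes",
    date="24 November 2011",
    search_service="Google Scholar Citations",
    browser_environment="Safari v 5.5.1 running on an Apple Macintosh",
    language="EN",
    search_options="Search Authors",
    location="Bath, UK",
    user_account="Logged in to Google",
)

paradata_json = json.dumps(asdict(paradata), indent=2)
print(paradata_json)
```

A record like this could be attached to each survey so that anyone repeating it can see at a glance which conditions they need to match.
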
Possible problem areas
  • There may be name clashes (e.g. University of Newcastle and University of Newcastle, New South Wales).
  • Searches may include email address fields as well as the name of the institution.
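
The name-clash problem can be checked for before a survey is run. The sketch below is one simple approach, assuming that a clash is likely whenever one institution's name is contained within another's; the function name `find_name_clashes` is hypothetical and the check is deliberately crude.

```python
def find_name_clashes(names):
    """Return (shorter, longer) pairs where one name is contained in another."""
    clashes = []
    for shorter in names:
        for longer in names:
            # Case-insensitive containment; a name never clashes with itself.
            if shorter != longer and shorter.lower() in longer.lower():
                clashes.append((shorter, longer))
    return clashes


institutions = [
    "University of Bath",
    "University of Newcastle",
    "University of Newcastle, New South Wales",
]

clashes = find_name_clashes(institutions)
print(clashes)
```

Any pair reported here is a candidate for manual checking, since an unquoted search for the shorter name may also return profiles belonging to the longer one.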

Any suggestions on things I may have missed?