Profiling Use of Third-Party Research Repository Services


How significant is use of third-party repository services?

In a recent post I explained Why I’m Evaluating ResearchGate. In the post I summarised the reasons why I felt that ResearchGate could provide an additional service for depositing research papers which would complement Opus, the University of Bath institutional repository. But what other services might also be relevant? And which services are hosting the largest numbers of research papers?

In order to seek answers to these questions, I used Google to estimate the size of a number of hosting services: the total number of resources each service hosts and the number of PDFs among them. The services I analysed were:

  • ResearchGate: This site is described in Wikipedia as “a social networking site for scientists and researchers to share papers, ask and answer questions, and find collaborators. The site has been described as a mash-up of “Facebook, Twitter and LinkedIn” that includes “profile pages, comments, groups, job listings, and ‘like’ and ‘follow’ buttons”. Members are encouraged to share raw data and failed experiment results as well as successes, in order to avoid repeating their peers’ scientific research mistakes”.
  • Academia.edu: This site is described in Wikipedia as “a platform for academics to share research papers. It was launched in September 2008. Currently the site is approaching 2 million registered users. The platform can be used to share papers, monitor their impact, and follow the research in a particular field”.
  • Mendeley: This site is described in Wikipedia as “a desktop and web program for managing and sharing research papers, discovering research data and collaborating online. It combines Mendeley Desktop, a PDF and reference management application (available for Windows, Mac and Linux) with Mendeley Web, an online social network for researchers. Mendeley requires the user to store all basic citation data on its servers – storing copies of documents is at the user’s discretion”.
  • CiteULike: This site is described in Wikipedia as “based on the principle of social bookmarking [the service] is aimed to promote and to develop the sharing of scientific references amongst researchers. In the same way that it is possible to catalog web pages (with Furl and delicious) or photographs (with Flickr), scientists can share information on academic papers with specific tools (like CiteULike) developed for that purpose”.
  • Scribd: This site is described in Wikipedia as “a document-sharing website that allows users to post documents of various formats, and embed them into a web page using its iPaper format”.

Many researchers will probably be familiar with the first four services listed. The fifth service, Scribd, is included in order to explore whether a general-purpose PDF repository service could have a role to play in supporting the sharing of research publications.

Findings for the Coverage of the Services

Google was used in order to provide an estimate of the coverage of the services, including the total number of resources which have been indexed by Google and the number of PDF files. The findings are given in the following table. Note that the figures were initially collected on 6 February 2013. In order to check the volatility of the findings the searches were repeated on 11 February.

Search for                  Search term   Nos. of results   Date
Total number of resources                      55,300,000    6 Feb 2013
                                               56,100,000   11 Feb 2013
Total number of PDF files   filetype:pdf        2,980,000    6 Feb 2013
                                                2,910,000   11 Feb 2013

Total number of resources                      12,500,000    6 Feb 2013
                                               12,400,000   11 Feb 2013
Total number of PDF files   filetype:pdf            4,930    6 Feb 2013
                                                    4,740   11 Feb 2013

Total number of resources                       3,310,000    6 Feb 2013
                                                3,150,000   11 Feb 2013
Total number of PDF files   filetype:pdf            3,840    6 Feb 2013
                                                    4,020   11 Feb 2013

Total number of resources                      35,600,000    6 Feb 2013
                                               35,700,000   11 Feb 2013
Total number of PDF files   filetype:pdf              244    6 Feb 2013
                                                       30   11 Feb 2013

Total number of resources                      61,300,000    6 Feb 2013
                                              166,000,000   11 Feb 2013
Total number of PDF files   filetype:pdf                –    6 Feb 2013
                                              371,000,000   11 Feb 2013

Total number of resources                      10,300,000    6 Feb 2013
                                               26,100,000   11 Feb 2013
Total number of PDF files   filetype:pdf           48,800    6 Feb 2013
                                                   48,800   11 Feb 2013
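
The volatility check described above amounts to comparing the two counts recorded for each search. The following is a minimal Python sketch of that comparison, using four pairs of figures copied from the table. The function name `relative_change` and the row labels are my own, introduced purely for illustration; the labels identify positions in the table, not services.

```python
def relative_change(first: int, second: int) -> float:
    """Percentage change from the first count to the second."""
    if first == 0:
        raise ValueError("first count must be non-zero")
    return (second - first) / first * 100


# (6 Feb 2013, 11 Feb 2013) result counts copied from the table above.
# The labels are illustrative: they name table rows, not services.
counts = {
    "first block, resources": (55_300_000, 56_100_000),
    "first block, PDF files": (2_980_000, 2_910_000),
    "fourth block, PDF files": (244, 30),
    "fifth block, resources": (61_300_000, 166_000_000),
}

changes = {label: relative_change(a, b) for label, (a, b) in counts.items()}

for label, pct in changes.items():
    print(f"{label}: {pct:+.1f}%")
```

On these figures the counts in the first block move by only a few percent between the two dates, while the other two rows change far more sharply, which is a reminder that Google's result counts are estimates and should be treated as indicative only.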

It seems that Scribd hosts a very large number of resources (although an initial finding of 3 PDF resources was discarded as the result seemed to be unreliable).

However, since Scribd is a general-purpose repository service, it was felt that ResearchGate, which hosts a large number of PDF resources, is more relevant for researchers. In light of this confirmation of the popularity of ResearchGate, an additional survey was carried out which reported on use of the service across Russell Group universities.

Findings for Institutional Use of Academia.edu and ResearchGate

On 1 August 2012 a Survey of Use of Researcher Profiling Services Across the 24 Russell Group Universities was published on this blog. This survey has been repeated in order to detect changes in the use of ResearchGate. Since the original survey also provided an analysis of Academia.edu, this service was also included in the current survey. The results are given in the following table. Note that the data is also available in Google Spreadsheets.

                                Academia.edu (members)   ResearchGate, Aug 2012   ResearchGate, Feb 2013
    Institution                   Aug 2012   Feb 2013*    Members  Publications    Members  Publications
 1  University of Birmingham         1,210       1,562        782        19,515      1,439        22,068
 2  University of Bristol            1,018       1,189        641        21,249      1,251
 3  University of Cambridge          3,020       3,439        972        39,713      1,699        42,419
 4  Cardiff University                 906       1,071        646         9,596      1,272        10,696
 5  Durham University                1,001       1,189        273         1,151        662         7,152
 6  University of Exeter               919       1,106        269         5,150        652         6,191
 7  University of Edinburgh          2,079       2,479      1,181        25,918      2,065        28,486
 8  University of Glasgow            1,004       1,212        613        20,041      1,224        21,733
 9  Imperial College                   798         896      1,096        30,404      1,377        34,202
10  King’s College London            1,420       1,748      1,406        18,264      2,241        23,391
11  University of Leeds              1,657       1,871        848        16,944      1,455
12  University of Liverpool            866         989        582        16,475      1,146        18,749
13  London School of Economics       1,131       1,354        191         1,838        407         2,449
14  University of Manchester         2,279       2,590      1,113        25,139      2,188        29,675
15  Newcastle University               906       1,039        704        17,307      1,348        17,376
16  University of Nottingham         1,299       1,529        970        20,513      1,559        20,145
17  University of Oxford             3,842       4,469      1,221        38,224      1,967        39,861
18  Queen Mary                         715         849        228         5,232        898
19  Queen’s University Belfast         689         774        479        10,750        864        11,699
20  University of Sheffield          1,082       1,235        823        18,127      1,659        20,149
21  University of Southampton        1,083       1,265        670        16,887      1,371        18,325
22  University College London        2,776       3,162      1,624        35,035      2,878        38,550
23  University of Warwick            1,143       1,349        448         8,098        873         9,334
24  University of York                 986       1,180        386         4,841        696
    TOTAL                           33,829      39,546     18,166       426,414     33,191       477,103
    Increase (%)                                 16.9%                               82.7%         11.9%

Note: * As described in the previous survey, the numbers of members are obtained by entering the name of the institution in the search box.


[Figure: Nos. of items deposited in ResearchGate in Aug 2012 (blue) & Feb 2013 (red)]

[Figure: Nos. of ResearchGate members in Aug 2012 (blue) & Feb 2013 (red)]

As illustrated in the accompanying diagrams, the number of researchers at Russell Group universities who have signed up for a ResearchGate account has grown significantly over the past six months and now stands at over 33,000 users, a growth of 82.7%. The number of papers which have been deposited by researchers at Russell Group universities has also grown, to a total of over 477,000 items. However, since this represents a growth of only 11.9% over six months, far below the growth in membership, it suggests that new members are providing metadata records only and not depositing the full text.
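
The two growth figures quoted above can be reproduced from the TOTAL row of the table. A short Python sketch follows. The function and variable names are my own, and the items-per-member ratio is an extra calculation added for illustration rather than a figure reported in the survey.

```python
def growth_pct(old: int, new: int) -> float:
    """Percentage growth from old to new, relative to the old value."""
    return (new - old) / old * 100


# ResearchGate totals for the 24 Russell Group universities (TOTAL row).
members_aug_2012, members_feb_2013 = 18_166, 33_191
items_aug_2012, items_feb_2013 = 426_414, 477_103

member_growth = growth_pct(members_aug_2012, members_feb_2013)
item_growth = growth_pct(items_aug_2012, items_feb_2013)

# Illustrative extra: average number of items per member at each date.
items_per_member_aug = items_aug_2012 / members_aug_2012
items_per_member_feb = items_feb_2013 / members_feb_2013

print(f"Members: {member_growth:.1f}%")  # 82.7%
print(f"Items:   {item_growth:.1f}%")    # 11.9%
print(f"Items per member: {items_per_member_aug:.1f} -> {items_per_member_feb:.1f}")
```

On these totals the average falls from about 23.5 items per member to about 14.4, which is consistent with membership growing much faster than the number of deposited items.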

I therefore conclude that the views I set out in my post Why I’m Evaluating ResearchGate were correct: ResearchGate is a service which I should use not only to provide a presence for my research activities but also to host my research papers. I do wonder, though, whether the large number of items which have been deposited in ResearchGate is due to promotion of the service within the Russell Group universities or represents a bottom-up approach, in which researchers have recognised the benefits of the service and recommended it to their peers.

