Jisc Feasibility Study on Identifying Digital Repository Infrastructure Solutions for Small-to-medium Digital Projects

Yesterday, in a post entitled “Linkage: Funding, licensing, managing research data, LTI, Google Analytics cohort analysis and more”, Martin Hawksey reflected on a Jisc call for a “Feasibility Study on digital repository infrastructure solutions for ‘unsupported’ digital assets”. The call document describes how:

The feasibility study is required to identify sustainable digital repository infrastructure solutions for digital assets from small-to-medium digital projects. These assets may originate from arts organisations, cultural heritage institutions, community groups and small organisations in the area of the arts, cultural heritage, medicine and science etc that may not have access to a sustainable digital repository infrastructure.  

Jisc has invested large amounts of money in the development of a repository infrastructure for the sector. But what exactly do we mean by an institutional repository and what purposes do they serve? There’s a danger, I feel, in looking for answers to these questions from within the sector – the ‘echo chamber’ may well decide that institutional repositories can provide the functions required by Jisc funding and, surprise, surprise, they may do that well!

Looking at the ‘institutional repository’ article in Wikipedia we find that:

An institutional repository is an online locus for collecting, preserving, and disseminating – in digital form – the intellectual output of an institution, particularly a research institution.

The article goes on to describe how:

The four main objectives for having an institutional repository are:

  1. to provide open access to institutional research output by self-archiving it;
  2. to create global visibility for an institution’s scholarly research;
  3. to collect content in a single location;
  4. to store and preserve other institutional digital assets, including unpublished or otherwise easily lost (“grey”) literature (e.g., theses or technical reports).

But how does an institutional repository differ from a content management system (CMS)? According to Wikipedia a CMS:

is a computer program that allows publishing, editing and modifying content as well as maintenance from a central interface. Such systems of content management provide procedures to manage workflow in a collaborative environment.

The article summarises the main features of a CMS:

The core function and use of content management systems is to present information on websites. CMS features vary widely from system to system. Simple systems showcase a handful of features, while other releases, notably enterprise systems, offer more complex and powerful functions. Most CMS include Web-based publishing, format management, revision control (version control), indexing, search, and retrieval. The CMS increments the version number when new updates are added to an already-existing file. 

Is the distinction clear? Not to me, especially as the list of the features of a CMS goes on to describe how:

A CMS may serve as a central repository containing documents, movies, pictures, phone numbers, scientific data. CMSs can be used for storing, controlling, revising, semantically enriching and publishing documentation.

So perhaps, rather than exploring only traditional repository software, the feasibility study should also explore possible CMS solutions.

The External Repository

But must an institutional repository be hosted within the institution? After all, if the point is “to create global visibility for an institution’s scholarly research”, might not that be achieved by using a popular Cloud-based repository service? The ‘Google juice’ which such services can provide will help address the limited Google juice available for institutional services (especially those of smaller institutions), which will have relatively small numbers of inbound links. Indeed, as described in a paper which asked “Can LinkedIn and Academia.edu Enhance Access to Open Repositories?”, even prestigious Russell Group universities do not have the Google ranking provided by services which are used globally.

Yesterday I received an email which suggested that such services are already being widely used. The message, from ResearchGate, began:

Brian, we’ve put together stats for thousands of institutions based on ResearchGate members.

Following the link from the email message I found a list of the UK institutions with the highest ResearchGate scores (as illustrated). The ResearchGate Score, incidentally, “measures reputation and impact based on how a researcher’s work is received by their peers. This list shows institutions by the sum of the RG Scores of their individual members using ResearchGate”. But rather than being sidetracked by a discussion about what such scores mean, for me the more relevant question is whether third-party repository services such as ResearchGate and Academia.edu have a role to play for small institutions which do not have the technical expertise to manage a conventional institutional repository service.

The Challenges Facing Small Projects and Unsupported Digital Assets

The development culture in large, well-funded research-led institutions has encouraged the use of in-house solutions, often based on open source software. But in light of funding difficulties in the sector and the growing maturity of Cloud-based solutions it might be relevant, especially for small to medium sized projects, to consider such solutions. Clearly there will be a need to consider the sustainability of such services and possible changes to terms and conditions. But such issues can be addressed. There is also a need to consider the sustainability of in-house solutions – an issue which is very relevant to the services and content provided by UKOLN in light of the significant downsizing the organisation will face in less than two weeks’ time!

The Invitation to Tender document describes how:

The candidate models and subsequent options/ recommendations will need to take into account the level of digital skills capabilities and capacity of small-to-medium projects. Options will need to be easy to implement for non-specialists contributors to any potential solution(s).

Ironically, the document itself illustrates a lack of skills in best practices for using Microsoft Word. As illustrated below, the author of the document created a line feed at the end of each line, rather than using MS Word line wrap for sentences. This will result in ugly line breaks if the font face or size is changed in the document and could cause problems if the document is converted into other formats. I also noticed that MS Word styles had not been used, which means that the document has no logical structure. As well as meaning that an automated table of contents cannot be generated, this can also cause accessibility problems for people who use screen readers.

MS Word for Jisc ITT Document

Put simply, the document itself illustrates the challenges which may well be faced when content creators have limited digital skills capabilities.
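
As an aside, the hard line feed problem described above is the kind of thing which could be checked for automatically. The following is a minimal Python sketch rather than a definitive tool: it assumes the document has been saved as plain text, so that each manual line feed appears as a newline, and the function name `looks_hard_wrapped`, its parameters and its thresholds are hypothetical choices made for illustration, not anything taken from the ITT or from MS Word.

```python
def looks_hard_wrapped(text, min_lines=4, tolerance=12):
    """Heuristic guess at whether text has a manual line feed on every line.

    Assumption: `text` is a plain-text export of one block of a document.
    The thresholds below are illustrative, not calibrated.
    """
    # Ignore blank lines and trailing whitespace.
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < min_lines:
        return False  # too little text to judge

    # The final line of a paragraph is often short, so leave it out.
    body = lines[:-1]
    longest = max(len(ln) for ln in body)

    # Hard-wrapped lines tend to stop close to the same right margin...
    near_margin = sum(1 for ln in body if longest - len(ln) <= tolerance)
    # ...and tend to stop in mid-sentence.
    mid_sentence = sum(
        1 for ln in body if not ln.endswith((".", "!", "?", ":"))
    )

    return (near_margin / len(body) >= 0.8
            and mid_sentence / len(body) >= 0.5)
```

A check along these lines could be run before content is deposited, so that contributors with limited digital skills are prompted to fix the source document rather than being expected to know the best practices in advance.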

I welcome this call as it encourages solutions which are applicable to the real-world environment, in which content providers may well create content which is not amenable to processing in the ways which the systems designers may have expected.