Tuesday, April 08, 2008

When HEFCE underspends: a £22 million JISC digitisation project

In 2004, a £10 million HEFCE underspend [crikey Moses!] resulted in a windfall for JISC: Jean Sykes recounts being told to "spend this in 2-3 years on large scale digitisation projects, please."

JISC reviewed a list of extant proposals for content digitisation, but considered it important to consult the community and bring new bids to the table. Six major projects were selected for Phase 1 - the largest digitisation activity in Europe - ranging from 18th-century British parliamentary papers to British Library archival sound recordings. [Was it these chaps who had Charlotte Green in stitches last week?] The latter group set up a user panel to help decide which of the masses of recordings in the archive should be prioritised for digitisation.

Standards had to be agreed across all projects, and multimedia in particular presented a variety of obstacles. But from this, a JISC digitisation strategy is emerging. Lessons were learned:
  • user consultation (do it - and get some experts in)
  • procurement (technical and commercial issues)
  • metadata (metadata, metadata - importance cannot be overstated - build it in from the outset)
  • quality assurance and evaluation throughout the project
  • impact assessment (an increasingly big deal - projects now need to build in licences and metrics from the start)
  • project management - and capturing of lessons learned
  • interface accessibility
  • promotion of the finished service.

Phase 2 covers another 16 projects with a further £12m funding from JISC (big up those crazy HEFCE underspends!). Seven thousand reel-to-reels! Four thousand hours of recordings! Fifteen thousand Giles cartoons! Three thousand high-quality Pre-Raphaelite images! Fifteen thousand theatrical objects! Half a million pages of Cabinet Papers! Over one million pages from national, regional and local newspapers! Five thousand university theses! Great War poetry and contextual archive material! [Apologies for all that terribly unliterary exclamation, but really, the breadth and scale of this stuff is staggering - did I already say Three cheers for HEFCE underspends?] Phase 1 and 2 projects will be free at the point of use to UK HE and FE, and some to schools and public libraries.

And now they're already preparing for Phase 3 (and here was I thinking Phase 3s are merely the product of an over-optimistic imagination). Work is underway to assess the impact and usage of Phase 1 projects, which unfortunately did not have statistics built in from the outset, so some qualitative indicators will need to be used. A gap analysis will be conducted to assess the community's needs, and the development of thematic portals will be investigated to make resources more comparable and usable (these could be extended to cover JISC collections, too). Future sustainability remains a big challenge - keeping digitised content accessible; migrating it to future formats and platforms; updating collections with new content. Ultimately, librarians may need to be prepared to subscribe to this content to ensure its preservation.


Maximising access to, and understanding of, major archives

Dan Jones owns the Domesday Book.

Well, not quite, but it is housed in the National Archives, where he works. The Domesday Book is just one of the 60 million documents available for immediate electronic download (cripes!). Their approach is driven by changing user behaviour (increasing web literacy and expectations) and the pervasiveness of high-bandwidth broadband. But in digitising their archives they must contend with over 175km of shelving and over 10 million catalogue entries; Dan's "back of the fag packet" estimate of the cost to digitise all this data is over £5 billion (double cripes!).

Models of digitisation
The Archives' digitisation activities are funded from internal budgets, commercial investment and grants. Segmentation of the target markets [and presumably funders' mandates?] informs decisions about which services are charged for, and which are free at the point of use.

Strategic partnerships
Content is digitised in different ways depending on demand: strategic partners are contracted for high-demand items, and digital assets, once created, are non-exclusive - i.e. available for repurposing within other services. One current project is the 1911 census. Five scanners are running round the clock to create 40,000 images per day; these are QA'd and transcribed in the Philippines, enabling details of over 35 million individuals to be comprehensively searched. The data will lend itself to use by genealogists, academics and schools, and to statistical analysis. The strategic partnership through which the project is operated minimises the risk for the National Archives and allows them (as a facilitator) to carry out other work at the same time. But there is potential for the partners' interests to diverge, and the project's agenda has to be balanced to represent the interests of a broader stakeholder group.

Internal delivery
The JISC-funded project to digitise Cabinet Papers from 1916-75 is complex - the papers are handwritten, and don't lend themselves well to digitisation.

Providing context
Autonomy search has been deployed to provide an integrated search function across all databases and websites. Newer archives have been loaded into a wiki-based resource which allows individual experts to contribute their ideas and information; "some of our users are far more expert in particular areas of these holdings than we are ourselves".

Future challenges
  • Georeferencing will allow map collections to be unlocked
  • organic growth of legacy systems means experts need considerable training to operate systems (so tools need to be developed)
  • customer tools also need to be constantly reconsidered
  • projects and programmes need to be financially sustainable.
