Working at the Arts and Humanities Data Service, as Programme Manager for Jisc’s Digitisation and Content Programme, and at The European Library, I’ve been lucky enough to work with hundreds of digital resource projects in the UK and beyond.
Most of them have had fabulous, engaging content. But many have had serious problems in working out where to go after the initial project digitisation has ended.
The Old Bailey Online, on the other hand, has been striking in expanding beyond the initial digitisation work and exploring the implications of having such a large transcribed corpus available. So we have seen:
Criminal Intent – Not just full-text search – Undertaking big data analysis over the corpus
Connected Histories – Searching over multiple related sources related to early modern Britain
Locating London’s Past – Geographically mapping crime and other social issues
Voices from the Old Bailey – Public History on the radio
Garrow’s Law – Inspiring and informing a BBC TV drama series
Using the Old Bailey for Teaching
Exploring the issues of citation and impact (pdf)
Plenty of other routes have been discussed and challenged – eBooks being written which dynamically cite data from the website; student unconferences; exposing the API to various other resources.
Many digital resources offer the chance to do radically new types of scholarship and public engagement, but we’ve not always grasped the opportunity to do so. The team behind the Old Bailey Online have done this, sustaining not just the original resource, but pushing it down new methodological and digital channels that others have feared to tread.
One of our tasks for 2013 is to create a single, aggregated linked open data set from this data. Unsurprisingly, one of the biggest challenges is the related licensing: approaching each member library to get agreement over the terms and conditions of the linked open data.
From the individual library’s point of view, they need the chance to discuss and ensure that the European Library dataset they are contributing to fits in with their own metadata strategies. From the potential end user’s point of view, they need a data set that is as rich as possible and has harmonised licensing.
The success of the work will be in balancing these two viewpoints.
I hope most libraries will be keen to release at least basic metadata as CC0. This follows the trends at libraries such as the British Library and the National Library of Sweden (the National Library of France, meanwhile, has an open licence which equates to CC-BY).
The first issue will be in defining what is meant by basic metadata. Each library might have a different view. Some libraries might be happy to share their entire metadata set; others may only want to release a limited set of elements as CC0.
But I suspect the more difficult issue will be the licence terms for data that libraries do not wish to release as basic CC0. Some may choose similar open licences, others may opt for more restrictive terms.
The difficulty arises when you then want to put all this metadata together in one unified corpus. In such a case it would need to be released under the most restrictive licence – the lowest common denominator. So even if 95% of the libraries gave The European Library permission to release their full metadata under licence Type A, the complete aggregated dataset would have to be released under the licence Type B stipulated by the other 5%.
Thus the real pressure in this task is trying to convince the libraries to use similar terms for their licensing, creating the richest possible dataset.
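The lowest-common-denominator effect described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the licence names, their ranking from most open to most restrictive, and the library names are all hypothetical, and real licence compatibility is rarely a simple linear scale.

```python
# Hypothetical ranking of licences from most open (0) to most restrictive.
# Real licence compatibility is not a simple linear scale; this ordering is
# an illustrative assumption only.
RESTRICTIVENESS = {"CC0": 0, "CC-BY": 1, "CC-BY-NC": 2}


def aggregate_licence(library_licences):
    """Return the most restrictive licence among the contributing libraries."""
    if not library_licences:
        raise ValueError("at least one library licence is required")
    return max(library_licences.values(), key=RESTRICTIVENESS.__getitem__)


# 19 hypothetical libraries (95%) choose CC0; one (5%) chooses CC-BY-NC.
licences = {f"library_{i}": "CC0" for i in range(19)}
licences["library_19"] = "CC-BY-NC"

combined = aggregate_licence(licences)
print(combined)  # the single restrictive choice governs the whole aggregate
```

However many libraries opt for the open licence, one more restrictive choice determines the terms of the combined dataset, which is why harmonising terms matters more than winning over a majority.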
Not much sign of the digital, alas! http://www.ahrc.ac.uk/News-and-Events/News/Pages/AHRC-Strategy-2013-2018.aspx
The National Library of Wales has just launched its Welsh Newspapers Online, which will make available over 1m pages of digitised newspapers from their holdings (and also some Welsh papers held by the British Library).
The launch event was held last night in Cardiff and much was made of the fact that government and EU funding had enabled the National Library of Wales to make the content free and openly accessible. Nearly all of the content on the Welsh site is available under the public domain mark.
This is in marked contrast to the business model underpinning the British Library’s digitised newspapers, which requires payment for individuals to see the scans.
The different approaches reflect different institutional contexts. The British Library (BL) is under much more pressure from the UK government to seek alternative (i.e. non-public) funding for digitisation. The Welsh Assembly, on the other hand, sees digitisation as an opportunity to showcase Welsh content and history to a much broader audience. Putting paywalls in place, they reckon, would limit that.
The British Library also has a much larger collection (75m pages compared to 3m or so – I think – that are held in Aberystwyth); the imperative to find external funding to digitise the entire collection, as provided by the deal with publishers BrightSolid, is stronger. The BL also has to provide modern facilities for housing the physical collection, prompted by the move from the one-time location for the newspapers in Colindale in north London.
It will be interesting to compare the impact of the two different approaches as time passes.
A presentation on the Europeana Libraries project, which was given a thumbs up by the reviewers.
Luke McKernan, Curator for Moving Images at the British Library, posted this amusing tweet at a conference last year.
Aided by the pithiness of Twitter, Luke likened the doomed attempt of the Danish king Canute to hold back the sea to Europeana’s attempts to provide a European, culturally aware alternative to Google.
In one sense, Luke’s comment hits the mark. Given Google’s accumulation of expertise, massive stores of data and sheer financial muscle, it will always be difficult for Europeana to match Google’s core strength – that of providing a search engine that points users to the stuff they want to find.
And while Europeana can point to a ‘long tail’ of aggregated content, some of Google’s spin off projects (such as Google Art Project or the Google Cultural Institute) achieve an instant user traction that Europeana will find difficult to match.
But focussing just on the portal misses where Europeana can really have an effect on Europe’s digital cultural heritage.
Rather, a fantastic strength of Europeana is its ability to advocate the necessary changes that allow digitised content from European cultural heritage institutions to be opened up for the widest possible range of uses and users.
Europeana is not just about the Europeana web portal but the continent’s astonishing network of libraries, museums, archives and other collections; all institutions that are dealing with the challenges faced by digitising and publishing collections in the age of the Internet.
Clear licensing terms; high-quality openly available metadata; a strategic approach to user access to materials (often, but not necessarily always, open access); use of open standards; use of addressable URIs for digital items – these are all essential ingredients for deriving the maximum richness from Europe’s online cultural heritage.
Europeana has a key role of helping create and put forward the winning arguments that persuade institutions to adopt these best practices.
There is still much to do – 64% of items do not come with clear licensing terms (ppt, slide 2); and about 3% (c. 660,000 digital objects) have broken links. But, for example, the work done by the Rijksmuseum in the Netherlands, the Statens Museum for Kunst in Denmark, the British Library’s medieval manuscript department, and indeed the exposure of Europeana’s own dataset of 22m metadata records, demonstrate that some institutions are making significant decisions about sharing their content online.
A tide is definitely turning.
Library Relations Coordinator
Are you an experienced account management professional, with enthusiasm for the libraries sector? Would you like to work in an English-speaking international team? Then come and join The European Library team at the Europeana Foundation, based in the National Library of the Netherlands in The Hague.
To manage all customer relationship-related activities for The European Library and associated projects
Terms and conditions of employment
The salary for this post is in line with the Collective Labour agreement for Research Centres, scale 9 or 10, (€ 2.500 – € 3.800,- gross per month, holiday allowance and annual bonus not included) depending on your experience and skills. In addition, you will have 42 days holiday per year, a holiday allowance (8%), an annual bonus (around 8%) and good fringe benefits.
This position is for 12 months in the first instance. Any renewal is subject to suitable performance and availability of funding. The contract is for a maximum of 40 hours a week.
How to apply
For questions about the job please email Aubery Escande email@example.com. Please send your CV and application letter to firstname.lastname@example.org, clearly marked ‘Library Relations Officer, The European Library’.
Final acceptance date for application: 21st January 2013 at 12:00 CET
First telephone interviews will be held during the week beginning: 28th January 2013
Presentation on The European Library given at RLUK 2012, Newcastle
(Internship details posted on behalf of my employers)
Collections Intern for The European Library
Are you a soon-to-be qualified librarian? Would you like to work in an English-speaking international team? Then come and join The European Library team, based in the National Library of the Netherlands in The Hague.
We are looking for an intern to join our Collections Team. Your internship will be to liaise with national and research libraries across Europe, organising, scheduling and prioritising the ingestion of their collections into The European Library.
We are looking for someone who has qualifications in librarianship or a related subject. You must be able to speak and write English to a very high standard.
The European Library
Launched in March 2005, The European Library is a free service that offers a single point of access to the bibliographical and digital collections of the National Libraries of Europe. It is a service of the Conference of European National Librarians (CENL) and is hosted in the Research and Development Department of the National Library of The Netherlands in The Hague. 48 national libraries in Europe have included their collections in The European Library. From 2012, we are incorporating data from Europe’s major research and university libraries. With the development of Europeana, the cross-domain portal for Europe’s cultural and scientific heritage, The European Library has become the library-domain aggregator for Europeana.
Terms and conditions of employment
The internship is available now and will be offered for an initial period of three months, which may be renewed. A monthly internship allowance according to KB regulations is payable, plus travel expenses to and from work.
How to apply
For questions about the internship please email Louise Edwards, General Manager (email@example.com). Please send your CV and application letter to the KB applications mailbox (firstname.lastname@example.org) for the attention of Louise Edwards with ‘Internship TEL’ in the subject line.
Final acceptance date for applications: 18th June 2012 at 12:00 CET
One of the great problems of digitisation programmes is the time and cost of copyright clearance. There can be problems not just in getting the agreement of copyright holders, but actually finding the holders in the first place.
Many projects have to undergo due diligence searches to reach copyright holders – a time-consuming mix of Google searches, general advertising and emails and letters sent off, often as ‘shots in the dark’.
The Arrow project (and its sequel ArrowPlus) is the first big building block in an EU-wide infrastructure, offering a tool that should provide managers of large-scale digitisation programmes with copyright information on books they want to digitise.
One of the project’s test cases involved undertaking a due diligence search on 1,700 books (from the 19th and 20th centuries) on genetics that the Wellcome Trust wished to digitise, returning information on whether a book was in or out of commerce, if it was an orphan work or who the copyright holder was.
To function, Arrow requires the bringing together of three sets of databases in the 16 countries involved – their national library catalogues, their books in print database, and their databases of rightsholder information. This is no easy task, with massive interoperability and data quality challenges. For some countries, the databases do not exist or commercial interests act against such centralised collection of data.
It’s an exceptionally ambitious piece of work, indicative of the EU desire to create large chunks of infrastructure to solve pan-European problems. If it works, its impact will be tremendous, paving the way to the digitisation of many 20th-century books.
But the challenges are great. Sustaining the Arrow system will have a basic administrative cost of Euro 100k a year, and it’s not clear (to me at least) who will support that annual cost.
It should also be noted that Arrow only supports book rights clearance (although visual images are being explored) and large-scale queries. There is no web page from which to make queries about individual books (for reasons I have not grasped). For most digitisation projects (small scale, and digitising non-book material) the system does not really help.
The mass digitisation projects that could support the costs of Arrow are not frequent, but they might still happen. The French are talking about digitising 2m out-of-commerce works. Arrow could still play a role for the Google digitisation programme. But whether such irregular programmes can sustain the system, only time will tell.