One of the great problems of digitisation programmes is the time and cost of copyright clearance. The difficulty lies not just in getting the agreement of copyright holders, but in finding the holders in the first place.
Many projects have to carry out due diligence searches to reach copyright holders: a time-consuming mix of Google searches, general advertising, and emails and letters sent off, often as ‘shots in the dark’.
The Arrow project and its sequel, ArrowPlus, are the first big building blocks in an EU-wide infrastructure: a tool that should give managers of large-scale digitisation programmes copyright information on the books they want to digitise.
One of the project’s test cases involved a due diligence search on 1,700 books on genetics (from the 19th and 20th centuries) that the Wellcome Trust wished to digitise. For each book, the search returned information on whether it was in or out of commerce, whether it was an orphan work, or who the copyright holder was.
To function, Arrow has to bring together three sets of databases in each of the 16 countries involved: their national library catalogues, their books-in-print databases, and their databases of rightsholder information. This is no easy task, with massive interoperability and data quality challenges. In some countries the databases do not exist, or commercial interests act against such centralised collection of data.
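To make the idea of combining the three sets of databases concrete, here is a minimal sketch in Python. It is not Arrow's actual workflow or API: the function name, the dictionary-based "databases" and the rule that a missing rightsholder record means a possible orphan work are all my own simplifying assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RightsReport:
    """Hypothetical summary of one book's status (not an Arrow data format)."""
    identifier: str
    in_commerce: bool
    rightsholder: Optional[str]
    possible_orphan: bool


def check_book(identifier, library_catalogue, books_in_print, rightsholders):
    """Combine three lookups into a single status report for one book.

    All three arguments are stand-ins for national databases:
    - library_catalogue: set of known book identifiers
    - books_in_print: set of identifiers currently on sale
    - rightsholders: dict mapping identifier -> rightsholder name
    """
    if identifier not in library_catalogue:
        raise KeyError(f"{identifier} not found in the library catalogue")

    # Assumption: a book is "in commerce" if it appears in books in print.
    in_commerce = identifier in books_in_print

    # Assumption: no rightsholder record means the book may be an orphan work.
    holder = rightsholders.get(identifier)
    return RightsReport(
        identifier=identifier,
        in_commerce=in_commerce,
        rightsholder=holder,
        possible_orphan=holder is None,
    )


def check_batch(identifiers, library_catalogue, books_in_print, rightsholders):
    """Run the same check over a list of books, as in a large-scale query."""
    return [
        check_book(i, library_catalogue, books_in_print, rightsholders)
        for i in identifiers
    ]
```

Even in this toy form, the interoperability problem is visible: the sketch only works if all three sources share the same identifier for the same book, which is exactly what cannot be taken for granted across real catalogues.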
It’s an exceptionally ambitious piece of work, indicative of the EU’s desire to create large chunks of infrastructure to solve pan-European problems. If it works, its impact will be tremendous, paving the way to the digitisation of many 20th-century books.
But the challenges are great. Sustaining the Arrow system carries a basic administrative cost of €100,000 a year, and it’s not clear (to me at least) who will cover that annual cost.
It should also be noted that Arrow only supports rights clearance for books (although visual images are being explored), and only through large-scale queries. There is no web page for making queries about individual books (for reasons I have not grasped). For most digitisation projects, which are small in scale and digitise non-book material, the system does not really help.
The mass digitisation projects that could support the costs of Arrow are infrequent, but they might still happen. The French are talking about digitising 2m out-of-commerce works. Arrow could still play a role in the Google digitisation programme. But whether such irregular programmes can sustain the system, only time will tell.