Boutiques, Shopping Malls and Specialist Shops
Posted: November 3, 2011 Filed under: Uncategorized
(or put your content where the users are, not where you are)
This presentation looks at why content owners such as universities, museums and archives need to deposit their digitised material not just on their own bespoke websites, but also on popular websites such as Google, Flickr, Wikipedia and others.
This presentation was originally given at the JISC Content Advisory Group, October 2011
Strategic or Open Digitisation?
Posted: November 2, 2011 Filed under: Uncategorized | Tags: collection, curation, digitisation, policy, strategy
The recent projects that JISC has funded as part of its Content Programme contain a fascinating range of materials – archives relating to the 18th-century Board of Longitude, the UK’s collection of fossils and reports documenting the health of modern London.
But the fascination of such an eclectic range of sources could also be construed as a weakness – the programme shows little deliberate join-up between the material being digitised.
This is very much a result of JISC’s approach; an open call, with each project being judged on its educational and technical merit, as part of a balanced portfolio of subjects and approaches.
An alternative strategy would be for JISC to, in consultation with the community, select a small number of strategic themes and request proposals only related to those themes, e.g. climate change, immigration to Britain or the history of European integration.
If four or five projects were funded in each of these themes, the opportunity to develop a critical mass of material would be much greater. Many successful digitised resources (e.g. Early English Books Online – now available via the JISC Historic Books platform, or the Old Bailey Online) have succeeded by drawing material from diverse physical archives while keeping a focus on a particular community of practice.
But such an approach creates a number of challenges.
Above all, there exists the thorny question of what to focus on. A few years ago, JISC commissioned the Discmap survey in an attempt to marry researcher needs with outstanding non-digitised special collections in the UK. The report makes interesting reading (pdf), but only serves to show the breadth of both undigitised collections and researcher needs.
Alighting on particular fields, therefore, creates some specific risks. For instance, by working with particular topics, one alienates whole reams of curators, researchers and teachers whose fields have been excluded. For JISC, which has a remit to work with the whole HE community, this is an important factor.
Innovation is also important to JISC – indeed, it’s part of its very raison d’etre – and JISC wants to fund projects that integrate innovative practices into their digitisation. Experience has shown that innovation germinates in unexpected places. Sometimes bigger, well-established institutions – the type of place that would be more likely to play a role in ‘strategic’ digitisation – cannot innovate in the way that younger, more nimble organisations can.
Finally, developing strategic digitisation also entails partnerships. Working with others is great and helps create better digital resources, but partnerships need time to grow and flourish. When forced, they are more likely to cause friction, to the detriment of any joint output. In a landscape where there are plenty of large-scale organisations who need to achieve their own strategic goals, forging such broader partnerships can be difficult.
Despite all that, the notion of a critical mass being developed via a strategic approach remains appealing, especially if associated with a larger notion of a UK Digital Collection.
And JISC’s recent call in relation to World War One, and its completed programme of work in Islamic Studies, start to address this – seeking proposals that will pull together digitised content on a particular theme.
As funding tightens this is a discussion that will continue – do we want the creation of digital content to be focused on a select area and done in great depth or do we want a broad approach that creates a wider constituency of curators and users, but perhaps without the same intensity?
Isn’t Google digitising everything anyway?
Posted: October 10, 2011 Filed under: article, digitisation, digitization, evidence, google | Tags: digitisation, google, research
(This post was first published on the JISC Corporate blog, October 2010)
Since Google embarked on its scanning of major world book libraries, there has been the assumption that there is little more to do in the field of digitisation.
Yet this is far from the truth. Opinions vary, but it is probably fair to say that more than 95% of the world’s books, magazines, newspapers, videos, films and documents still lie hidden in archives and libraries, inaccessible in digital form.
And there are numerous benefits to continuing with the work of digitising all this content – it’s about more than making it convenient for the learner to access something from the comfort of their own home or office.
So, for example, research is radically changed by the availability of millions of new documents, as shown by resources like the Proceedings of the Old Bailey, which is changing the face of the study of the history of London.
Equally, costs of publishing and travel can be significantly reduced by open access journals, such as the 2m pages of text provided by the Wellcome Trust’s Medical Journal Backfiles digitisation.
The University of Oxford’s Great War Archive not only gathered and digitised the general public’s material evidence from World War One but enabled new communities and expertise to be developed outside the campus walls.
And projects such as the Freeze Frame collection of polar photographs, or the Old Weather resource for transcribing weather reports in Naval logbooks, not only provide new data for educators and learners around the world, but also allow for a greater appreciation of the nation’s ‘prize jewels’ within its cultural and educational collections.
Much of the argument is laid out in a new JISC report written by Simon Tanner of King’s College London. Inspiring Research, Inspiring Scholarship is available as a pdf document from the JISC website.
Creating a Hive of Activity: Why we need to adopt APIs for Digitised Content
Posted: October 9, 2011 Filed under: APIs | Tags: APIs, linked data, open data
Presentation from the 3rd EBLIDA-LIBER Workshop on Digitisation, October 2011
Crowdsourcing as Public Engagement
Posted: September 19, 2011 Filed under: crowdsourcing, Presentation | Tags: crowdsourcing ithaka "public engagement"
Presentation given at Ithaka Sustainable Scholarship conference, October 2011
Innovative use of crowdsourcing technology presents novel prospects for research to interact with much larger audiences, and much more effectively than ever before
Posted: August 25, 2011 Filed under: crowdsourcing | Tags: crowdsourcing, galaxy zoo, impact
(Originally published in LSE Impact Blog, 25 August 2011)
In the push to make clear and unquestionable links between research and its effects on society, academics with seemingly esoteric projects might struggle to make their work accessible and interesting to the public. But projects centring on Scots language dictionaries, tattered Greek papyri and Bentham’s philosophy of utilitarianism have all made the jump through innovative use of crowdsourcing. A growing number of projects, such as Ancient Lives, Transcribe Bentham, Old Weather and Scots Words and Places, are making sophisticated use of the web to actively engage the general public as contributors to their research.
Old Weather, for example, invites the general public to transcribe naval logs, thus providing crucial meteorological data for climate scientists, as well as opening up sources for the history of the British navy. Transcribe Bentham works with a range of groups, in particular schools, to decipher the numerous papers of Jeremy Bentham. For such projects, securing user contributions is about much more than impact. They provide a venue for communities outside academia to play a meaningful role within university research, providing insight and knowledge, saving time, and facilitating the route towards high-quality outputs.
It is worth remembering that crowdsourcing predates the digital era; the Oxford English Dictionary was initially built on contributions from volunteers and there is a long tradition of active contributions from the public within many fields of the social sciences.
But the development of crowdsourcing on the internet has rapidly accelerated the sophistication of its methodologies. Recent projects have been particularly adept at using social media, developing refined mechanisms for ensuring that contributions are quality assured, working with large data sets, and creating interfaces that interact in a way that reduces complexity and confusion.
These developments mean that there are suddenly novel prospects for future projects to interact with much larger audiences than previously, and to do so in a much more effective manner.
Of course, there are plenty of research projects that do not lend themselves to this kind of public engagement whatsoever. That’s fine. But for other projects, even those that could seem recondite in nature, there are opportunities to explore.
So as crowdsourcing advances, a vital factor will be the sensitivity with which the needs and motivations of those taking part are understood. If the research community engages the public in a utilitarian sense, as just cogs in a larger research wheel, then the whole methodology will become imperilled. Understanding what moves an inhabitant of a specific community, a child in the schoolroom, or the ‘silver surfer’ with a new internet connection, and making sure their input is suitably recognised is crucial.
Engagement, as Chris Batt pointed out in his report on the topic, must be a two-way conversation: “knowledge co-creation and exchange rather than simply knowledge transfer: a dialogue which enriches knowledge for mutual benefit.”
The task of the University of Oxford’s RunCoCo team was to develop guidelines for projects wishing to develop digitised collections by asking the public to upload their own content or add information to existing resources, as happened with the highly successful Great War Archive. Equally, the Citizen Science Alliance is working according to firm principles on how to interact with their users, as articulated in Arfon Smith’s podcast on the success of the Galaxy Zoo project. Indeed, the Alliance is now looking for other researchers with whom to work and is requesting proposals for ideas.
If crowdsourcing is to continue to be embedded in research, then it is the principles and thinking drawn from RunCoCo or the Citizen Science Alliance that need to be adopted, adapted and implemented. There is a wealth of UK research that can be enhanced by the involvement of an engaged, knowledgeable and passionate UK public.
Opening Up Academic Research Projects – it’s not just about researchers.
Posted: August 20, 2011 Filed under: Uncategorized
Given the dramatic events concerning the recent riots in England, I was interested to find news of an AHRC-funded project Around 1968: Activists, Networks and Trajectories, based at the Department of Modern History at the University of Oxford.
According to the news item on the AHRC website, “hundreds of interviews with former activists from the 1968 revolutions which shook Europe have been analysed and put online”. Useful context, I thought, for the very different types of riots that were happening in England.
However, on arriving at the website there were a number of issues that impeded me from getting access to the interviews.
First of all, I had to prove myself as a bona fide researcher. I’m not actually a researcher, but given that I work in a university context, this was not too difficult. Nonetheless it was irritating not to get immediate access.
But once into the database, it remained tricky to use the online collections.
The primitive search interface presumed one already knew exactly what type of content was in the database. There was no context to explain the precise nature of the content, nor how to browse through it in a meaningful way.
Then when one did find an actual result, one was presented with a catalogue record. The interview was itself hidden away at the very bottom of the page, with little indication of what the speaker may be discussing.
The audio files were massive, some over 100MB, which tested the normally rapid Joint Academic Network (JANET) connection when the downloading process began. And rather than the files being cut up into smaller, more digestible snippets, one had to listen to the entire recording to glean any sense of it (although transcriptions for some of the material did help).
For anybody unfamiliar with but interested in the archive, this all added up to a disappointing website. The site was (and remains) a great opportunity for the research team to engage with a multitude of interested parties. But the way the content was presented restricted its usage to a very narrow set of scholars.
To an extent, this is the result of the slow moving wheels of scholarly tradition.
Academics have a long and fruitful history of undertaking oral research with individuals and communities. For much of this time, there has been no feasible way to share the data collected with parties outside interested research groups.
And most importantly, those undertaking oral history are familiar with the need for anonymising or protecting the voices of those who have ‘dedicated’ themselves and their identities to a project, particularly in contested political or social areas. In legal terms, it’s called data protection, but ‘identity protection’ seems a more apt phrase.
This tradition means that researchers have not really considered disseminating their research collections. But the changing digital environment and its potential to attract and engage audiences outside the academy put a radically different spin on this. It offers researchers an ideal channel to disseminate aspects of their research to a much wider audience; an audience, consisting of taxpayers that are funding the researchers, that is showing much greater interest in why academic projects are funded. Neglecting this audience is no longer feasible.
So we need to have a different approach to compiling and disseminating oral history. Researchers must be more proactive in explaining why oral histories should be made openly available. Often, interviewees have a fear that appearing on the Internet will make their position more vulnerable. But on a world wide web that contains billions and billions of pieces of content, where individuals expose their identities in myriad and often self-defeating ways, one oral history is often little more than the merest drop in the ocean.
And if there really are identity protection issues which need to be adhered to in order to make the research project work, teams need to devise strategies to disseminate their research data, for example by exposing metadata, anonymising recordings, creating summaries, or producing edited versions of the original interviews. Although challenges exist, it is possible to make research outputs available to other researchers and a broader public, while respecting the concerns of interviewees and safeguarding privacy.
There can be no real excuse for getting half a million pounds of government funding and then allowing the fruits of that research, of a topic with relevance to contemporary concerns, to be available only to a narrow band of scholars.
Is crowdsourcing dumbing down research?
Posted: July 29, 2011 Filed under: crowdsourcing | Tags: crowdsourcing research
(Originally published in The Guardian, July 29 2011)
Whether your favourite tipple is a lanny or a craitur might depend on whether you’re a wine or a whisky drinker, and even where in Scotland you live.
Last month an innovative new project funded by JISC asked people to contribute to a unique dictionary of Scottish words and place-names. The twist? Contributors are using tools of the web: posting messages on a Facebook page, tweeting the project team and contributing to an online discussion.
It’s the latest in a series of community projects that are asking the general public to contribute their knowledge and expertise to research through interactive web technology, not simply because they can or because it’s trendy, but because crowdsourcing is now, by default, digital. The idea behind this particular project is to focus firmly on how people are speaking now rather than the more traditional approach of largely gathering evidence from written material – so it makes perfect sense to go out to where people are, already tweeting, posting and updating their Facebook pages.
Two major factors have contributed to the growth of such projects. Web 2.0 technologies have developed to offer far more interactivity in the past few years – whether it’s adding comments to a page, video to YouTube or simply uploading photos to a central archive, content publishers now have more flexibility than ever before for interacting with a wide range of users. The British Library sound map project asks people to contribute audio recordings that are published on their webpages; JISC’s Strandlines project is assembling documents that articulate the history of one of London’s most famous streets, The Strand. As digital cameras, video devices and supporting software become more widespread, it’s possible to collate a range of media from the crowd when this might be very expensive to do independently.
But it’s not just about multimedia – the Old Weather project asks the public to transcribe Royal Navy log books from the early 20th century, which include valuable meteorological data recorded by ships’ crew members. Such an approach has a triple benefit – naval enthusiasts have whole new stories about British seafaring; military and other historians have fresh evidence and scientists have access to vital meteorological information to help them understand long-term patterns in climate change.
Researchers are seeing the advantages in developing meaningful relationships with businesses, public sector partners and community groups just as the universities they work for are actively developing their external engagement missions. These outside groups are sources of expertise, funding and advice but can also take research to wider audiences. Getting people involved means these users evolve to become both consumers and creators of digital data.
But when does ‘crowdsourcing’ work well? First, if you’re looking for expertise from a range of sources then the potential for ideas is massive. BMW received 4000 ideas within seven days of setting up its Virtual Innovation Agency which invites ideas for products and designs. The term crowdsourcing doesn’t seem to accurately cover the depth of this kind of activity.
Second, asking for contributions online can be an excellent option when funds are limited. JISC supported the Great War Archive project, which asked people to contribute photos and memories of their own wartime collections to a central website either directly online or through roadshows where they were brought along and digitised on the spot. The project team calculated that this was incredibly cost effective – each item submitted through the archive cost around £3.50 to ‘capture’, catalogue, and distribute, compared with around £40 per item when digitisation was managed in-house. The sheer scale of such collecting would also take much longer if you have a small team, whereas crowdsourcing can speed up a potentially time consuming process.
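To make that comparison concrete, here is a minimal sketch of the arithmetic in Python. The two per-item costs are the approximate figures quoted above; the collection size of 1,000 items and the function name `compare_costs` are purely hypothetical, chosen for illustration and not taken from the project.

```python
# Approximate per-item costs quoted in the post, in pounds sterling.
CROWDSOURCED_COST_PER_ITEM = 3.50
IN_HOUSE_COST_PER_ITEM = 40.00


def compare_costs(item_count: int) -> dict:
    """Compare total costs of crowdsourced and in-house digitisation."""
    crowdsourced_total = item_count * CROWDSOURCED_COST_PER_ITEM
    in_house_total = item_count * IN_HOUSE_COST_PER_ITEM
    return {
        "crowdsourced_total": crowdsourced_total,
        "in_house_total": in_house_total,
        "saving": in_house_total - crowdsourced_total,
        # How many times more expensive the in-house route is per item.
        "cost_ratio": IN_HOUSE_COST_PER_ITEM / CROWDSOURCED_COST_PER_ITEM,
    }


# 1,000 items is a hypothetical collection size, not a project figure.
result = compare_costs(1000)
print(f"Crowdsourced total: £{result['crowdsourced_total']:,.2f}")
print(f"In-house total:     £{result['in_house_total']:,.2f}")
print(f"Saving:             £{result['saving']:,.2f}")
print(f"In-house costs roughly {result['cost_ratio']:.1f} times as much per item")
```

On these rough figures the in-house route costs a little over eleven times as much per item, whatever the size of the collection. The sketch ignores fixed costs such as building the website or running roadshows, which the quoted per-item figures may or may not already include.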
Idle computing power has long been donated by those wishing to contribute to projects like the search for extra-terrestrial intelligence. But when some of the responsibility for content is pushed out into the public arena, is there a risk that we are trawling research data from the hands of those who know little about it? How do we balance the quantity of content we need with rigorous quality control?
The University of Oxford’s Galaxy Zoo, which asks the general public to describe and classify astronomical images, has addressed this well. In addition to developing intelligent mechanisms for recording and analysing public contributions, the Oxford team and their partners ensure that they give due credit to their contributing ‘citizen scientists’ right from the outset – to the extent that they are cited as contributing authors in published articles. Galaxy Zoo demonstrates that we have to be prepared to share that balance of power with those who fund, contribute to and benefit from our research. Only by showing our processes and opening up our data, early findings and papers are we going to find support for the research of tomorrow. Just as big brands can build consumer trust by getting them involved in initiatives like MyStarbucks, so we can enhance the non-academic world’s trust in research by inviting them through the keyhole right from the start of our projects.
Crowdsourcing and Variant Digital Editions – some troubles ahead
Posted: July 18, 2011 Filed under: crowdsourcing | Tags: crowdsourcing
(This blog was first published on the JISC Digitisation blog, July 2011)
Projects like UCL’s Transcribe Bentham and New York Public Library’s What’s on the Menu? have done groundbreaking work in engaging the public to transcribe their manuscript collections.
Crowdsourcing allows rapid, and it seems high-quality, creation of transcribed data from original documents. Transcribe Bentham has so far created 1,330 transcribed versions, and only a handful have been rejected for a lack of quality. Previously, such scholarly transcription would have taken considerable time and effort, spanning many years.
With notable successes like these, crowdsourcing is now becoming more familiar as an academic tool. But for certain datasets, particularly ones of considerable academic importance, this could bring some problems, because crowdsourcing has the ability to create multiple editions.
For example, the much-lauded Early English Books Online (EEBO) and Eighteenth Century Collections Online (ECCO) are now beginning to appear on many different digital platforms.
ProQuest currently hold a licence that allows users to search over the entire EEBO corpus, while Gale-Cengage own the rights to ECCO.
Meanwhile, JISC Collections are planning to release a platform entitled JISC Historic Books, which makes licensed versions of EEBO and ECCO available to UK Higher Education users.
And finally, the Universities of Michigan and Oxford are heading the Text Creation Partnership (TCP), which is methodically working its way through releasing full-text versions of EEBO, ECCO and other resources. These versions are available online, and are also being harvested out to sites like 18th Century Connect.
So this gives us four entry points into ECCO – and it’s not inconceivable that there could be more in the future.
What’s more, there have been some initial discussions about introducing crowdsourcing techniques to some of these licensed versions, allowing permitted users to transcribe and interpret the original historical documents. But of course this crowdsourcing would happen on different platforms with different communities, who may interpret and transcribe the documents in different ways. This could lead to the tricky problem of different digital versions of the corpus. Rather than there being one EEBO, several EEBOs would exist.
But this is part of a larger problem. If there are multiple versions of the original content, then which one is the one you use? In fact it’s not only about the content. Which platform works quickest? Which gives the most ‘accurate’ search results? Which one provides enhanced tools for analysis? Which gives the best results for your particular area of research? Where do you send your students? Which one do you cite?
Most importantly, which one do you trust? And why?
In ‘traditional scholarship’, different editions of original documents would be published at, for example, 50-year intervals, and it would be part of the scholarly workflow to review and criticise such editions. The complexity and proliferation of digital resources radically changes this – not only are there more digital resources but the knowledge and skills needed to critically analyse a resource are considerably widened out.
At the moment, there are no immediate solutions for these challenges. But it’s clear that the potential of the Internet continues to fracture existing practices of scholarship – despite the care, attention, and research intelligence that has gone into creating EEBO, ECCO and their various platforms, the potential for academics, funders and publishers to push forward and develop new digital ideas means that the notion of the Internet as a place where traditional scholarly practices can simply be repeated continues to disintegrate.
Digital resources made possible by JISC
Posted: April 8, 2011 Filed under: Uncategorized
(This blog was first published on the JISC Corporate blog, 8 April 2011)
The UK is a knowledge economy, and the coalition government is looking to make it a digital one too. So how is JISC helping to share the UK’s knowledge and our resources online?
In my role at JISC I look after our content programme which brings scholarly collections into the digital age – taking journals, newspapers, manuscripts, photographs and other material and putting them on the web. I have the pleasure of working with many outstanding collections in the UK and have helped unearth some real treasures that can be shared and used for education and research.
The British Cartoon Archive is one such example. Hosted by the University of Kent, it represents a visual history of Britain, whether through the social comedies of Carl Giles or the political satire of Steve Bell. It provides the student with an alternative viewpoint on the century – not official documents, but a more slanted approach that provides a more accurate portrayal of public opinion. The video explains more.
The First World War Poetry archive, curated by the University of Oxford, is another astonishing collection. Incorporating the Great War Archive, where members of the general public were asked to submit images of objects relating to the war (letters, diaries, photos etc.), the resource is a seminal example of a crowdsourced website. The accompanying video tells some amazing stories that have been collected by the archive. In one story, we hear of a Scottish soldier, enlisted for war without the chance to say goodbye to his family. He placed his goodbye message inside a matchbox and threw it onto the platform in the hope it would get to his loved ones. This video recounts the full story.
Most of the time I am looking at ways to promote these resources and create awareness amongst academics, researchers and learners that they exist. The JISC content site lists all the resources JISC has either funded or licensed for educational use. But one also needs to remember digitisation from the perspective of the creator, and the many things to take into account when putting collections online.
Recent JISC-funded projects have found five pieces of advice to be crucial to successful digitisation.
Five top tips
1. Embedding digitisation within a university needs engagement: you need people on your side from across the whole of the organisation, from researchers, academics and IT staff to senior management
2. Partnership is vital for those developing digitised content. Not just with other universities but with innovative publishers and producers
3. Digitised resources will achieve maximum impact when part of universities’ teaching and research strategies
4. Users love speed and convenience – one quick search over a federated website works better than multiple searches over disparate websites
5. Engaging external communities in digital content needs to be a two-way process. It’s not just about universities broadcasting their expertise and exposing their digital content