A Vision for the Digital Humanities

The Digital Humanities does not exist. Or rather, it does not exist as a separate field, a bounded box distinct from the traditional disciplines.

Rather, it is a metaphor, a powerful connector that allies existing disciplines with nascent ones. It is a connector that suddenly injects disparate subjects across the humanities with common concerns. Concerns over infrastructure, public engagement, scholarly communication. Over method.

Interdisciplinarity has always existed in the humanities, but the digital turn strengthens bonds in new ways. Archaeologists and linguists share needs for powerful processing infrastructures; philosophers need to reconsider their publishing strategies; theologians and historians suddenly find new audiences for their research. A Sinologist has common cause with a historian of the book over XML markup.

Grand Challenges

While this instinctive interdisciplinarity has influenced the restructuring of humanities faculties to include DH components, there is still doubt as to the effectiveness of the digital humanities. What have these ‘technological insurgents’ done to help answer the grand intellectual challenges facing the humanities? Therefore, an essential component of any contemporary digital humanities vision is helping address such challenges.

Aonach Tailteann Athletics, Croke Park: Hurdles Race (taken by Independent Newspapers PLC), National Library of Ireland, Rights Reserved

Any lasting vision for the Digital Humanities needs to strike a common basis with the broader faculty. It needs to be a venue for forming teams, not just in an interdisciplinary sense but in the sense of shared methods and knowledge. A DH centre that wishes to thrive cannot content itself with tinkering with technological expertise and digital innovation.

Rather, it needs to employ this knowledge in the framework of larger intellectual questions. A DH centre must strike out and find common cause with those who are pursuing cutting-edge themes but without incorporating digital methods. The task is not necessarily easy, but it is essential if the digital humanities is to attain the respect it deserves.

Internationalisation

The Digital Humanities is a connector. Yet until very recently it has failed to tap into global perspectives, emphasising first-world cultural history instead. This has relegated narratives from the global south, with the further effect of denying the relationships that exist at a global level.

Terrestrial pocket globe, Royal Museums Greenwich, CC-BY-NC-SA

Tackling questions of global import (often related to the grand challenges above), and deploying open technical infrastructures and standards (for metadata, licensing and linked data), makes creating connections between the global objects of study in the humanities much more feasible.

There is a softer side to this as well. The failure of the digital humanities to become truly global has practical roots: units in the global south can lack the finances, the access to technology and the general culture of support needed for DH. DH centres can help by creating alliances and sharing the infrastructures that would allow a global DH to blossom.

Scholarly Communication and Public Engagement

Inside and outside the academy, the humanities is undergoing a crisis. Public opinion is sometimes characterised by mistrust or disdain; governments loathe the ambiguity, non-commercial nature and ideological aspects of the humanities. This has an obvious knock-on effect in many ways – threatening student numbers and reducing government investment. The digital humanities can play a critical role in tackling this crisis.

Erasmus of Rotterdam, Austrian National Library, Public Domain

This goes hand in hand with the changing landscape for scholarly communication. For the active DH centre, reconceptualising modes of scholarly communication is not an afterthought but an intrinsic part of examining how the humanities communicates, both within itself and with a wider public.

The adoption (and critical awareness) of new platforms for writing, visualisation, crowdsourcing, multimedia and online resources themselves – allied to the reformed use of traditional outputs such as articles and monographs – can radically alter the humanities' engagement with its audiences.

Any DH centre must explore and build on these changes, as well as tackling the grand international challenges of our time.


Truth in Art History – 3 Dutch examples

Art historians love interpreting paintings but they also love finding ‘true facts’ about paintings (excuse the postmodern snigger quotes). Three recent examples related to Dutch art history are below, two of which show a definite input from digital / scientific methodology.

The Next Rembrandt created an entirely fictitious Rembrandt portrait based on the mass of technical data relating to existing Rembrandt paintings.

Van Gogh’s Bedrooms was an exhibition at the Art Institute of Chicago. It included results of chemical analysis that allowed conservators to claim they found the true, original colours of the painting (or at least one of them; Van Gogh painted three). While there has been much hubbub about the slowness of art history to adopt digital methods, it’s worth noting that conservators / technical art historians have been working with scientific analysis of paintings for a considerable time.

Finally, the actual location of Vermeer’s The Little Street in Delft was revealed. However, this was based on painstaking analysis of archival material in its physical rather than digital form, checking extant maps and documents in archives to try and find the eponymous location.


#IDCC16: Atomising data: Rethinking data use in the age of the explicitome

(Originally posted at Digital Curation Centre)

Data re-use is an elixir for those involved in research data.

Make the data available, add rich metadata, and then users will download the spreadsheets, databases, and images. The archive will be visited, making librarians happy. Datasets will be cited, making researchers happy. Datasets may be even re-used by the private sector, making university deans even happier.

But it seems to me that data re-use, or at least the particular conceptualisation of re-use established in most data repositories, is not the definitive way of conceiving of data in the 21st century.

Two great examples from the International Data Curation Conference illustrated this.

Barend Mons declared that the real scientific value in scholarly communication lies not in abstracts, articles or supplementary information. Rather, the data that sits behind these outputs is the real oil to be exploited, featuring millions of assertions about all kinds of biological entities.

Mons describes the sum of these assertions as the explicitome, and it enables cross-fertilisation between distinct pieces of scientific work. With all experimental data made available in the explicitome, researchers taking an aerial view can suddenly see all kinds of new connections and patterns between entities cited in wholly different research projects.
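
To make that concrete, here is a minimal sketch of what a couple of atomised assertions could look like as linked data. The namespaces, identifiers and rdflib-based approach are my own illustration, not the actual infrastructure behind the explicitome.

```python
# Minimal sketch: each assertion mined from a paper's tables or figures
# becomes a stand-alone, identifiable RDF triple. Namespaces and
# identifiers are invented for illustration.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/entity/")
REL = Namespace("http://example.org/relation/")

g = Graph()

# One 'atom' of the explicitome, e.g. an assertion from a results table:
# "gene BRCA1 is associated with breast cancer".
g.add((EX["gene/BRCA1"], REL["associatedWith"], EX["disease/breast-cancer"]))

# An assertion from a wholly different project reuses the same identifier,
# which is what lets an 'aerial view' reveal new connections.
g.add((EX["protein/TP53"], REL["interactsWith"], EX["gene/BRCA1"]))

print(g.serialize(format="turtle"))
```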

The second example was Eric Kansa's talk on the Open Context framework for publishing archaeological data. Following the same principle as Barend Mons, Open Context breaks data down into individual items. Instead of downloading a whole spreadsheet relating to a single excavation, you can access individual bits of data. From an excavation, you can see the data related to a particular trench, and then the items discovered in that trench.

(A screenshot from Open Context)

In both cases, data re-use is promoted, but in an entirely different way to datasets being uploaded to an archive and then downloaded by a re-user.

In the model proposed by Mons and Kansa, data is atomised, and then published. Each individual item, or each individual assertion, gets its own identity. And that piece of data can then easily be linked to other relevant pieces of data.

This hugely increases the chance of data re-use; not of whole datasets, of course, but of tiny fractions of datasets. An archaeologist examining the remains of jars on French archaeological sites might not even think to look at a dataset from a Turkish excavation. But if the latter dataset is atomised in a way that allows it to identify the presence of jars as well, then suddenly that element of the Turkish dataset becomes useful.
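
As a toy illustration of why atomisation helps (the record structure and identifiers below are invented for the example, not Open Context's actual schema):

```python
# Toy sketch: two excavation datasets, atomised so that every find is its
# own addressable record rather than a row locked inside a downloadable
# spreadsheet. Field names and identifiers are invented for illustration.
french_site = [
    {"id": "fr-trench2-0017", "object_type": "jar", "material": "ceramic"},
    {"id": "fr-trench2-0018", "object_type": "coin", "material": "bronze"},
]
turkish_site = [
    {"id": "tr-trench5-0341", "object_type": "jar", "material": "ceramic"},
    {"id": "tr-trench5-0342", "object_type": "figurine", "material": "clay"},
]

# Because each item is individually identified and described, a jar
# specialist can pull matching records from datasets they would never
# have thought to download wholesale.
jars = [item for site in (french_site, turkish_site) for item in site
        if item["object_type"] == "jar"]

for item in jars:
    print(item["id"], item["material"])
```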

This approach to data poses the big challenge for those charged with archiving it. Many data repositories, particularly institutional ones, store individual files but not individual pieces of data. How research data managers begin to cope with the explicitome – enabling it, nourishing it and sustaining it – may well be a topic of interest for IDCC17.


Strategies and Tactics in changing behaviour around research data

The International Data Curation Conference (IDCC) continues to be about change.

That is, how do we change the eco-system so that managing data is an essential component of the research lifecycle? How can we free the rich data trapped in PDFs or lost to linkrot? How can we get researchers to data mine and not data whine?

While, for some, the pace of change is not quick enough, IDCC still demonstrates an impressive breadth of strategy and tactics to enable this change.

On the first day of the conference, Barend Mons set out the vision. The value of research is not in journals but in the underlying data – thousands and thousands of assertions about genes, bacteria, viruses, proteins, indeed any biological entity, are locked in figures and tables. Release such data and the interconnections between related entities in different datasets reveal whole new patterns. How to make this happen? One part of the solution: all projects should allocate 5% of their budget to data stewardship.

Andrew Sallans of the Center for Open Science followed this up with the Open Science Framework, their platform for linking data to all kinds of cloud providers and (fingers crossed) institutions' data repositories. In large-scale projects, sharing and versioning data can easily get out of control; the framework helps to manage this process more easily. They have some pretty nifty financial incentives to change practice too – $1000 awards for pre-registration of research plans.

Following this we saw many posters – tactics to alter the behaviours of individuals and groups of researchers. There were some great ideas here, such as plans at the University of Toronto to develop packages of information for librarians on the data requirements of different disciplines.

Despite this, my principal concern was the huge gap between the massive sweep of the strategic visions and the tactics for implementing change. Many of the posters were valiant but locked in an institutional setting – libraries wrestling with how to influence faculty without the in-depth knowledge (or institutional clout) to make winning arguments within a particular area.

What still seems to be missing from IDCC is the disciplinary voice. How are particular subjects approaching research data? How can the existing community work more closely with them? There was one excellent presentation on building workflows for physicists studying gravitational waves, and there were other results from OCLC's work with social scientists and zoologists. But in most cases it was us librarians doing the talking rather than sharing a platform with the researchers. If we want that change to happen, there needs to be greater engagement with the subjects that are creating the research data in the first place.


Making Research Data open – starting with the low-hanging fruit?

One thing I expected with my new job was to have difficult arguments with scientists about why they should make their data open. While it received plenty of critical response on Twitter, I had a sneaking feeling that the ‘research parasite’ editorial published earlier this year actually reflected a larger, if unstated, line of thought.

However, initial conversations at Delft seem to imply something a little different. While there may be pockets of resistance, there are plenty of scientists who are intrigued by, or committed to, openness.

Therefore I suspect the more difficult part of publishing open data is actually sorting out the detail: how to deal with versions and problematic formats, how to get high-quality metadata, how to work out costs. How best to pick the low-hanging fruit, rather than clambering to the top of the tree to find it.

Part of the reason for this might be that TU Delft already has an impressive Open Science programme. The focus for ‘open’ is not just Open Access for articles, but covers the lifecycle of work in teaching and research. So there is an Open Education programme for MOOCs and OERs, and an Open ICT programme, sitting alongside the push for openness in Research Data. Whatever role a member of staff or student is playing, the exposure to openness will be present.


Notes from my first day in the TU Delft Research Data team

Services offered by or through Delft Library Research Data Services

  1. Data Archive – repository for storing data. Question of branding – archive or repository? Smaller issues are logged via Bugzilla and Trello; larger ideas for change (e.g. a dark archive, restricted access requirements, interface design) still require an evidence base from across the university before they can be implemented. Embargoes are to be offered in the self-upload form. The workflow involves a data moderation process, checking both the quality of the metadata and the technical quality of the content. All conversations go through the data officer. It is rare to get functional process requirements from researchers; most are driven by the library, e.g. the implementation of ORCID.
  2. DataCite – international tool delivered via RDS for giving DOIs to data. The Library is a DataCite member (a sketch of what minting a DOI can look like follows this list).
  3. Dataverse – generic tool for managing data during research projects. Hosted by DANS; a specific instance for 3TU (Delft, Eindhoven, Enschede) use.
  4. Open Earth Labs, created by Delft for the geosciences. Helps manage research data with a focus on geo data; more complex than Dataverse.
  5. Data Management Plan assistance. A greater number of requests, though, for ‘data paragraphs’ in pre-proposals rather than actual full plans.
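
For illustration, here is roughly what minting a DOI for a dataset can look like. This is a minimal sketch against DataCite's REST API test environment; the prefix, credentials and metadata values are placeholders, and this is not the actual Delft workflow.

```python
# Minimal sketch of registering a draft DOI via the DataCite REST API
# (test instance). Prefix, credentials and metadata are placeholders.
import requests

payload = {
    "data": {
        "type": "dois",
        "attributes": {
            "prefix": "10.5072",  # DataCite's reserved test prefix
            "titles": [{"title": "Example measurement dataset"}],
            "creators": [{"name": "Doe, Jane"}],
            "publisher": "TU Delft",
            "publicationYear": 2016,
            "types": {"resourceTypeGeneral": "Dataset"},
        },
    }
}

resp = requests.post(
    "https://api.test.datacite.org/dois",   # test environment endpoint
    json=payload,
    auth=("REPOSITORY_ID", "PASSWORD"),     # placeholder credentials
    headers={"Content-Type": "application/vnd.api+json"},
)
print(resp.status_code, resp.json()["data"]["id"])
```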

On leaving Europeana (Part 1)

As of 1st February, I will be leaving Europeana (and The European Library) to take up a new role within the Research Data team at the Technical University of Delft.

I leave Europeana with a heavy heart. It is a unique organisation, with creative people and an ambitious desire to make significant change to how European cultural heritage is shared in a digital world.

It’s not straightforward working there. Trying to create winning products based on strategic interests ranging from those of famous international galleries to tiny military museums, from renowned centuries-old libraries to new city libraries, and from thousands of archives, both jumbled and organised. Add in the multilingual element, the hugely different approaches to licensing and metadata across the continent, plus the friendly concerns of our funders at the Commission, and you are left as the juggler keeping several balls aloft.

Ringling Bros and Barnum & Bailey, Circus Museum, CC-BY-SA

In the face of that, Europeana’s achievements are impressive. It has done so much to standardise licensing within the cultural heritage domain with the Europeana Licensing Framework. Many other fields of knowledge (e.g. the academic sector) are crying out for an approach like this. Once you have licensing harmony, the re-use and remixing of big data turns from a distant possibility into an achievable reality.

The Europeana Data Model helps make data interoperable. Without standardised data, nothing hangs together at all: not a portal, not linked data, not an API. Enough said.
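
For a flavour of what that looks like in practice, here is a minimal sketch of a single object described with Europeana Data Model classes; the identifiers and literal values are invented for illustration, not taken from a real Europeana record.

```python
# Minimal sketch of one cultural heritage object in the Europeana Data
# Model (EDM). Identifiers and values are invented for illustration.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DC, RDF

EDM = Namespace("http://www.europeana.eu/schemas/edm/")
ORE = Namespace("http://www.openarchives.org/ore/terms/")

g = Graph()
cho = URIRef("http://example.org/item/12345")          # the provided object
agg = URIRef("http://example.org/aggregation/12345")   # its aggregation

# The object itself, with descriptive metadata.
g.add((cho, RDF.type, EDM.ProvidedCHO))
g.add((cho, DC.title, Literal("Terrestrial pocket globe")))
g.add((cho, DC.creator, Literal("Unknown maker")))

# The aggregation ties the object to its provider and rights statement.
g.add((agg, RDF.type, ORE.Aggregation))
g.add((agg, EDM.aggregatedCHO, cho))
g.add((agg, EDM.dataProvider, Literal("Royal Museums Greenwich")))
g.add((agg, EDM.rights,
       URIRef("http://creativecommons.org/licenses/by-nc-sa/4.0/")))

print(g.serialize(format="turtle"))
```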

The recently published Publishing Framework has a really nice carrot-and-stick approach to making the cultural heritage sector improve the quality of its content, and its ease of re-use. It’s not enough to stick a crappy low-resolution JPEG behind a rubbish HTML page with an inexplicable URL. Content needs to be instantly accessible, downloadable and permanent, to both machines and humans.

Finally, I really like the way the portal is developing. Its recent redesign is much easier on the eye. More importantly, the emphasis on thematic collections (starting with art history and music) is vital to give some focus. Developing a content strategy that helps create a critical mass of content and metadata is, in my humble opinion, one of the most important things Europeana can do in 2016.

On a personal note, here are three things I am really proud of from my time here:

  1. The European Library assembled one of the largest releases of open data in the cultural sector. The Linked Open Data release of over 90m bibliographic records was achieved only after a massive process of ingesting data, working through the licensing conditions of many different national libraries, and then working with the team to create and publish the linked data.
  2. The great team at The European Library also put together the largest open archive of historic newspapers in Europe. Centralising data from over 20 libraries, with over 11 million images and pages of full text, was a mammoth achievement. It is very gratifying to look at the user stats – the average user spends nearly 15 minutes on the site, an incredibly high figure.
  3. Finally, introducing Europeana Research. There is so much potential for the cultural sector and the digital humanities to work closely together. Some of the Europeana plans for the next year, including a grants programme for researchers to use open cultural data, look really exciting. To be part of the team getting this off the ground was a privilege.

None of this would have been possible without all the people both within Europeana and at other project partners. I can’t name everyone (and you are great even if you are not on this list!), but some of the great people I have worked with closely in the office over the course of the four years include: Nienke and her infectious drive and optimism; Markus, Alena and Nuno’s technical genius; Natasa and Adina’s tenacity with data; Valentine and David’s stupendous all-round knowledge; and Harry and Jill’s passion and commitment. There may be many balls in the air at Europeana, but there are also many safe hands to catch them.

