The International Data Curation Conference (IDCC) continues to be about change.
That is, how do we change the eco-system so that managing data is an essential component of the research lifecycle? How can we free the rich data trapped in PDFs or lost to linkrot? How can we get researchers to data mine and not data whine?
While, for some, the pace of change is not quick enough, IDCC still demonstrates an impressive breadth of strategy and tactics to enable this change.
On the first day of the conference, Barend Mons set out the vision. The value of research is not in journals but in the underlying data – thousands and thousands of assertions about genes, bacteria, viruses, proteins, indeed any biological entity are locked in figures and tables. Release such data and the interconnections between related entities in different datasets reveals whole new patterns. How to make this happen? One part of the solution: all projects should allocate 5% of their budget to data stewardship.
Andrew Sallans of the Center for Open Science followed this up with their eponymous platform for managing Open Science for linking data to all kinds of cloud providers and (fingers crossed) institutions’ data repositories. In large-scale projects, sharing and versioning data can easily get out of control; the framework helps to manage this process more easily. They have some pretty nifty financial incentives to change practice too – $1000 awards for pre-registration of research plans.
Following this we saw many posters – tactics to alter behaviours of individuals and groups of researchers. There were some great ideas here, such as plans at the University of Toronto to develop packages of information for librarians on data requirements of different disciplines.
Despite this, my principal concern was the huge gap between the massive sweep of the strategic visions and the tactics for implementing change. Many of the posters were valiant but were locked in an institutional setting – the libraries wrestling how to influence faculty without the in depth knowledge (or institutional clout) to make winning arguments within a particular area.
What still seems to be missing from IDCC is the disciplinary voice. How are particular subjects approaching research data? How can the existing community work more closely with them? There was one excellent presentation on building workflows for physicists studying gravitational waves; and other results from OCLC work with social scientists and zoologists. But in most cases it was us librarians doing the talking rather than it being a shared platform with the researchers. If we want that change to happen, there still needs to be greater engagement with the subjects that are creating the research data in the first place.