One of the most common requests from new research projects at TU Delft is for a tool that can manage all kinds of research data during a project, and also handle the other material a project creates – for example, steering group minutes, presentations and interview permissions.


Often, projects end up using a mixture of tools (Basecamp, Google Drive, GitHub, SharePoint), each with its own advantages and disadvantages.

In this light, I’ve had an introductory look at the science-focussed Open Science Framework (OSF), which provides tools to support the entire research workflow. Some of its advantages are listed below.

  • Very quick start-up time – it’s possible to get a project up and running in a couple of minutes
  • Possible to upload and categorise all kinds of data and files, for example under ‘methods’, ‘hypotheses’ and ‘communication’
  • Ability to store versions of data – revisions to each file can be stored
  • Different files can have different levels of permission. OSF introduces the concept of a component to help organise files and data in different ways. Each component can have its own level of access (e.g. admin, read/write, read only). This is very useful for projects that involve multiple institutions or hold data requiring protection.
  • Ability to create public versions of parts of projects, with citations. For fully-fledged projects that wish to share data and ensure appropriate attribution this could be a strong pull.

Other questions that the usage of OSF raises:

  • How efficiently does OSF deal with big data sets? Individual files can be no more than 5GB. For larger files, linking to add-ons such as Dropbox is possible, but it would be interesting to see if OSF retains its speed when accessing multiple large data sets.
  • How does it work with third-party tools? Integration with common cloud apps such as Google Drive is already included. But for some research projects, it is the ability to connect to specialist code, tools and instruments that could make OSF much more useful, and such integration is challenging. For example, how could a sensor recording meteorological data on a daily basis automatically transfer data to OSF? Or how could OSF expose data from traffic logs to allow the visual analysis of the movement of cars, buses and lorries in a city? OSF have made their API public to respond to such goals, but using it requires developer time.
  • If data is being made public and being given a DOI for use in citations, the OSF will need to work hard to ensure long-term sustainability and the trustworthiness of the data. It will still be useful for research projects to deposit their final published data in a repository that accords with the Data Seal of Approval, for long-term curation of data.
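
To give a feel for the developer time involved in the sensor example above, here is a minimal Python sketch of a script that pushes one day's readings to a project. It rests on several assumptions that should be checked against the current OSF API documentation before use: that OSF storage accepts uploads as a PUT request to a files.osf.io address of the shape shown, that a personal access token is sent as a bearer token, and that the 5GB limit is counted in binary gigabytes. The project ID, file name and token are placeholders, and the function names are my own.

```python
import urllib.parse
import urllib.request

# Assumption: address of the file service behind OSF storage.
OSF_FILES_BASE = "https://files.osf.io/v1/resources"
# Assumption: the 5GB per-file limit, counted as 5 * 1024^3 bytes.
MAX_FILE_BYTES = 5 * 1024**3


def build_upload_url(node_id: str, filename: str) -> str:
    """Return the (assumed) upload URL for a new file in a project's OSF storage."""
    query = urllib.parse.urlencode({"kind": "file", "name": filename})
    return f"{OSF_FILES_BASE}/{node_id}/providers/osfstorage/?{query}"


def build_upload_request(node_id, filename, payload, token, max_bytes=MAX_FILE_BYTES):
    """Prepare, but do not send, a PUT request carrying the file contents."""
    if len(payload) > max_bytes:
        raise ValueError(f"{filename} is larger than the {max_bytes}-byte limit")
    return urllib.request.Request(
        build_upload_url(node_id, filename),
        data=payload,
        method="PUT",
        headers={"Authorization": f"Bearer {token}"},
    )


def upload_daily_reading(node_id, filename, payload, token):
    """Send the request and return the HTTP status code (needs network access)."""
    request = build_upload_request(node_id, filename, payload, token)
    with urllib.request.urlopen(request) as response:
        return response.status
```

A scheduled job on the machine attached to the sensor could then call something like upload_daily_reading("abc12", "readings.csv", data, token) once a day. Even in this simple form, someone has to look after the token, handle failed uploads and decide how files are named, which is where the developer time goes.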