RO-Crate 1.0 specification released

Posted by & filed under Uncategorized.

The community recommendation RO-Crate 1.0 has been released.

RO-Crate (Research Object Crate) specifies a method of organizing file-based data with associated metadata, using Linked Data principles, in both human and machine readable formats, with the ability to include additional domain-specific metadata.

The core of RO-Crate is a JSON-LD file, the RO-Crate Metadata File, named ro-crate-metadata.jsonld. This file contains structured metadata about the dataset as a whole (the Root Data Entity) and about some or all of the files within the data package. This provides a simple way to, for example, assert the authors (e.g. people, organizations) of the RO-Crate or one of its files, or to capture more detailed provenance and metadata for files, such as how they were created, using contextual entities such as software, equipment, geographical places, funders and subjects.
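As a rough illustration of that layout, the Python sketch below assembles a small metadata document with a descriptor for the metadata file, a Root Data Entity, one file entity and a Person as a contextual entity. The function name `build_metadata`, the file `data.csv`, and all `example.org` URLs (including the context URL) are hypothetical placeholders, not values from the specification. The sketch also omits `conformsTo`, so it should be read as a shape sketch rather than a conforming crate.

```python
import json


def build_metadata(context_url, name, files, author_id, author_name):
    """Return a dict laid out like an RO-Crate Metadata File (illustrative only)."""
    # Descriptor entity: describes the metadata file itself and points at the root.
    descriptor = {
        "@id": "ro-crate-metadata.jsonld",
        "@type": "CreativeWork",
        "about": {"@id": "./"},
    }
    # Root Data Entity: describes the dataset as a whole.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "author": {"@id": author_id},
        "hasPart": [{"@id": path} for path in files],
    }
    # One data entity per file, referenced from the root through hasPart.
    file_entities = [{"@id": path, "@type": "File"} for path in files]
    # Contextual entity: the author, referenced by @id from the root.
    person = {"@id": author_id, "@type": "Person", "name": author_name}
    return {
        "@context": context_url,
        "@graph": [descriptor, root, *file_entities, person],
    }


metadata = build_metadata(
    context_url="https://example.org/context",  # placeholder, not the real context
    name="Example dataset",
    files=["data.csv"],  # hypothetical file name
    author_id="https://example.org/people/alice",  # hypothetical identifier
    author_name="Alice Example",
)
print(json.dumps(metadata, indent=2))
```

Entities refer to one another only by `@id`, which keeps the graph flat: the author is described once and can be linked from the dataset and from any individual file.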

The RO-Crate specification is based on schema.org, giving an opinionated account of how to use it to describe Research Objects, with practical usage guides and examples that help software authors create tools for generating and consuming research data packages.

Building community consensus

RO-Crate is a fresh initiative, bringing together data archive and repository maintainers with existing Research Object, workflow and provenance communities. Starting as a small cross-domain group, organically formed to build the core principles and first sketches of their use, we are now expanding to collect use cases and reaching out to other packaging initiatives to build common ground.

For the eScience Lab, one emerging use of RO-Crate is for capturing workflows and tools in a federated workflow repository being built in EOSC-Life, a large European Open Science Cloud project across 13 research infrastructures in the life science domain. However, RO-Crate also aims to be usable by individual Data Scientists and Digital Library efforts with no particular infrastructure beyond a Jupyter notebook, and by developers who may not have the time or motivation to use a cascade of metadata vocabularies and research data management tools.

RO-Crate development and discussion take place openly in a GitHub repository, carried out by volunteers, with monthly telcons to synchronize the effort. Anyone can join to help form the RO-Crate approach.

Cite RO-Crate

Eoghan Ó Carragáin; Carole Goble; Peter Sefton; Stian Soiland-Reyes (2019):
A lightweight approach to research object data packaging. Bioinformatics Open Source Conference (BOSC2019)

Peter Sefton, Eoghan Ó Carragáin, Stian Soiland-Reyes, Oscar Corcho, Daniel Garijo, Raul Palma, Frederik Coppens, Carole Goble, José María Fernández, Kyle Chard, Jose Manuel Gomez-Perez, Michael R Crusoe, Ignacio Eguinoa, Nick Juty, Kristi Holmes, Jason A. Clark, Salvador Capella-Gutierrez, Alasdair J. G. Gray, Stuart Owen, Alan R Williams, Giacomo Tartari, Finn Bacall, Thomas Thelen (2019):
RO-Crate Metadata Specification 1.0. Community Recommendation.

This news item has been adapted from an abstract at the *Workshop on Research Objects 2019* at *eScience 2019*.

Example RO-Crate Metadata File

{"@context": "",
 "@graph": [
  { "@id": "ro-crate-metadata.jsonld",
    "@type": "CreativeWork",
    "conformsTo": {"@id": ""},
    "about": {"@id": "./"}
  },
  { "@id": "./",
    "@type": "Dataset",
    "identifier": "",
    "name": "RO-Crate specification dataset",
    "version": "1.0.0",
    "license": { "@id": ""},
    "datePublished": "2019-11-15",
    "hasPart": [
        {"@id": "ro-crate-1.0.0.html"},
        {"@id": "ro-crate-context-1.0.0.jsonld"}
    ]
  },
  { "@id": "ro-crate-1.0.0.html",
    "@type": ["CreativeWork","File"],
    "encodingFormat": "text/html",
    "name": "RO-Crate Metadata Specification 1.0",
    "identifier":  {"@id": ""},
    "version": "1.0"
  },
  { "@id": "ro-crate-context-1.0.0.jsonld",
    "@type": "File",
    "encodingFormat": "application/ld+json",
    "name": "RO-Crate JSON-LD Context",
    "identifier": {"@id": ""},
    "license": {"@id": ""},
    "author": [
        { "@id": ""},
        { "@id": ""}
    ]
  },
  { "@id": "",
    "@type": "Person",
    "name": "Peter Sefton"
  },
  { "@id": "",
    "@type": "Person",
    "name": "Stian Soiland-Reyes"
  }
 ]
}

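A consuming tool might read such a file along the following lines. This is a minimal Python sketch, not a reference implementation. The helper names `index_graph`, `find_root` and `list_parts` are made up for illustration, and the embedded sample is a trimmed copy of the example above containing only entities whose identifiers are given. It assumes `hasPart` is always a list and that every referenced entity is present in the graph.

```python
import json

# Trimmed copy of the example metadata file, kept inline so the sketch is self-contained.
SAMPLE = """
{"@graph": [
  {"@id": "ro-crate-metadata.jsonld",
   "@type": "CreativeWork",
   "about": {"@id": "./"}},
  {"@id": "./",
   "@type": "Dataset",
   "name": "RO-Crate specification dataset",
   "hasPart": [{"@id": "ro-crate-1.0.0.html"},
               {"@id": "ro-crate-context-1.0.0.jsonld"}]},
  {"@id": "ro-crate-1.0.0.html",
   "@type": ["CreativeWork", "File"],
   "name": "RO-Crate Metadata Specification 1.0"},
  {"@id": "ro-crate-context-1.0.0.jsonld",
   "@type": "File",
   "name": "RO-Crate JSON-LD Context"}
]}
"""


def index_graph(crate):
    """Map each entity's @id to the entity, so references can be resolved."""
    return {entity["@id"]: entity for entity in crate["@graph"]}


def find_root(crate):
    """Follow the metadata file descriptor's 'about' link to the Root Data Entity."""
    by_id = index_graph(crate)
    descriptor = by_id["ro-crate-metadata.jsonld"]
    return by_id[descriptor["about"]["@id"]]


def list_parts(crate):
    """Return (identifier, name) pairs for the files listed under the root's hasPart."""
    by_id = index_graph(crate)
    root = find_root(crate)
    return [
        (ref["@id"], by_id[ref["@id"]].get("name"))
        for ref in root.get("hasPart", [])
    ]


crate = json.loads(SAMPLE)
print(find_root(crate)["name"])
for identifier, name in list_parts(crate):
    print(identifier, "->", name)
```

Building an `@id` index first reflects the flat graph layout: every cross-reference is a small `{"@id": ...}` object, so resolving one is a single dictionary lookup.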
2019-09-24: Workshop on Research Objects (RO2019)

Posted by & filed under Event.

Call for Papers

  • Title: Workshop on Research Objects (RO2019)
  • Abstracts/papers due: 24 June 2019
  • Workshop: 24 September 2019
  • Where: IEEE eScience 2019, San Diego, CA, USA
  • URL:


Deadlines have been extended:

  • 2019-07-05 RO2019 submissions due: articles
  • 2019-07-15 RO2019 submissions due: abstracts for oral presentation
  • 2019-07-25 RO2019 notification of acceptance
  • 2019-09-02 RO2019 poster/demo submissions due
  • 2019-09-24 RO2019 workshop at IEEE eScience 2019

Research Objects

Scholarly Communication has evolved significantly in recent years, with an increasing focus on Open Research, FAIR data sharing and community-developed open source methods. A question remains on how to publish, archive and explore digital research outputs.

A number of initiatives have begun to explore how to package and describe research outputs, data, methods, workflows, provenance and structured metadata, reusing existing Web standards and formats.

Such efforts aim to address the challenges of structuring multi-part research outcomes with their context, handling distributed and living content, and porting and safely exchanging what we can collectively call “Research Objects” between platforms and between researchers.

Call for Papers

In the workshop RO2019 we will explore recent advancements in Research Objects and publishing of research data with peer-reviewed presentations, invited talks, short demos, lightning talks and break-out sessions to further build relationships across scientific domains and RO practitioners.

RO2019 welcomes submissions of academic abstracts (~ 1-2 pages) and short research articles (~ 4-8 pages) on cross-cutting case studies or specific research on topics including, but not limited to:

FAIR metrics; platforms, infrastructure and tools; lifecycles; access control and secure exchange; examples of exploitation and application; executable containers; metadata, packaging and formats; credit, attribution and peer review; dealing with scale and distribution; driving adoption within current scholarly communications and alignments with community efforts; and domain-specific and cross-domain Research Objects.


Submitted abstracts and articles can be in a range of open formats (e.g. HTML, ePub), and authors are particularly encouraged to submit them in a FAIR research data packaging format.

Accepted articles will be included in the IEEE eScience 2019 proceedings. Submitted preprints will, upon acceptance, be made available as Green Open Access on the RO2019 website with DOI links to the Zenodo record and (where applicable) the published IEEE proceeding article.

It is a requirement that at least one author of each accepted submission attends the RO2019 workshop at the IEEE eScience 2019 conference, for which registration fees apply.

Further details on submitting:

RO2019 encourages open peer review and recommends that reviewers are named and attributed; however, reviewers may be anonymous if so desired. Reviewers are welcome to publish their reviews using the same guidelines as the research articles.

Workshop organizers

  • Carole Goble (The University of Manchester, UK)
  • Raul Palma (Poznan Supercomputing and Networking Center, Poland)
  • Stian Soiland-Reyes (The University of Manchester, UK; Apache Software Foundation)
  • Daniel Garijo (University of Southern California, US)

For any questions, feel free to email the RO2019 Workshop Organizers at

2018-10-03 Being FAIR: Enabling Reproducible Data Science

Posted by & filed under Presentations.

On 3 October 2018 Carole Goble presented Being FAIR: Enabling Reproducible Data Science at The Early Detection of Cancer Conference in Portland, Oregon.

The talk was presented in the Data Science for Early Detection session, chaired by Brendan Delaney and Parag Mallick.

The session also included talks by Jim Davies, Imran Haque and Sylvia Plevritis.

The Early Detection of Cancer Conference is an international collaboration between The Knight Cancer Institute at Oregon Health & Science University, the Canary Center at Stanford University and Cancer Research UK.

2017-12-07 NIH Data Commons Pilots kick off

Posted by & filed under News.

On 2017-12-07, the NIH Data Commons kicked off its Pilot Phase (stage 1), in which a consortium will collaborate to develop the key capabilities.

The NIH Data Commons will be implemented in a four-year pilot phase to explore the feasibility and best practices for making digital objects available through collaborative platforms. This will be done on public clouds, which are virtual spaces where service providers make resources, such as applications and storage, available over the internet. The goal of the NIH Data Commons Pilot Phase is to accelerate biomedical discoveries by making biomedical research data Findable, Accessible, Interoperable, and Reusable (FAIR) for more researchers.

Quote from press release NIH awards to test ways to store, access, share, and compute on biomedical data in the cloud (2017-11-06), emphasis added.

The NIH Data Commons Pilot Phase Consortium (DCPPC) joins together nine strong teams with expertise in metadata and scalable data management.


2017-11-15 Managing Digital Research Objects in an Expanding Science Ecosystem

Posted by & filed under Event, Presentations.

On 2017-11-15, Carole Goble presented Research Objects: more than the sum of the parts at the Research Data Alliance workshop Managing Digital Research Objects in an Expanding Science Ecosystem in Bethesda, US.

Other presentations at the workshop included Julie McMurry on Identifiers for the 21st Century, NISO’s Todd Carpenter on Identify everything and DataCite’s Patricia Cruse on Persistent Identifiers. Dave Vieglais showed how DataOne are Connecting Users with Digital Research Objects, while Jim Myers showed how the National Data Service’s SEAD provides An Ecosystem Approach to Data Services and Digital Research Objects.

2017-11-27 BioCompute Objects

Posted by & filed under Event, News.

BioCompute Objects (BCO) is a community-driven project backed by the FDA (US Food and Drug Administration) and George Washington University to standardize exchange of High-Throughput-Sequencing workflows for regulatory submissions between FDA, pharma, bioinformatics platform providers and researchers.


Members of the Research Object team (Carole Goble, Stian Soiland-Reyes, Michael R Crusoe) have been collaborating closely with the rest of the BCO community since 2016, in particular covering the integration of BCO with existing standards like Research Object, W3C PROV and Common Workflow Language.

Vahan Simonyan from FDA presented BioCompute Objects to the GA4GH Cloud workstream webinar on 2017-11-27.

The blog post below is based on an extract of Vahan’s slides, adapted to cover my thoughts on the role of Research Objects with BCOs.


2017-10-24 Revamped ROHub portal officially released

Posted by & filed under News.

The completely renovated ROHub portal, developed by the EVER-EST project, includes a new and modern design, improved performance, plus a set of new features focused on improving the user experience.


ROHub was also presented at the IEEE eScience Conference and is described in the paper Towards a Human-Machine Scientific Partnership Based on Semantically Rich Research Objects.
