Introduction

This document specifies a method, known as RO-Crate (Research Object Crate), of aggregating and describing research data with associated metadata. RO-Crates can aggregate and describe any resource, including files and URI-addressable resources, and can use other addressing schemes to locate digital or physical data. RO-Crates can describe data both in aggregate and at the individual resource level, with metadata to aid in discovery and in the re-use and long-term management of data. Metadata provides the ability to describe the context of data and the entities involved in its production, use and reuse. For example: who created it, using which equipment, software and workflows, under which license it can be re-used, where it was collected, and/or what it is about.

RO-Crate uses JSON-LD to express this metadata as linked data. Data resources, as well as contextual entities such as people, organizations, software and equipment, are described as a series of linked JSON-LD objects using common published vocabularies, chiefly schema.org.

The core of RO-Crate is a JSON-LD file, the RO-Crate Metadata File, named ro-crate-metadata.json. This file contains structured metadata about the dataset as a whole (the Root Data Entity) and, optionally, about some or all of its files. This provides a simple way to, for example, assert the authors (e.g. people, organizations) of the RO-Crate or one of its files, or to capture more complex provenance for files, such as how they were created using software and equipment.
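
To make the shape of the RO-Crate Metadata File concrete, the following is a minimal Python sketch that builds one as a dictionary and prints it as JSON. It assumes the RO-Crate 1.1 context and profile URIs; substitute those of the version you target. The function name build_crate_metadata, the author identifier, the author name and the file path are hypothetical and exist only for illustration. The sketch omits properties that a conforming crate may need, such as a license and a publication date.

```python
import json

# Assumption: RO-Crate 1.1 URIs; replace with the version you target.
CONTEXT = "https://w3id.org/ro/crate/1.1/context"
PROFILE = "https://w3id.org/ro/crate/1.1"


def build_crate_metadata(name, description, author_id, author_name, file_path):
    """Return a dict with the structure of an ro-crate-metadata.json file.

    Hypothetical helper for illustration; not part of any RO-Crate library.
    """
    return {
        "@context": CONTEXT,
        "@graph": [
            {
                # Describes the metadata file itself and points to the root.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": PROFILE},
                "about": {"@id": "./"},
            },
            {
                # Root Data Entity: the dataset as a whole.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "description": description,
                "author": {"@id": author_id},
                "hasPart": [{"@id": file_path}],
            },
            {
                # A file in the crate, linked to its author by @id.
                "@id": file_path,
                "@type": "File",
                "author": {"@id": author_id},
            },
            {
                # Contextual entity: the person referenced above.
                "@id": author_id,
                "@type": "Person",
                "name": author_name,
            },
        ],
    }


# All values below are placeholders, not real data.
metadata = build_crate_metadata(
    name="Example crate",
    description="A placeholder dataset used to illustrate the file layout.",
    author_id="https://orcid.org/0000-0000-0000-0000",
    author_name="Example Author",
    file_path="data/measurements.csv",
)

print(json.dumps(metadata, indent=2))
```

Each entity is a flat JSON-LD object identified by its @id, and relationships such as author and hasPart are expressed as references to those identifiers rather than as nested objects. Writing the printed JSON to a file named ro-crate-metadata.json in the dataset's top-level directory would give the layout described above.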

While providing the formal specification for RO-Crate, this document also aims to be a practical guide for software authors to create tools for generating and consuming research data packages, with explanation by examples.