Root Data Entity
The Root Data Entity is a Dataset that represents the RO-Crate as a whole; a Research Object that includes the Data Entities and the related Contextual Entities.
RO-Crate Metadata Descriptor
The RO-Crate Metadata Document MUST contain a self-describing RO-Crate Metadata Descriptor with the @id value ro-crate-metadata.json (or ro-crate-metadata.jsonld in legacy crates for RO-Crate 1.0 or older) and @type CreativeWork. This descriptor MUST have an about property referencing the Root Data Entity’s @id.
{ "@context": "https://w3id.org/ro/crate/1.2-DRAFT/context",
  "@graph": [
    {
      "@type": "CreativeWork",
      "@id": "ro-crate-metadata.json",
      "about": {"@id": "./"},
      "conformsTo": {"@id": "https://w3id.org/ro/crate/1.2-DRAFT"}
    },
    {
      "@id": "./",
      "@type": "Dataset",
      ...
    }
  ]
}
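The MUST-level requirements on the descriptor can be checked mechanically. The following is a minimal sketch, not a reference validator: the function name check_metadata_descriptor and its error strings are hypothetical, it assumes the metadata document has already been parsed into a Python dict, and it assumes (beyond what is stated above) that @type may also appear as an array.

```python
def check_metadata_descriptor(crate: dict) -> list:
    """Return a list of problems found with the RO-Crate Metadata Descriptor."""
    graph = crate.get("@graph", [])
    descriptor = next(
        (e for e in graph if e.get("@id") == "ro-crate-metadata.json"), None
    )
    if descriptor is None:
        return ["no entity with @id 'ro-crate-metadata.json'"]

    problems = []
    # Assumption: accept @type given either as a string or as an array.
    types = descriptor.get("@type")
    types = [types] if isinstance(types, str) else (types or [])
    if "CreativeWork" not in types:
        problems.append("@type does not include CreativeWork")

    # 'about' must reference the @id of the Root Data Entity.
    about = descriptor.get("about")
    root_id = about.get("@id") if isinstance(about, dict) else None
    if root_id is None:
        problems.append("'about' does not reference an @id")
    elif not any(e.get("@id") == root_id for e in graph):
        problems.append(f"'about' points to {root_id!r}, which is not in @graph")
    return problems
```

An empty list means the descriptor passed these checks. Legacy ro-crate-metadata.jsonld descriptors are deliberately not handled in this sketch.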
The identifier ro-crate-metadata.json MUST be used within the RO-Crate JSON-LD.
The conformsTo of the RO-Crate Metadata Descriptor SHOULD have a single value which is a versioned permalink URI of the RO-Crate specification that the RO-Crate JSON-LD conforms to. The URI SHOULD start with https://w3id.org/ro/crate/.
conformsTo MAY be an array and can include RO-Crate profiles in addition to the base specification. In version 1.2, it is now recommended that profile declarations are included on the Root Data Entity instead (see Direct Properties of the Root Data Entity).
Finding the Root Data Entity
Consumers processing the RO-Crate as a JSON-LD graph can find the Root Data Entity by following this algorithm:
- For each entity in @graph array
  - if the @id is ro-crate-metadata.json
    - from this entity’s about object, keep the @id URI as variable root
  - if the @id is ro-crate-metadata.jsonld
    - from this entity’s about object, keep the @id URI as variable legacyroot
- For each entity in @graph array
  - if the entity has an @id URI that matches a non-null root return it
- For each entity in @graph array
  - if the entity has an @id URI that matches a non-null legacyroot return it
- Fail with unknown root data entity.
Note that the above can be implemented efficiently by first building a map (entity_map) of all entities using their @id as keys (which is typically also helpful for further processing) and then performing a series of lookups.
Ignoring the legacy case for now, this lookup code could be:
metadata_entity = entity_map["ro-crate-metadata.json"]
root_entity = entity_map[metadata_entity["about"]["@id"]]
See also the appendix on finding RO-Crate Root in RDF triple stores.
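Extending the lookup to the legacy case, the whole algorithm could be sketched as below. This is an illustration under assumptions rather than a normative implementation: the function name find_root_entity is hypothetical, the crate is assumed to be an already-parsed JSON object, and failure is signalled with a ValueError.

```python
def find_root_entity(crate: dict) -> dict:
    """Find the Root Data Entity via the metadata descriptor's 'about'."""
    # Map every entity by its @id, as suggested in the note above.
    entity_map = {e["@id"]: e for e in crate["@graph"] if "@id" in e}

    # Try the current descriptor name first, then the legacy one,
    # mirroring the root / legacyroot precedence of the algorithm.
    for descriptor_id in ("ro-crate-metadata.json", "ro-crate-metadata.jsonld"):
        descriptor = entity_map.get(descriptor_id)
        if descriptor is None:
            continue
        about = descriptor.get("about")
        root_id = about.get("@id") if isinstance(about, dict) else None
        if root_id in entity_map:
            return entity_map[root_id]

    raise ValueError("unknown root data entity")
```

Because the descriptor is consulted by @id rather than by position, the order of entities in @graph does not matter.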
Purpose of Metadata Document
To ensure baseline interoperability between RO-Crates, a minimum set of metadata is required for the Root Data Entity. As stated earlier, the RO-Crate Metadata Document is not an exhaustive manifest or inventory; that is, it does not necessarily list or describe all files in the package. For this reason, there are no minimum metadata requirements in terms of describing Data Entities (files and folders) other than the Root Data Entity. Extensions of RO-Crate dealing with specific types of dataset may apply further constraints or requirements of metadata beyond the Root Data Entity (see the appendix Extending RO-Crate).
The RO-Crate Metadata Descriptor MAY contain information such as licensing for the RO-Crate Metadata Document if metadata is licensed separately from the crate’s Data Entities.
The section below outlines the properties that the Root Data Entity MUST have.
Direct properties of the Root Data Entity
The Root Data Entity MUST have all of the properties listed below. Each property also has requirements that apply to its value:
- @type: MUST be Dataset or an array that contains Dataset
- @id: SHOULD be the string ./ or an absolute URI (see below)
- name: SHOULD identify the dataset to humans well enough to disambiguate it from other RO-Crates
- description: SHOULD further elaborate on the name to provide a summary of the context in which the dataset is important.
- datePublished: MUST be a single string value in ISO 8601 date format, SHOULD be specified to at least the precision of a day, and MAY be a timestamp down to the millisecond.
- license: SHOULD link to a Contextual Entity or Data Entity in the RO-Crate Metadata Document with a name and description (see section on licensing). MAY, if necessary, be a textual description of how the RO-Crate may be used.
[...] Dataset to have a name and description.
Additional properties of schema.org types Dataset and CreativeWork MAY be added to further describe the RO-Crate as a whole, e.g. author, abstract, publisher. See the sections on contextual entities and provenance for further details.
If the RO-Crate conforms to one or more profiles, this should be described following the guidance in the section Declaring conformance of an RO-Crate profile.
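The property requirements in this section lend themselves to a simple automated check. The sketch below is only an approximation under stated assumptions: the name check_root_entity and its messages are hypothetical, the ISO_8601 regular expression accepts a simplified subset of ISO 8601 (year, year-month, date, or date-time with optional fraction and offset), and SHOULD-level guidance such as the quality of name or the target of license is not assessed.

```python
import re

# Simplified ISO 8601 pattern (an assumption, not the full standard):
# YYYY, YYYY-MM, YYYY-MM-DD, or a date-time with optional seconds,
# fractional seconds and time zone offset.
ISO_8601 = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?)?)?)?$"
)

REQUIRED = ("@id", "name", "description", "datePublished", "license")


def check_root_entity(root: dict) -> list:
    """Return a list of problems with the Root Data Entity's direct properties."""
    problems = []

    # @type MUST be Dataset or an array that contains Dataset.
    types = root.get("@type")
    types = [types] if isinstance(types, str) else (types or [])
    if "Dataset" not in types:
        problems.append("@type MUST be Dataset or an array containing Dataset")

    # All listed properties MUST be present.
    for prop in REQUIRED:
        if prop not in root:
            problems.append(f"missing required property: {prop}")

    # datePublished MUST be a single ISO 8601 string.
    date = root.get("datePublished")
    if date is not None and not (isinstance(date, str) and ISO_8601.match(date)):
        problems.append("datePublished MUST be a single ISO 8601 string")
    return problems
```

A year-only value such as "2017" passes this check, since day precision is a SHOULD rather than a MUST; a stricter tool might report it as a warning.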
Root Data Entity identifier
The Root Data Entity’s @id SHOULD be either ./ (indicating the directory of ro-crate-metadata.json is the RO-Crate Root), or an absolute URI.
[...] PropertyValue or MAY use a full persistent URL as the @id for the Root Data Entity.
[...] identifier to be plain string URIs. Clients SHOULD be permissive of an RO-Crate identifier being a string (which MAY be a URI), or a @id reference, which SHOULD be represented as a PropertyValue entity which MUST have a human readable value, and SHOULD have a url if the identifier is Web-resolvable. A citable representation of this persistent identifier MAY be given as a description of the PropertyValue, but as there are more than 10,000 known citation styles, no attempt should be made to parse this string.
Resolvable persistent identifiers and citation text
It is RECOMMENDED that resolving the identifier programmatically returns the RO-Crate Metadata Document or an archive (e.g. ZIP) that contains the RO-Crate Metadata File, using content negotiation and/or Signposting. With an RO-Crate identifier that is persistent and resolvable in this way from a URI, the Root Data Entity SHOULD indicate this using the cite-as property according to RFC8574. Likewise, an HTTP/HTTPS server of the resolved RO-Crate Metadata Document or archive (possibly after redirection) SHOULD indicate that persistent identifier in its Signposting headers using Link rel="cite-as".
The cite-as URI MAY go to a repository landing page, and MAY require authentication, but MUST ultimately have the RO-Crate as a downloadable item, which SHOULD be programmatically accessible through content negotiation or Signposting (Link rel="describedby" for an RO-Crate Metadata Document, or Link rel="item" for an archive). To instead associate a textual scholarly citation with a crate (e.g. a journal article), indicate a publication via the citation property.
Any entity which is a subclass of CreativeWork, including Datasets like the Root Data Entity, MAY have a creditText property which provides a textual citation for the entity.
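To illustrate how a client might read the Signposting relations mentioned in this section (cite-as, describedby, item), the sketch below groups the targets of an HTTP Link header by relation type. It is a simplified illustration, not a complete RFC 8288 parser: the function name signposting_links is hypothetical, the regular expressions assume well-formed headers without angle brackets inside parameters, and the example.org URLs in the sample header are made up.

```python
import re


def signposting_links(link_header: str) -> dict:
    """Group the target URIs of an HTTP Link header by relation type."""
    links = {}
    # Each link has the shape: <target>; rel="type1 type2"; other="params"
    for target, params in re.findall(r"<([^>]*)>([^<]*)", link_header):
        rel = re.search(r'\brel\s*=\s*"?([^";,]+)"?', params)
        if rel is None:
            continue
        # A single rel attribute may carry several space-separated types.
        for rel_type in rel.group(1).split():
            links.setdefault(rel_type.lower(), []).append(target)
    return links


# Hypothetical header a repository might send for a crate landing page.
header = (
    '<https://doi.org/10.4225/59/59672c09f4a4b>; rel="cite-as", '
    '<https://example.org/crate/ro-crate-metadata.json>; rel="describedby"; '
    'type="application/ld+json", '
    '<https://example.org/crate.zip>; rel="item"; type="application/zip"'
)
links = signposting_links(header)
print(links["cite-as"])      # ['https://doi.org/10.4225/59/59672c09f4a4b']
print(links["describedby"])  # ['https://example.org/crate/ro-crate-metadata.json']
```

In practice the header value would come from an HTTP response to a request on the persistent identifier; the fetching step is omitted here to keep the sketch free of network access.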
Minimal example of RO-Crate
The following RO-Crate Metadata Document represents a minimal description of an RO-Crate that also follows the identifier recommendations above for use in an Attached RO-Crate Package.
{ "@context": "https://w3id.org/ro/crate/1.2-DRAFT/context",
  "@graph": [
    {
      "@id": "ro-crate-metadata.json",
      "@type": "CreativeWork",
      "about": {"@id": "./"},
      "conformsTo": {"@id": "https://w3id.org/ro/crate/1.2-DRAFT"}
    },
    {
      "@id": "./",
      "@type": "Dataset",
      "identifier": {"@id": "https://doi.org/10.4225/59/59672c09f4a4b"},
      "cite-as": "https://doi.org/10.4225/59/59672c09f4a4b",
      "datePublished": "2017",
      "name": "Data files associated with the manuscript:Effects of facilitated family case conferencing for ...",
      "description": "Palliative care planning for nursing home residents with advanced dementia ...",
      "license": {"@id": "https://creativecommons.org/licenses/by-nc-sa/3.0/au/"},
      "creditText": "Agar, M. et al., 2017. Data supporting \"Effects of facilitated family case conferencing for advanced dementia: A cluster randomised clinical trial\". https://doi.org/10.4225/59/59672c09f4a4b"
    },
    {
      "@id": "https://creativecommons.org/licenses/by-nc-sa/3.0/au/",
      "@type": "CreativeWork",
      "description": "This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Australia License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/au/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.",
      "identifier": "https://creativecommons.org/licenses/by-nc-sa/3.0/au/",
      "name": "Attribution-NonCommercial-ShareAlike 3.0 Australia (CC BY-NC-SA 3.0 AU)"
    },
    {
      "@id": "https://doi.org/10.4225/59/59672c09f4a4b",
      "@type": "PropertyValue",
      "propertyID": "https://registry.identifiers.org/registry/doi",
      "value": "doi:10.4225/59/59672c09f4a4b",
      "url": "https://doi.org/10.4225/59/59672c09f4a4b"
    }
  ]
}
Alternatively, the following is also valid, this time using the DOI as the @id of the Root Data Entity:
{
  "@id": "ro-crate-metadata.json",
  "@type": "CreativeWork",
  "about": {"@id": "https://doi.org/10.4225/59/59672c09f4a4b"},
  "conformsTo": {"@id": "https://w3id.org/ro/crate/1.2-DRAFT"}
},
{
  "@id": "https://doi.org/10.4225/59/59672c09f4a4b",
  "@type": "Dataset",
  "cite-as": "https://doi.org/10.4225/59/59672c09f4a4b",
  "datePublished": "2017",
  "name": "Data files associated with the manuscript:Effects of facilitated family case conferencing for ...",
  "description": "Palliative care planning for nursing home residents with advanced dementia ...",
  "license": {"@id": "https://creativecommons.org/licenses/by-nc-sa/3.0/au/"},
  "creditText": "Agar, M. et al., 2017. Data supporting \"Effects of facilitated family case conferencing for advanced dementia: A cluster randomised clinical trial\". https://doi.org/10.4225/59/59672c09f4a4b"
}
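Because clients SHOULD be permissive about the form of identifier, a consumer may want one helper that copes with both examples above as well as plain string identifiers. The following is a sketch under assumptions: the function name persistent_identifier is hypothetical, a single identifier value is assumed, entity_map is the @id-keyed map described earlier, and treating an absolute http(s) @id on the Root Data Entity as its persistent identifier is a simplification that will not hold for every crate.

```python
from typing import Optional


def persistent_identifier(root: dict, entity_map: dict) -> Optional[str]:
    """Best-effort extraction of a crate's persistent identifier."""
    identifier = root.get("identifier")

    # Older style: a plain string, which may or may not be a URI.
    if isinstance(identifier, str):
        return identifier

    # Reference style: an @id pointing at a PropertyValue entity.
    if isinstance(identifier, dict) and "@id" in identifier:
        prop = entity_map.get(identifier["@id"], {})
        # Prefer the resolvable url, then the human readable value,
        # and fall back to the referenced @id itself.
        return prop.get("url") or prop.get("value") or identifier["@id"]

    # Assumption: an absolute http(s) @id doubles as the identifier.
    root_id = root.get("@id", "")
    if root_id.startswith(("http://", "https://")):
        return root_id
    return None
```

For both examples in this section the helper yields https://doi.org/10.4225/59/59672c09f4a4b: in the first through the url of the PropertyValue, in the second through the @id of the Root Data Entity. The description of the PropertyValue is never parsed, in line with the advice on citation strings.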