RO-Crate profiles

While RO-Crates can be considered general-purpose containers of arbitrary data and open-ended metadata, in practical use within a particular domain, application or framework it is beneficial to further constrain RO-Crate to a specific profile: a set of conventions, types and properties that one can minimally require and expect to be present in that subset of RO-Crates.

Defining and conforming to such a profile enables reliable programmatic consumption of an RO-Crate’s content, as well as consistent creation. For example, a form in a user interface can firmly suggest the required types and properties, and a rendering of an RO-Crate can more easily build rich UI components if it can reliably assume, for instance, that a Person always has an affiliation to an Organization which has a url. Such a restriction may not be appropriate for all types of RO-Crates.

As such, RO-Crate profiles can be considered a duck typing mechanism for RO-Crates, and also a classifier that indicates the crate’s purpose, expectations and focus.

Publishing an RO-Crate profile

An RO-Crate profile is identified with a Profile URI.

Recommendations:

  • The profile URI MUST resolve to a human-readable profile description (e.g. an HTML web page)
    • The profile URI MAY have a corresponding machine-readable Profile Crate
  • The profile URI SHOULD be a permalink (persistent identifier)
  • The profile URI SHOULD be versioned with MAJOR.MINOR, e.g. http://example.com/image-profile-2.4
  • The profile description SHOULD use key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL as described in [RFC2119].
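
The versioning recommendation above can be checked mechanically. The following is a minimal sketch, assuming the MAJOR.MINOR version is the last part of the URI and is preceded by "/" or "-", as in the two example URIs on this page; the helper name profile_version is hypothetical, and other URI layouts would need a different pattern.

```python
import re

# Assumption: the version is the final MAJOR.MINOR component of the URI,
# separated from the rest by "/" or "-" (an optional trailing "/" is allowed).
_VERSION_SUFFIX = re.compile(r"[/-](\d+)\.(\d+)/?$")


def profile_version(profile_uri):
    """Return (major, minor) parsed from the end of the URI, or None."""
    match = _VERSION_SUFFIX.search(profile_uri)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

A result of None only means that no version was found under the stated assumption; it does not by itself show that the profile URI is unversioned.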

Declaring conformance to an RO-Crate profile

An RO-Crate can describe a profile by adding it as a contextual entity:

{
    "@id": "https://w3id.org/ro/profile/paradisec/0.1",
    "@type": "CreativeWork",
    "name": "Profile for RO-Crates for PARADISEC repository",
    "version": "0.1.0"
}

RO-Crates conforming to (or intending to conform to) such a profile SHOULD expand the conformsTo declaration of the metadata file descriptor to be an array and include the profile identifier:

{
    "@type": "CreativeWork",
    "@id": "ro-crate-metadata.json",
    "about": {"@id": "./"},
    "conformsTo": [
        {"@id": "https://w3id.org/ro/crate/1.2-DRAFT"},
        {"@id": "https://w3id.org/ro/profile/paradisec/0.1"}
    ]
}

It is valid for a crate to conform to multiple profiles.

Note that although profile conformance is declared on the RO-Crate Metadata File ro-crate-metadata.json, the profile applies to the whole RO-Crate, and may cover aspects beyond the crate’s JSON-LD serialization (e.g. identifiers, packaging, purpose).
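
As an illustration of the conformsTo expansion described above, here is a minimal sketch that operates on an already parsed ro-crate-metadata.json document. The function name declare_conformance is hypothetical, and the sketch assumes the metadata file descriptor has the @id ro-crate-metadata.json and lives in a top-level @graph array.

```python
def declare_conformance(metadata, profile_uri, descriptor_id="ro-crate-metadata.json"):
    """Add profile_uri to the metadata descriptor's conformsTo array, in place.

    Raises StopIteration if the descriptor entity is not in the @graph.
    """
    descriptor = next(
        entity for entity in metadata["@graph"] if entity.get("@id") == descriptor_id
    )
    conforms_to = descriptor.get("conformsTo", [])
    # A single {"@id": ...} object is expanded to an array first.
    if isinstance(conforms_to, dict):
        conforms_to = [conforms_to]
    # Avoid listing the same profile twice.
    if not any(ref.get("@id") == profile_uri for ref in conforms_to):
        conforms_to.append({"@id": profile_uri})
    descriptor["conformsTo"] = conforms_to
    return metadata
```

Calling the function once per profile URI is enough to declare conformance to several profiles; the contextual entity describing each profile still has to be added to the @graph separately.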

Profile Crate

While the Profile URI @id can resolve to a human-readable profile description, it can additionally be made to resolve to a Profile Crate.

A Profile Crate is a type of RO-Crate that gathers resources which further define the profile. This allows alternative, machine-readable descriptions of the profile to be formalized (for instance for validation), and additional resources such as examples to be included.

How to retrieve a Profile Crate

To resolve a Profile URI to a machine-readable Profile Crate, two approaches are recommended to retrieve its RO-Crate metadata file, tried in order:

  1. HTTP content negotiation for the RO-Crate media type. For example:
    Requesting https://w3id.org/ro/profile/paradisec/0.1 with the HTTP header
    Accept: application/ld+json;profile=https://w3id.org/ro/crate redirects to the RO-Crate Metadata file https://example.org/ro-profiles/paradisec-0.1.0/ro-crate-metadata.json
  2. The above approach may fail (or return an HTML page), e.g. for content-delivery networks that do not support content negotiation. The fallback is to resolve the path ./ro-crate-metadata.json against the resolved URI (after permalink redirects). For example:
    If the permalink https://w3id.org/ro/profile/paradisec/0.1 redirects to https://example.org/ro-profiles/paradisec-0.1.0/, then get https://example.org/ro-profiles/paradisec-0.1.0/ro-crate-metadata.json
  3. If neither approach works, the profile probably does not have a corresponding Profile Crate. For human readers, display a hyperlink to its @id, labelled with its name.
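
The retrieval steps above can be sketched as follows. This is an illustrative outline rather than a reference implementation: the HTTP layer is abstracted as a caller-supplied fetch(url, accept) callable (a hypothetical interface) that follows redirects and returns (final_url, content_type, body) or raises OSError. Treating a plain application/json response as acceptable is also an assumption.

```python
from urllib.parse import urljoin

RO_CRATE_ACCEPT = "application/ld+json;profile=https://w3id.org/ro/crate"


def _looks_like_json_ld(content_type):
    # Assumption: plain application/json is accepted as well as JSON-LD.
    media_type = content_type.split(";")[0].strip().lower()
    return media_type in ("application/ld+json", "application/json")


def resolve_profile_crate(profile_uri, fetch):
    """Return (url, body) of the Profile Crate metadata file, or None."""
    final_url = profile_uri
    # Step 1: content negotiation on the Profile URI itself.
    try:
        final_url, content_type, body = fetch(profile_uri, RO_CRATE_ACCEPT)
        if _looks_like_json_ld(content_type):
            return final_url, body
    except OSError:
        pass
    # Step 2: resolve ./ro-crate-metadata.json against the redirected URI.
    fallback_url = urljoin(final_url, "ro-crate-metadata.json")
    try:
        url, content_type, body = fetch(fallback_url, RO_CRATE_ACCEPT)
        if _looks_like_json_ld(content_type):
            return url, body
    except OSError:
        pass
    # Step 3: no Profile Crate found; the caller can show a plain hyperlink.
    return None
```

Passing the fetch function in keeps the resolution logic independent of any particular HTTP client, and makes it straightforward to exercise with canned responses.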

What is included in the Profile Crate?

The following data entities are suggested for inclusion in a Profile Crate:

Profile description entity

A Profile Crate MUST declare a human-readable profile description, which is about this Profile Crate:

{
    "@id": "index.html",
    "@type": "File",
    "name": "PARADISEC profile description",
    "about": {"@id": "./"}
}

The profile description MAY be equivalent to the RO-Crate Website ro-crate-preview.html.

Profile Schema entity

An optional machine-readable schema of the profile, for instance a Describo JSON profile:

{
    "@id": "https://raw.githubusercontent.com/UTS-eResearch/describo/v0.13.0/src/components/profiles/paradisec.describo.profile.json",
    "@type": "File",
    "name": "PARADISEC profile for Describo",
    "encodingFormat": [
        "application/json", 
        {"@id": "https://github.com/UTS-eResearch/describo/wiki/dsp-index"}
    ]
},
{
    "@id": "https://github.com/UTS-eResearch/describo/wiki/dsp-index",
    "@type": "WebPage",
    "name": "Describo JSON profile"
}

A schema may formalize restrictions on the RO-Crate metadata file at the graph level (e.g. which types and properties are used) as well as at the serialization level (e.g. use of JSON arrays).

Below are known schema types and their suggested encodingFormat identifiers:

Name            Media Type                 URI
JSON Schema     application/schema+json    https://json-schema.org/draft/2020-12/schema
Describo        application/json           https://github.com/UTS-eResearch/describo/wiki/dsp-index
CheckMyCrate    application/json           https://github.com/KockataEPich/CheckMyCrate#profiles
SHACL           text/turtle                https://www.w3.org/TR/shacl/
ShEx            text/shex                  http://shex.io/shex-semantics/
BagIt Profile   application/json           https://bagit-profiles.github.io/bagit-profiles-specification/

Some of the above schema languages are based on general data structure syntaxes like application/json and text/turtle, and therefore have a generic Media Type accompanied by a specialized URI.
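
Because the media type alone can be ambiguous, a consumer would typically look at the URI in encodingFormat to decide which schema language it is dealing with. A minimal sketch, using the identifiers listed above; the names KNOWN_SCHEMA_TYPES and schema_type are hypothetical:

```python
# Specialized URIs from the list of known schema types above.
KNOWN_SCHEMA_TYPES = {
    "https://json-schema.org/draft/2020-12/schema": "JSON Schema",
    "https://github.com/UTS-eResearch/describo/wiki/dsp-index": "Describo",
    "https://github.com/KockataEPich/CheckMyCrate#profiles": "CheckMyCrate",
    "https://www.w3.org/TR/shacl/": "SHACL",
    "http://shex.io/shex-semantics/": "ShEx",
    "https://bagit-profiles.github.io/bagit-profiles-specification/": "BagIt Profile",
}


def schema_type(entity):
    """Return the schema language name for a schema entity, or None if unknown."""
    formats = entity.get("encodingFormat", [])
    if not isinstance(formats, list):
        formats = [formats]
    for fmt in formats:
        # Only {"@id": ...} references carry the specialized URI;
        # plain strings such as "application/json" are skipped.
        if isinstance(fmt, dict) and fmt.get("@id") in KNOWN_SCHEMA_TYPES:
            return KNOWN_SCHEMA_TYPES[fmt["@id"]]
    return None
```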

Software that works with the profile

Software that may consume/validate/generate RO-Crates following this profile (potentially using the schema):

{
      "@id": "https://arkisto-platform.github.io/describo/",
      "@type": "SoftwareApplication",
      "name": "Describo",
      "version": "0.13.0",
      "url": "https://arkisto-platform.github.io/describo/"
}

Repositories that expect the profile

A repository or collection within a repository that may accept/contain RO-Crates following this profile:

{
   "@id": "https://mod.paradisec.org.au/",
   "@type": "RepositoryCollection",
   "name": "Modern PARADISEC demonstrator",
   "description": "PARADISEC curates digital material about small or endangered languages",
   "publisher": {"@id": "https://paradisec.org.au/"}
}

BagIt packaging

If conforming RO-Crates should be packaged according to a BagIt profile (e.g. they must be serialized as application/zip), the Profile Crate can reference that BagIt profile:

{
   "@id": "https://w3id.org/ro/bagit/profile/0.3",
   "@type": "WebPage",
   "name":  "BagIt profile for RO-Crate in ZIP",
   "encodingFormat": [
        "application/json", 
        {"@id": "https://bagit-profiles.github.io/bagit-profiles-specification/"}
   ]   
}

Extension vocabularies

A profile that extends RO-Crate SHOULD indicate which vocabulary/ontology it uses as a DefinedTermSet:

{
    "@id": "https://w3id.org/ro/terms/test#",
    "@type": "DefinedTermSet",
    "name": "Namespace for workflow testing metadata",
    "url": "https://github.com/ResearchObject/ro-terms/tree/master/test"
}

The @id of the vocabulary SHOULD be the namespace, while url SHOULD go to a human-readable description of the vocabulary.

Extension terms

A profile that extends RO-Crate MAY indicate particular terms directly as DefinedTerm instances:

{
    "@id": "https://w3id.org/ro/terms/test#runsOn",
    "@type": "DefinedTerm",
    "termCode": "runsOn",
    "name": "Runs on",
    "description": "Service where the test instance is executed",
    "url": "https://lifemonitor.eu/workflow_testing_ro_crate#test-instance"
}

The termCode SHOULD be valid as a key in JSON-LD @context of conforming RO-Crates.
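
One way to read the termCode recommendation is that each DefinedTerm can be turned directly into a JSON-LD @context mapping from termCode to the term's @id. The following sketch illustrates that reading; the function name context_from_terms is hypothetical, and the input is assumed to be the @graph array of a Profile Crate.

```python
def context_from_terms(graph):
    """Build a {termCode: term URI} mapping from DefinedTerm entities."""
    context = {}
    for entity in graph:
        types = entity.get("@type", [])
        if isinstance(types, str):
            types = [types]
        # Skip entities that are not terms or lack a termCode or @id.
        if "DefinedTerm" in types and "termCode" in entity and "@id" in entity:
            context[entity["termCode"]] = entity["@id"]
    return context
```

A conforming RO-Crate could then list such a mapping alongside the RO-Crate context in its own @context array.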

JSON-LD Context

A profile that has a corresponding JSON-LD @context (e.g. to map its extension terms, or to suggest a version of RO-Crate’s official context) SHOULD indicate the context in the Profile Crate:

{
    "@id": "https://w3id.org/ro/crate/1.1/context",
    "@type": "CreativeWork",
    "name": "RO-Crate JSON-LD Context",
    "encodingFormat": [
        "application/ld+json",
        {"@id": "http://www.w3.org/ns/json-ld#Context"}
    ],
    "version": "1.1.1"
},
{
    "@id": "http://www.w3.org/ns/json-ld#Context",
    "@type": "Thing",
    "name": "JSON-LD Context",
    "url": "https://www.w3.org/TR/json-ld/"
}

The JSON-LD Context:

  • SHOULD have a permalink (persistent identifier) as @id
  • SHOULD use https rather than http, with a certificate commonly accepted by browsers
  • SHOULD have an @id URI that is versioned with MAJOR.MINOR, e.g. https://example.com/image-profile-2.4
  • SHOULD have a descriptive name
  • SHOULD have an encodingFormat that references the contextual entity http://www.w3.org/ns/json-ld#Context
  • MAY declare version according to Semantic Versioning
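
Several of these recommendations can be checked automatically for a context entity. The sketch below is a rough heuristic, not a validator: the function name check_context_entity is hypothetical, the version check only looks for a MAJOR.MINOR pattern anywhere in the URI path, and whether the @id is a genuine permalink or has a trusted certificate cannot be determined from the entity alone.

```python
import re
from urllib.parse import urlparse

JSONLD_CONTEXT_TYPE = "http://www.w3.org/ns/json-ld#Context"


def check_context_entity(entity):
    """Return a list of warnings for SHOULD recommendations that are not met."""
    warnings = []
    uri = entity.get("@id", "")
    if not uri.startswith("https://"):
        warnings.append("@id should use https")
    # Heuristic: look for MAJOR.MINOR somewhere in the path of the URI.
    if not re.search(r"\d+\.\d+", urlparse(uri).path):
        warnings.append("@id should be versioned with MAJOR.MINOR")
    if not entity.get("name"):
        warnings.append("name should be present and descriptive")
    formats = entity.get("encodingFormat", [])
    if not isinstance(formats, list):
        formats = [formats]
    if not any(
        isinstance(fmt, dict) and fmt.get("@id") == JSONLD_CONTEXT_TYPE
        for fmt in formats
    ):
        warnings.append("encodingFormat should reference " + JSONLD_CONTEXT_TYPE)
    return warnings
```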

Note that the referenced context URI does not have to match the @context of the Profile Crate itself.

The @context MAY be the Profile Crate’s Metadata JSON-LD file if it is resolvable as media type application/ld+json over HTTP.