
Data Entities

Table of contents

  1. Referencing files and folders from the Root Data Entity
  2. Encoding file paths in @ids
  3. Example Attached RO-Crate Package
  4. Core Metadata for Data Entities
    1. File Data Entity
    2. Directory Data Entity
    3. Web-based Data Entities
    4. Data entities in an Attached RO-Crate that are also on the web
    5. Directories on the web; dataset distributions
      1. Downloadable dataset
  5. Adding detailed descriptions of File encodings
  6. File format profiles
  7. Referencing other RO-Crates
    1. Referencing RO-Crates that have a persistent identifier
    2. Determining entity identifier for a referenced RO-Crate
    3. Referencing another metadata document
    4. Profiles of referenced crates
    5. Retrieving an RO-Crate

The primary purpose for RO-Crate is to gather and describe a set of Data Entities in the form of:

  • Files which are datastreams available on the local file system or over the web
  • Directories

The Data Entities can be further described by referencing contextual entities such as persons, organizations and publications.

Referencing files and folders from the Root Data Entity

Where files and folders are represented as Data Entities in the RO-Crate JSON-LD, these MUST be linked to, either directly or indirectly, from the Root Data Entity using the hasPart property. Directory hierarchies MAY be represented with nested Dataset Data Entities, or the Root Data Entity MAY refer to files anywhere in the hierarchy using hasPart.

Data Entities representing files MUST have "File" as a value for @type. File is an RO-Crate alias for http://schema.org/MediaObject. The term File includes:

  • Resources which are available locally (applicable only in the context of Attached RO-Crate Packages) and
  • Web-based Data Entities which can be downloaded and saved as a file.

The rules for the @id property of Files are set out below.

In all cases, @type MAY be an array to also specify a more specific type, e.g. "@type": ["File", "ComputationalWorkflow"]

There is no requirement to represent every file and folder in an RO-Crate as Data Entities in the RO-Crate JSON-LD. Reasons for not describing files would include that the files:

  • are described in some other way, for example a manifest or another package management system,
  • are supporting files for a software application,
  • have metadata embedded in their filenames or paths which can be documented without having to describe every file,
  • have a purpose that is unknown to the crate author, but they need to be preserved as part of an archive.

In any of the above cases where files are not described, a directory containing a set of files MAY be described using a Dataset Data Entity that encapsulates the files with a description property that explains the contents. If the RO-Crate file structure is flat, or files are not grouped together, a description property on the Root Data Entity may be used, or a Dataset with a local reference beginning with # (e.g. to describe a certain type of file which occurs throughout the crate). This approach is recommended for RO-Crates which are to be deposited in a long-term archive.

Encoding file paths in @ids

Note that all @id identifiers must be valid URI references. Care must be taken to express any relative paths using the / separator and correct casing, and to escape special characters like space (%20) and percent (%25). For instance, a File Data Entity from the Windows path Results and Diagrams\almost-50%.png becomes "@id": "Results%20and%20Diagrams/almost-50%25.png" in the RO-Crate JSON-LD.

In this document the term URI includes international IRIs; the RO-Crate Metadata Document is always UTF-8 and international characters in identifiers SHOULD be written using native UTF-8 characters (IRIs), however traditional URL encoding of Unicode characters with % MAY appear in @id strings. Example: "@id": "面试.mp4" is preferred over the equivalent "@id": "%E9%9D%A2%E8%AF%95.mp4"
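
The encoding rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the specification: the helper name path_to_id is hypothetical, and it assumes that every backslash in the input is a Windows path separator.

```python
from urllib.parse import quote


def path_to_id(path: str) -> str:
    """Turn a relative file path into an @id string (hypothetical helper)."""
    # Assumption: any backslash is a Windows separator, so switch to "/".
    posix = path.replace("\\", "/")
    # Percent-encode ASCII characters that need escaping (space, %, ...),
    # but keep non-ASCII characters as native UTF-8 (IRI style).
    return "".join(ch if ord(ch) > 127 else quote(ch, safe="/") for ch in posix)


print(path_to_id("Results and Diagrams\\almost-50%.png"))
# Results%20and%20Diagrams/almost-50%25.png
print(path_to_id("面试.mp4"))
# 面试.mp4
```

Casing is deliberately left untouched; the path passed in is expected to match the casing of the file on disk.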

Example Attached RO-Crate Package

The following is an example of an Attached RO-Crate Package linking to a file and folders.

The file layout of the example is:

<RO-Crate root>/
  |   ro-crate-metadata.json
  |   cp7glop.ai
  |   lots_of_little_files/
  |    | file1
  |    | file2
  |    | ...
  |    | file54

An example RO-Crate JSON-LD for the above would be as follows.

This example contains both a File Data Entity and a Directory Data Entity.

{ "@context": "https://w3id.org/ro/crate/1.2-DRAFT/context",
  "@graph": [
    {
      "@type": "CreativeWork",
      "@id": "ro-crate-metadata.json",
      "conformsTo": {"@id": "https://w3id.org/ro/crate/1.2-DRAFT"},
      "about": {"@id": "./"}
    },  
    {
      "@id": "./",
      "@type": [
        "Dataset"
      ],
      "name": "Example Dataset",
      "datePublished": "2016-02-01",
      "author": {"@id": "https://orcid.org/0000-0003-4953-0830"},
      "license": "CC-BY",
      "hasPart": [
        {
          "@id": "cp7glop.ai"
        },
        {
          "@id": "lots_of_little_files/"
        }
      ]
    },
    {
      "@id": "cp7glop.ai",
      "@type": "File",
      "name": "Diagram showing trend to increase",
      "contentSize": "383766",
      "description": "Illustrator file for Glop Pot",
      "encodingFormat": "application/pdf"
    },
    {
      "@id": "lots_of_little_files/",
      "@type": "Dataset",
      "name": "Too many files",
      "description": "This directory contains many small files - the name of the file is a date in YYYY-MM-DD.csv, each file contains daily temperature readings, sampled hourly for the Glop Pot cave."
    },
    {
      "@id": "https://orcid.org/0000-0003-4953-0830",
      "@type": "Person",
      "name": "Michael Lake"
  ]
}
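
The requirement that described files and folders are linked from the Root Data Entity through hasPart, directly or indirectly, can be checked mechanically. The following is a rough sketch under stated assumptions: the function names are hypothetical, the graph is the parsed @graph array, the root is identified by "./", and hasPart values are {"@id": ...} objects.

```python
def as_list(value):
    """JSON-LD values may be a single item or an array; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def linked_from_root(graph, root_id="./"):
    """Return the @ids reachable from the root by following hasPart."""
    by_id = {entity["@id"]: entity for entity in graph}
    seen, todo = set(), [root_id]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        for part in as_list(by_id.get(current, {}).get("hasPart")):
            todo.append(part["@id"])
    return seen


def unlinked_data_entities(graph, root_id="./"):
    """List File/Dataset entities that are not linked from the root."""
    linked = linked_from_root(graph, root_id)
    return sorted(
        entity["@id"]
        for entity in graph
        if entity["@id"] not in linked
        and {"File", "Dataset"} & set(as_list(entity.get("@type")))
    )
```

For the example above, unlinked_data_entities would return an empty list, since both cp7glop.ai and lots_of_little_files/ appear in the root's hasPart.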

Core Metadata for Data Entities

File Data Entity

A File Data Entity MUST have the following properties:

  • @type: MUST be File, or an array where File is one of the values.
  • @id: MUST be a relative or absolute URI.

Further constraints on the @id are dependent on whether the File entity is being considered as part of an Attached RO-Crate Package or Detached RO-Crate Package.

  1. For an Attached RO-Crate Package:
    • @id MUST be one of either:
      a. A relative URI, indicating that a file MUST be present at the path @id relative to the RO-Crate Root.
      b. An absolute URI, indicating that the entity is a Web-based Data Entity.

    A File in an Attached RO-Crate Package MAY also have a contentUrl property which corresponds to a download link for the file. Following the link (allowing for HTTP redirects) SHOULD directly download the file.

  2. For a Detached RO-Crate Package, @id MUST be an absolute URI; all File Data Entities are Web-based Data Entities.

Additionally, File entities SHOULD have:

  • name giving a human readable name (not necessarily the filename)
  • description giving a longer description, e.g. the role of this file within this crate
  • encodingFormat indicating the IANA media type as a string (e.g. “text/plain”) and/or a reference to a file format contextual entity.
  • conformsTo to a contextual entity of type Profile that indicates a profile of the encoding format, if applicable
  • contentSize with the size of the file in bytes

RO-Crate’s File is an alias for the schema.org type MediaObject; any of its properties MAY also be used (adding contextual entities as needed). Files on the web SHOULD also use identifier, url, subjectOf, and/or mainEntityOfPage.
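
As an illustration of the MUST-level rules for File entities, the sketch below reports problems for a single entity. The function names are hypothetical, an absolute URI is approximated as "has a URI scheme", and the check that a relative @id actually exists on disk in an Attached RO-Crate Package is left out.

```python
from urllib.parse import urlsplit


def is_absolute_uri(uri: str) -> bool:
    """Approximation: treat any reference with a scheme as an absolute URI."""
    return bool(urlsplit(uri).scheme)


def file_entity_problems(entity: dict, attached: bool) -> list:
    """Return a list of violated MUST rules for a File Data Entity."""
    problems = []
    types = entity.get("@type", [])
    types = [types] if isinstance(types, str) else types
    if "File" not in types:
        problems.append("@type must be File or include File")
    entity_id = entity.get("@id")
    if not entity_id:
        problems.append("@id is missing")
    elif not attached and not is_absolute_uri(entity_id):
        # Detached packages only contain Web-based Data Entities.
        problems.append("@id must be an absolute URI in a Detached RO-Crate Package")
    return problems
```

An empty list means that no problem was found by this sketch, not that the entity is fully valid.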

Directory Data Entity

A Dataset (directory) Data Entity MUST have the following properties:

  • @type MUST be Dataset or an array where Dataset is one of the values.
  • @id MUST be either:
    • In an Attached RO-Crate Package ONLY - a URI Path that SHOULD end with /. This MUST resolve to a directory which is present in the RO-Crate Root along with its parent directories.
    • An absolute URI which SHOULD resolve to a programmatic listing of the content of the “directory” (e.g. another RO-Crate).
    • A local reference beginning with #

For a Detached RO-Crate Package:

  • The localPath property MAY be used to indicate the directory path to use when converting from a Detached to an Attached RO-Crate Package.

Additionally, Dataset entities SHOULD have:

  • name giving a human readable name (not necessarily the directory name)
  • description giving a longer description, e.g. the content of this directory
  • hasPart listing directly contained Data Entities

Any of the properties of schema.org Dataset MAY additionally be used (adding contextual entities as needed). Directories on the web SHOULD also provide distribution.
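
The three permitted forms of a Dataset @id can be told apart with a small classifier. This is only a sketch: the function names and the returned labels are hypothetical, and it does not check that a directory path actually resolves to a directory in the RO-Crate Root.

```python
from urllib.parse import urlsplit


def dataset_id_kind(entity_id: str) -> str:
    """Classify a Dataset @id as one of the three forms listed above."""
    if entity_id.startswith("#"):
        return "local-reference"
    if urlsplit(entity_id).scheme:
        return "absolute-uri"
    return "directory-path"


def dataset_id_warnings(entity_id: str, attached: bool) -> list:
    """Return warnings for a Dataset @id; an empty list means none found."""
    warnings = []
    if dataset_id_kind(entity_id) == "directory-path":
        if not attached:
            warnings.append("relative directory paths are only for Attached RO-Crate Packages")
        if not entity_id.endswith("/"):
            warnings.append("directory @id SHOULD end with /")
    return warnings
```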

If the dataset contained a large number of *.ai files which were spread throughout the crate structure and which did not have File Data Entities, then one approach to describing them would be:

{
    "@id": "./",
    "@type": [
        "Dataset"
    ],
    "hasPart": [
        {
            "@id": "#ai-files"
        }
    ]
},
{
    "@id": "#ai-files",
    "@type": "Dataset",
    "name": ".ai Files",
    "description": "This dataset contains some files with the extension '.ai' which despite their extension have an encoding format of 'application/pdf'. These have yet to be catalogued."
}

Web-based Data Entities

Using Web-based Data Entities can be important particularly where a file can’t be included in the RO-Crate Root because of licensing concerns, large data sizes, privacy, or where it is desirable to link to the latest online version.

Example of an RO-Crate including a File Data Entity external to the RO-Crate Root (file entity https://zenodo.org/record/3541888/files/ro-crate-1.0.0.pdf):

{ "@context": "https://w3id.org/ro/crate/1.2-DRAFT/context",
  "@graph": [
  {
    "@type": "CreativeWork",
    "@id": "ro-crate-metadata.json",
    "conformsTo": {"@id": "https://w3id.org/ro/crate/1.2-DRAFT"},
    "about": {"@id": "./"}
  },
  {
    "@id": "./",
    "@type": [
      "Dataset"
    ],
    "hasPart": [
      {
        "@id": "survey-responses-2019.csv"
      },
      {
        "@id": "https://zenodo.org/record/3541888/files/ro-crate-1.0.0.pdf"
      }
    ]
  },
  {
    "@id": "survey-responses-2019.csv",
    "@type": "File",
    "name": "Survey responses",
    "contentSize": "26452",
    "encodingFormat": "text/csv"
  },
  {
    "@id": "https://zenodo.org/record/3541888/files/ro-crate-1.0.0.pdf",
    "@type": "File",
    "name": "RO-Crate specification",
    "contentSize": "310691",
    "description": "RO-Crate specification",
    "encodingFormat": "application/pdf"
  }
]
}

Additional care SHOULD be taken to improve persistence and long-term preservation of web resources included in an RO-Crate, as they can be more difficult to archive or move along with the RO-Crate Root, and may change intentionally or unintentionally, leaving the RO-Crate with incomplete or outdated information.

File Data Entities with an @id URI outside the RO-Crate Root SHOULD at the time of RO-Crate creation be directly downloadable by a simple non-interactive retrieval (e.g. HTTP GET) of a single data stream, permitting redirections and HTTP/HTTPS authentication. For instance, in the example above, https://zenodo.org/record/3541888 and https://doi.org/10.5281/zenodo.3541888 cannot be used as @id as retrieving these URLs gives a HTML landing page rather than the desired PDF as indicated by encodingFormat.
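
One way to spot a landing page standing in for the file is to compare the Content-Type returned by the server with the declared encodingFormat. The sketch below assumes the caller has already performed the retrieval (following redirects) and passes in the header value; the function name is hypothetical.

```python
def looks_directly_downloadable(content_type: str, encoding_format) -> bool:
    """Compare an HTTP Content-Type header with an entity's encodingFormat."""
    # Drop parameters such as "; charset=UTF-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    expected = encoding_format if isinstance(encoding_format, list) else [encoding_format]
    # Only string values are media types; {"@id": ...} values are format entities.
    declared = {item.lower() for item in expected if isinstance(item, str)}
    return media_type in declared
```

A mismatch is a hint rather than proof, since some servers report a generic type such as application/octet-stream for any download.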

As files on the web may change, the timestamp property sdDatePublished SHOULD be included to indicate when the absolute URL was accessed, and derived metadata like encodingFormat and contentSize were considered to be representative:

  {
    "@id": "https://zenodo.org/record/3541888/files/ro-crate-1.0.0.pdf",
    "@type": "File",
    "name": "RO-Crate specification",
    "contentSize": "310691",
    "encodingFormat": "application/pdf",
    "sdDatePublished": "2020-04-09T13:09:21+01:00"
  }
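
A tool that records web-based entities could set sdDatePublished at retrieval time. The following sketch uses only the Python standard library; the helper name stamp_retrieval is hypothetical, and rejecting timezone-naive datetimes is a design choice of this sketch rather than a requirement stated here.

```python
from datetime import datetime, timezone


def stamp_retrieval(entity: dict, when: datetime = None) -> dict:
    """Return a copy of the entity with sdDatePublished set to `when`."""
    when = when or datetime.now(timezone.utc)
    if when.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    stamped = dict(entity)
    # ISO 8601 timestamp with an explicit UTC offset, seconds precision.
    stamped["sdDatePublished"] = when.replace(microsecond=0).isoformat()
    return stamped
```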

Web-based entities MAY use the property localPath to indicate a path that can be used when downloading the data in an Attached RO-Crate Package context. This may be used to instantiate local copies of web-based resources in an Attached RO-Crate Package or as part of a process to download local resources from a Detached RO-Crate Package relative to a new root directory.

  {
    "@id": "https://zenodo.org/record/3541888/files/ro-crate-1.0.0.pdf",
    "localPath": "docs/ro-crate-1.0.0.pdf",
    "@type": "File",
    "name": "RO-Crate specification",
    "contentSize": "310691",
    "encodingFormat": "application/pdf",
    "sdDatePublished": "2020-04-09T13:09:21+01:00"
  }
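
When instantiating local copies, a consumer needs to turn localPath into a destination below the new root directory. The sketch below shows one cautious way to do that; the function name is hypothetical, and refusing absolute paths and ".." segments is a safety assumption of this sketch, not a rule stated in this section.

```python
from pathlib import PurePosixPath


def local_destination(entity: dict):
    """Return the relative path to save a web-based entity to, or None."""
    local_path = entity.get("localPath")
    if local_path is None:
        return None
    path = PurePosixPath(local_path)
    # Assumption: never write outside the directory chosen as the new root.
    if path.is_absolute() or ".." in path.parts:
        raise ValueError(f"localPath escapes the RO-Crate Root: {local_path}")
    return str(path)
```

The returned value is relative and is meant to be joined to whichever directory becomes the RO-Crate Root.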

Data entities in an Attached RO-Crate that are also on the web

File Data Entities that are present as local files may already have a corresponding web presence, for instance a landing page that describes the file, including persistent identifiers (e.g. DOI) resolving to an intermediate HTML page instead of the downloadable file directly.

These MAY be included for File Data Entities as additional metadata, regardless of whether the File is included in the RO-Crate Root directory or exists on the Web, by using the properties:

  • identifier for formal identifier strings such as DOIs
  • contentUrl with a string URL corresponding to a download link. Following the link (allowing for HTTP redirects) SHOULD directly download the file.
  • url with a string URL for a download/landing page for this particular file (e.g. direct download is not available)
  • subjectOf to a CreativeWork (or WebPage) that mentions this file or its content (but also other resources)
  • mainEntityOfPage to a CreativeWork (or WebPage) that primarily describes this file (or its content)

Note that if a local file is intended to be packaged within an Attached RO-Crate Package, the @id property MUST be a URI Path relative to the RO-Crate Root, for example survey-responses-2019.csv as in the example below, where the contentUrl points to a download endpoint as a string.

  {
    "@id": "survey-responses-2019.csv",
    "@type": "File",
    "name": "Survey responses",
    "encodingFormat": "text/csv",
    "contentUrl": "http://example.com/downloads/2019/survey-responses-2019.csv",
    "subjectOf": {"@id": "http://example.com/reports/2019/annual-survey.html"}
  },
  {
    "@id": "http://example.com/reports/2019/annual-survey.html",
    "@type": "WebPage",
    "name": "Survey responses (landing page)"
  }

Directories on the web; dataset distributions

A Directory Data Entity or Dataset identifier expressed as an absolute URL on the web can be harder to download than a File because it consists of multiple resources. It is RECOMMENDED that such directories have a complete listing of their content in hasPart, enabling download traversal, or are themselves RO-Crates (see Referencing other RO-Crates).

Downloadable dataset

Alternatively, a common mechanism to provide downloads of a reasonably sized directory is as an archive file in formats such as application/zip or application/gzip, described as a DataDownload.

  {
    "@id": "lots_of_little_files/",
    "@type": "Dataset",
    "name": "Too many files",
    "description": "This directory contains many small files, that we're not going to describe in detail.",
    "distribution": {"@id": "http://example.com/downloads/2020/lots_of_little_files.zip"}
  },
  {
    "@id": "http://example.com/downloads/2020/lots_of_little_files.zip",
    "@type": "DataDownload",
    "encodingFormat": ["application/zip", {"@id": "https://www.nationalarchives.gov.uk/PRONOM/x-fmt/263"}],
    "contentSize": "82818928"
  }

Similarly, the RO-Crate Root entity (or a reference to another RO-Crate as a Dataset) may provide a distribution URL, in which case the download SHOULD be an archive that contains the RO-Crate Metadata Document (either directly in the archive’s root, or within a single folder in the archive), indicated by a version-less conformsTo:

  {
    "@id": "./",
    "@type": "Dataset",
    "identifier": "https://doi.org/10.48546/workflowhub.workflow.775.1",
    "name": "Research Object Crate for Jupyter Notebook Molecular Structure Checking",
    "distribution": {"@id": "https://workflowhub.eu/workflows/775/ro_crate?version=1"},
    "…": ""
  },
  {
    "@id": "https://workflowhub.eu/workflows/775/ro_crate?version=1",
    "@type": "DataDownload",
    "encodingFormat": ["application/zip", {"@id": "https://www.nationalarchives.gov.uk/PRONOM/x-fmt/263"}],
    "conformsTo": { "@id": "https://w3id.org/ro/crate" }
  }

In all cases, consumers should be aware that a DataDownload is a snapshot that may not reflect the current state of the Dataset or RO-Crate.

Adding detailed descriptions of File encodings

The Example Attached RO-Crate Package above provides a media type for the file cp7glop.ai, which is useful as it may not be apparent from the extension alone that the file is readable as a PDF file. To add more detail, encodings SHOULD be linked using a PRONOM identifier to a Contextual Entity with @type array containing WebPage and Standard.

  {
    "@id": "cp7glop.ai",
    "@type": "File",
    "name": "Glop Plot map",
    "contentSize": "383766",
    "description": "Illustrator file for Glop Pot",
    "encodingFormat": ["application/pdf", {"@id": "https://www.nationalarchives.gov.uk/PRONOM/fmt/19"}]
  },
  {
    "@id": "https://www.nationalarchives.gov.uk/PRONOM/fmt/19",
    "name": "Acrobat PDF 1.5 - Portable Document Format",
    "@type": ["WebPage", "Standard"]
  }

If there is no PRONOM identifier (and typically no media type string), then a contextual entity with a different URL as an @id MAY be used, e.g. documentation page of a software’s file format. The contextual entity SHOULD NOT include Standard in its @type if the page does not sufficiently document the format. The @type SHOULD include WebPage, or MAY include WebPageElement to indicate a section of the page.

For example, .trr is an internal GROMACS file format that is not further documented as a standard, but is referenced from a WebPageElement addressable by an #anchor:

 {
    "@id": "traj.trr",
    "@type": "File",
    "name": "Trajectory",
    "description": "Trajectory of molecular dynamics simulation using GROMACS",
    "contentSize": "45512",
    "encodingFormat": {"@id": "https://manual.gromacs.org/documentation/2021/reference-manual/file-formats.html#trr"}
  },
  {
    "@id": "https://manual.gromacs.org/documentation/2021/reference-manual/file-formats.html#trr",
    "@type": "WebPageElement",
    "name": "GROMACS trajectory of a simulation (trr)"
  }

If there is no web-accessible description for a file format it SHOULD be described locally in the RO-Crate, for example in a Markdown file:

 {
    "@id": "some-file.some_extension",
    "@type": "File",
    "name": "Some file",
    "description": "A file in a non-standard format",
    "contentSize": "120",
    "encodingFormat": ["text/plain", {"@id": "some_extension.md"}]
  },
  {
    "@id": "some_extension.md",
    "@type": ["File", "CreativeWork"],
    "name": "Description of some_extension text-based file format",
    "encodingFormat": "text/markdown"
  }
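
Because encodingFormat may be a plain media type string, a reference to a format entity, or an array mixing both, consumers typically need to normalise it before use. A minimal sketch, with a hypothetical function name:

```python
def split_encoding_format(entity: dict):
    """Split encodingFormat into (media type strings, format entity @ids)."""
    value = entity.get("encodingFormat", [])
    items = value if isinstance(value, list) else [value]
    media_types = [item for item in items if isinstance(item, str)]
    format_ids = [item["@id"] for item in items if isinstance(item, dict) and "@id" in item]
    return media_types, format_ids
```

For the some-file.some_extension entity above this would give (["text/plain"], ["some_extension.md"]), and for traj.trr it would give an empty list of media types with the GROMACS anchor URL as the only format @id.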

File format profiles

Some generic file formats like application/json may be specialized using a profile document that defines expectations for the file’s content as expected by some applications, by using conformsTo to a contextual entity with types CreativeWork and Profile:

 { 
  "@id": "attributes.csv",
  "@type": "File",
  "encodingFormat": ["text/csv", {"@id": "https://www.nationalarchives.gov.uk/PRONOM/x-fmt/18"}],
  "conformsTo": {"@id": "https://docs.ropensci.org/dataspice/#create-spice"}
 },
 {
  "@id": "https://docs.ropensci.org/dataspice/#create-spice",
  "@type": ["CreativeWork", "Profile"],
  "name": "dataspice CSV profile"
 }

Referencing other RO-Crates

A referenced RO-Crate is also a Dataset Data Entity, but one where hasPart does not need to be listed. Instead, its content and further metadata are available from its own RO-Crate Metadata Document, which may be retrieved or packaged within an archive. An entity representing a referenced RO-Crate SHOULD have conformsTo pointing to the generic RO-Crate profile using the fixed URI https://w3id.org/ro/crate.

This section defines how a referencing RO-Crate (“A”) can declare Data Entities within A’s RO-Crate Metadata Document, in order to indicate a referenced RO-Crate (“B”). There are different options on how to find the identifier to assign to B in A, and how a consumer of A finding such a reference can find the corresponding RO-Crate Metadata Document for B.

Referencing RO-Crates that have a persistent identifier

If the referenced RO-Crate B has an identifier declared as B’s Root Data Entity identifier, then this is a persistent identifier which SHOULD be used as the URI in the @id of the corresponding entity in RO-Crate A. For instance, if RO-Crate B had declared the identifier https://pid.example.com/another-crate/ then RO-Crate A can reference B as an entity:

{
  "@id": "https://pid.example.com/another-crate/",
  "@type": "Dataset",
  "conformsTo": { "@id": "https://w3id.org/ro/crate" }
}

Consumers that find a reference to a Dataset with the generic RO-Crate profile indicated MAY attempt to resolve the persistent identifier, but SHOULD NOT assume that the @id directly resolves to an RO-Crate Metadata Document. See section Retrieving an RO-Crate below for the recommended algorithm.

If an identifier is not declared in a referenced RO-Crate B, follow the steps in Determining entity identifier for a referenced RO-Crate instead.

Determining entity identifier for a referenced RO-Crate

If the referenced RO-Crate B does not declare a resolvable identifier, additional steps are needed to find the correct @id to use:

  1. If RO-Crate A is an Attached RO-Crate Package and RO-Crate B is a nested folder within A (e.g. another-crate/), then B SHOULD be treated as an Attached RO-Crate Package (e.g. it has another-crate/ro-crate-metadata.json) and the relative path (another-crate/) SHOULD be used directly as @id of a Directory Data Entity within RO-Crate A.
  2. If B’s Root Data Entity has an @id that is an absolute URI, and that URI resolves according to Retrieving an RO-Crate, then that can be used as the @id of the Dataset entity in A, equivalent to the identifier case above.
    1. If the absolute URI has Signposting declared for a Link: with rel=cite-as, then that link MAY be considered as an equivalent permalink for B and no further properties are needed.
    2. Otherwise, as the URI was not declared as a persistent identifier, the timestamp property sdDatePublished SHOULD be included to indicate when the absolute URI was accessed.
  3. If B’s RO-Crate Metadata Document was located on the Web, but uses a relative URI reference for its Root Data Entity (./), then its absolute URI can be determined from the RFC 3986 algorithm for establishing a base URI. For example, if root {"@id": "./" } is in metadata document http://example.com/another-crate/ro-crate-metadata.json, then the absolute URI for the Dataset entity is http://example.com/another-crate/ (with the trailing /). If that URI is resolvable as in point 2, it can be used as equivalent @id (with sdDatePublished declared if necessary). It is NOT RECOMMENDED to resolve a relative root identifier if the metadata document was retrieved from a URI that does not end with /ro-crate-metadata.json, /*-ro-crate-metadata.json or /ro-crate-metadata.jsonld – these are not part of a valid Attached or Detached RO-Crate Package.
  4. If RO-Crate B is not on the Web, and does not have a persistent identifier, e.g. is within a ZIP file or local file system, then a non-resolvable identifier could be established. See appendix Establishing a base URI inside a ZIP file, e.g. arcp://uuid,b7749d0b-0e47-5fc4-999d-f154abe68065/ if using a randomly generated UUID. This method may also be used if the above steps fail for an RO-Crate Metadata Document that is on the Web. In this case, the referenced RO-Crate entity MUST either declare a referenced metadata document or distribution.
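
Steps 3 and 4 above can be sketched with the Python standard library, whose urljoin resolves relative references against a base URI. The function names are hypothetical, and the filename pattern is one reading of the three endings listed in step 3.

```python
import re
import uuid
from urllib.parse import urljoin, urlsplit

# ro-crate-metadata.json, <prefix>-ro-crate-metadata.json, ro-crate-metadata.jsonld
_METADATA_NAME = re.compile(r"^(.+-)?ro-crate-metadata\.json$|^ro-crate-metadata\.jsonld$")


def root_uri_from_metadata_url(metadata_url: str, root_id: str = "./"):
    """Resolve a relative root @id against the metadata document's URL.

    Returns None when the URL does not end with a recognised metadata
    filename, since resolving is NOT RECOMMENDED in that case.
    """
    filename = urlsplit(metadata_url).path.rsplit("/", 1)[-1]
    if not _METADATA_NAME.match(filename):
        return None
    return urljoin(metadata_url, root_id)


def random_arcp_base() -> str:
    """Make a non-resolvable base identifier from a random (version 4) UUID."""
    return f"arcp://uuid,{uuid.uuid4()}/"


print(root_uri_from_metadata_url("http://example.com/another-crate/ro-crate-metadata.json"))
# http://example.com/another-crate/
```

The resolved URI still has to be checked for resolvability as in point 2 before it is used as an @id.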

Referencing another metadata document

If a referenced RO-Crate Metadata Document is known at a given URI or path, but its corresponding RO-Crate identifier can’t be determined as above (e.g. Retrieving an RO-Crate fails or requires heuristics), then a referenced metadata descriptor entity SHOULD be added. For instance, if http://example.com/another-crate/ro-crate-metadata.json resolves to an RO-Crate Metadata Document describing root ./, but http://example.com/another-crate/ always returns a HTML page without Signposting to the metadata document, then subjectOf SHOULD be added to an explicit metadata descriptor entity, which has encodingFormat declared for JSON-LD:

{
  "@id": "http://example.com/another-crate/",
  "@type": "Dataset",
  "conformsTo": { "@id": "https://w3id.org/ro/crate" },
  "subjectOf": { "@id": "http://example.com/another-crate/ro-crate-metadata.json" }
},
{
  "@id": "http://example.com/another-crate/ro-crate-metadata.json",
  "@type": "CreativeWork",
  "encodingFormat": "application/ld+json",
  "sdDatePublished": "2024-08-22T23:57:03+01:00"
}

Profiles of referenced crates

If the referenced crate conforms to a given RO-Crate profile, this MAY be indicated by expanding conformsTo on the Dataset to an array to reference the profile as a contextual entity:

{
  "@id": "https://doi.org/10.48546/workflowhub.workflow.26.1",
  "@type": "Dataset",
  "conformsTo": [
    { "@id": "https://w3id.org/ro/crate" },
    { "@id": "https://w3id.org/workflowhub/workflow-ro-crate/1.0"}
  ]
},
{ "@id": "https://w3id.org/workflowhub/workflow-ro-crate/1.0",
  "@type": ["CreativeWork", "Profile"],
  "name": "Workflow RO-Crate Profile",
  "version": "1.0"
}

Retrieving an RO-Crate

To resolve a reference to an RO-Crate, but where subjectOf or distribution is unknown (e.g. an RO-Crate is cited from a journal article), the below approach is recommended to retrieve its RO-Crate Metadata Document:

  1. Assuming the URI is a permalink, follow HTTP redirects without content negotiation, then try Signposting: look for HTTP Link headers with rel="describedby" pointing to an RO-Crate Metadata Document, or rel="item" pointing to a distribution archive. In either case prefer a link with profile="https://w3id.org/ro/crate" declared. For example, signposting for https://doi.org/10.48546/workflowhub.workflow.120.5 leads to the archive https://workflowhub.eu/workflows/120/ro_crate?version=5 as:

     curl --location --head https://doi.org/10.48546/workflowhub.workflow.120.5
    
     HTTP/2 302
     Location: https://workflowhub.eu/workflows/120?version=5
    
     HTTP/2 200
     Content-Type: text/html; charset=UTF-8
     Link: <https://workflowhub.eu/workflows/120/ro_crate?version=5> ;
           rel="item" ; type="application/zip" ;
           profile="https://w3id.org/ro/crate"
    
  2. HTTP Content-negotiation for the RO-Crate media type, for example:

    Requesting https://w3id.org/workflowhub/workflow-ro-crate/1.0 with HTTP header Accept: application/ld+json;profile=https://w3id.org/ro/crate redirects to the RO-Crate Metadata file https://about.workflowhub.eu/Workflow-RO-Crate/1.0/ro-crate-metadata.json

  3. The above approaches may fail or return a HTML page, e.g. for content-delivery networks that do not support content-negotiation.
  4. An optional heuristic fallback is to try resolving the path ./ro-crate-metadata.json from the resolved URI (after permalink redirects). For example:
    If permalink https://w3id.org/workflowhub/workflow-ro-crate/1.0 redirects to https://about.workflowhub.eu/Workflow-RO-Crate/1.0/index.html (a HTML page), then try retrieving https://about.workflowhub.eu/Workflow-RO-Crate/1.0/ro-crate-metadata.json.
  5. If the retrieved resource is a ZIP file (Content-Type: application/zip), then extract ro-crate-metadata.json, or, if the archive root only contains a single folder (e.g. folder1/), extract folder1/ro-crate-metadata.json
  6. If the retrieved resource is a BagIt archive, e.g. containing a single folder folder1 with folder1/bagit.txt, then extract and verify BagIt checksums before returning the bag’s data/ro-crate-metadata.json
  7. If the returned/extracted document is valid JSON-LD and has a Root Data Entity, this is the RO-Crate Metadata File.
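
Steps 5 and 7 can be sketched as follows, assuming the archive bytes have already been retrieved. The function names are hypothetical, BagIt handling (step 6) is not covered, and the Root Data Entity test is simplified to "a metadata descriptor whose about points at an entity in the graph".

```python
import io
import json
import zipfile

METADATA_NAME = "ro-crate-metadata.json"


def metadata_member(names):
    """Pick the metadata file from a list of ZIP member names (step 5)."""
    if METADATA_NAME in names:
        return METADATA_NAME
    # Otherwise accept it inside a single top-level folder, e.g. folder1/.
    top_level = {name.split("/", 1)[0] for name in names}
    if len(top_level) == 1:
        candidate = f"{top_level.pop()}/{METADATA_NAME}"
        if candidate in names:
            return candidate
    return None


def read_metadata_from_zip(data: bytes):
    """Extract and parse the metadata document, or return None if absent."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        member = metadata_member(archive.namelist())
        if member is None:
            return None
        return json.loads(archive.read(member))


def has_root_data_entity(document: dict) -> bool:
    """Simplified step 7: the descriptor's about must point into the graph."""
    graph = document.get("@graph", [])
    ids = {entity.get("@id") for entity in graph}
    for entity in graph:
        if entity.get("@id", "").endswith(METADATA_NAME):
            about = entity.get("about")
            if isinstance(about, dict) and about.get("@id") in ids:
                return True
    return False
```

This sketch only checks that the document parses as JSON; validating it as JSON-LD would need a JSON-LD processor.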