Workflow Run Crate
- Version: 0.1
- Permalink: https://w3id.org/ro/wfrun/workflow/0.1
- Authors: Workflow Run RO-Crate working group
This profile uses terminology from the RO-Crate 1.1 specification.
Overview
This profile is used to describe the execution of a computational tool that has orchestrated the execution of other tools. Such a tool is represented as a workflow that can be executed using a workflow engine (e.g. cwltool).
This profile is a combination of Process Run Crate and Workflow RO-Crate. The entity referenced by the action’s instrument (which represents the software application that has been run) MUST be a ComputationalWorkflow that is further described according to the Workflow RO-Crate requirements. In particular, it MUST be the mainEntity of the RO-Crate. The crate SHOULD have only one CreateAction corresponding to the workflow’s execution. Details regarding the execution of individual workflow steps can be described with the Provenance Run Crate profile.
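The constraints in the paragraph above can be checked mechanically. The following is a minimal sketch, not part of the profile: the function names `as_list` and `check_main_workflow` are hypothetical, and the code assumes a flattened `@graph` in which the root data entity has the `@id` `./` and references are single `{"@id": ...}` objects.

```python
def as_list(value):
    """Normalize a JSON-LD property value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def check_main_workflow(graph):
    """Return a list of problems found in a flattened RO-Crate @graph."""
    by_id = {entity["@id"]: entity for entity in graph}
    problems = []
    # Assumption: the root data entity is identified by "./"
    main_id = by_id.get("./", {}).get("mainEntity", {}).get("@id")
    actions = [e for e in graph if "CreateAction" in as_list(e.get("@type"))]
    if len(actions) != 1:
        # The profile says SHOULD, so this is a warning rather than an error
        problems.append(f"warning: expected one CreateAction, found {len(actions)}")
    for action in actions:
        instrument_id = action.get("instrument", {}).get("@id")
        workflow = by_id.get(instrument_id, {})
        if "ComputationalWorkflow" not in as_list(workflow.get("@type")):
            problems.append(f"{action['@id']}: instrument is not a ComputationalWorkflow")
        if instrument_id != main_id:
            problems.append(f"{action['@id']}: instrument is not the mainEntity")
    return problems


# Hypothetical three-entity graph used only for illustration
graph = [
    {"@id": "./", "@type": "Dataset", "mainEntity": {"@id": "workflow.ga"}},
    {"@id": "workflow.ga", "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"]},
    {"@id": "#run-1", "@type": "CreateAction", "instrument": {"@id": "workflow.ga"}},
]
print(check_main_workflow(graph))  # prints []
```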
Workflows can have multiple input and output parameter slots that have to be mapped to actual files, directories or other values (e.g., a string or a number) before the workflow can be executed. Defining these parameters for a ComputationalWorkflow is OPTIONAL. If included, parameter definitions MUST be provided as FormalParameter entities and referenced from the ComputationalWorkflow via input and output (see the Bioschemas ComputationalWorkflow profile).
A data entity or PropertyValue that realizes a FormalParameter definition SHOULD refer to it via exampleOfWork; additionally, if the data entity or PropertyValue is an illustrative example of the parameter, the latter MAY refer back to the former using the reverse property workExample. This links the input of a ComputationalWorkflow to the object of a CreateAction, and the output of a ComputationalWorkflow to the result of a CreateAction. An object item that does not match a slot in the workflow’s input interface (e.g., a configuration file read from a predefined path) MUST NOT refer to a FormalParameter of the ComputationalWorkflow via exampleOfWork. A FormalParameter that maps to a PropertyValue SHOULD have a subclass of DataType (e.g., Integer), or PropertyValue in the case of dictionary-like structured types, as its additionalType. See CWL parameter mapping for an example.
Additional properties described in the Bioschemas FormalParameter profile (e.g., defaultValue) MAY be used to provide additional information, but strict conformance is not required. A FormalParameter definition that strictly conforms to the Bioschemas profile SHOULD reference the relevant versioned URL via conformsTo.
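The link between action items and parameter slots described above can be sketched as a small consistency check. This is an illustrative reading of the text, not a normative validator: `check_parameter_links` is a hypothetical name, `config.ini` is a made-up file standing in for "a configuration file read from a predefined path", and the check simply reports any exampleOfWork target of an object (or result) item that is not listed in the input (or output) of the action's instrument.

```python
def as_list(value):
    """Normalize a JSON-LD property value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def check_parameter_links(graph):
    """Report exampleOfWork targets that are not slots of the executed workflow."""
    by_id = {entity["@id"]: entity for entity in graph}
    problems = []
    for action in graph:
        if "CreateAction" not in as_list(action.get("@type")):
            continue
        workflow = by_id.get(action.get("instrument", {}).get("@id"), {})
        # object items map to input slots, result items map to output slots
        for action_prop, slot_prop in (("object", "input"), ("result", "output")):
            slots = {ref["@id"] for ref in as_list(workflow.get(slot_prop))}
            for ref in as_list(action.get(action_prop)):
                item = by_id.get(ref["@id"], {})
                for param in as_list(item.get("exampleOfWork")):
                    if param["@id"] not in slots:
                        problems.append(
                            f"{ref['@id']}: {param['@id']} is not listed in "
                            f"{slot_prop} of the workflow"
                        )
    return problems


graph = [
    {"@id": "Galaxy-Workflow-Hello_World.ga",
     "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
     "input": [{"@id": "#simple_input"}],
     "output": [{"@id": "#reversed"}]},
    {"@id": "#simple_input", "@type": "FormalParameter", "additionalType": "File"},
    {"@id": "#reversed", "@type": "FormalParameter", "additionalType": "File"},
    {"@id": "#run", "@type": "CreateAction",
     "instrument": {"@id": "Galaxy-Workflow-Hello_World.ga"},
     "object": [{"@id": "inputs/abcdef.txt"}, {"@id": "config.ini"}],
     "result": [{"@id": "outputs/tac_on_data_360_1.txt"}]},
    {"@id": "inputs/abcdef.txt", "@type": "File",
     "exampleOfWork": {"@id": "#simple_input"}},
    # Hypothetical configuration file: used by the run, matches no input slot,
    # so it carries no exampleOfWork
    {"@id": "config.ini", "@type": "File"},
    {"@id": "outputs/tac_on_data_360_1.txt", "@type": "File",
     "exampleOfWork": {"@id": "#reversed"}},
]
print(check_parameter_links(graph))  # prints []
```

An item such as `config.ini` passes the check precisely because it has no exampleOfWork; pointing it at `#reversed` would be reported, since that parameter is an output slot rather than an input slot.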
Example Metadata File (ro-crate-metadata.json)
{ "@context": "https://w3id.org/ro/crate/1.1/context",
  "@graph": [
    {
      "@id": "ro-crate-metadata.json",
      "@type": "CreativeWork",
      "about": {"@id": "./"},
      "conformsTo": [
        {"@id": "https://w3id.org/ro/crate/1.1"},
        {"@id": "https://w3id.org/workflowhub/workflow-ro-crate/1.0"}
      ]
    },
    {
      "@id": "./",
      "@type": "Dataset",
      "conformsTo": [
        {"@id": "https://w3id.org/ro/wfrun/process/0.1"},
        {"@id": "https://w3id.org/ro/wfrun/workflow/0.1"},
        {"@id": "https://w3id.org/workflowhub/workflow-ro-crate/1.0"}
      ],
      "hasPart": [
        {"@id": "Galaxy-Workflow-Hello_World.ga"},
        {"@id": "inputs/abcdef.txt"},
        {"@id": "outputs/Select_first_on_data_1_2.txt"},
        {"@id": "outputs/tac_on_data_360_1.txt"}
      ],
      "license": {"@id": "http://spdx.org/licenses/CC0-1.0"},
      "mainEntity": {"@id": "Galaxy-Workflow-Hello_World.ga"},
      "mentions": {"@id": "#wfrun-5a5970ab-4375-444d-9a87-a764a66e3a47"}
    },
    {
      "@id": "https://w3id.org/ro/wfrun/process/0.1",
      "@type": "CreativeWork",
      "name": "Process Run Crate",
      "version": "0.1"
    },
    {
      "@id": "https://w3id.org/ro/wfrun/workflow/0.1",
      "@type": "CreativeWork",
      "name": "Workflow Run Crate",
      "version": "0.1"
    },
    {
      "@id": "https://w3id.org/workflowhub/workflow-ro-crate/1.0",
      "@type": "CreativeWork",
      "name": "Workflow RO-Crate",
      "version": "1.0"
    },
    {
      "@id": "Galaxy-Workflow-Hello_World.ga",
      "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
      "name": "Hello World (Galaxy Workflow)",
      "author": {"@id": "https://orcid.org/0000-0001-9842-9718"},
      "creator": {"@id": "https://orcid.org/0000-0001-9842-9718"},
      "programmingLanguage": {"@id": "https://w3id.org/workflowhub/workflow-ro-crate#galaxy"},
      "input": [
        {"@id": "#simple_input"},
        {"@id": "#verbose-param"}
      ],
      "output": [
        {"@id": "#reversed"},
        {"@id": "#last_lines"}
      ]
    },
    {
      "@id": "#simple_input",
      "@type": "FormalParameter",
      "additionalType": "File",
      "conformsTo": {"@id": "https://bioschemas.org/profiles/FormalParameter/1.0-RELEASE"},
      "description": "A simple set of lines in a text file",
      "encodingFormat": [
        "text/plain",
        {"@id": "http://edamontology.org/format_2330"}
      ],
      "workExample": {"@id": "inputs/abcdef.txt"},
      "name": "Simple input",
      "valueRequired": "True"
    },
    {
      "@id": "#verbose-param",
      "@type": "FormalParameter",
      "additionalType": "Boolean",
      "conformsTo": {"@id": "https://bioschemas.org/profiles/FormalParameter/1.0-RELEASE"},
      "description": "Increase logging output",
      "workExample": {"@id": "#verbose-pv"},
      "name": "verbose",
      "valueRequired": "False"
    },
    {
      "@id": "#reversed",
      "@type": "FormalParameter",
      "additionalType": "File",
      "conformsTo": {"@id": "https://bioschemas.org/profiles/FormalParameter/1.0-RELEASE"},
      "description": "All the lines, reversed",
      "encodingFormat": [
        "text/plain",
        {"@id": "http://edamontology.org/format_2330"}
      ],
      "name": "Reversed lines",
      "workExample": {"@id": "outputs/tac_on_data_360_1.txt"}
    },
    {
      "@id": "#last_lines",
      "@type": "FormalParameter",
      "additionalType": "File",
      "conformsTo": {"@id": "https://bioschemas.org/profiles/FormalParameter/1.0-RELEASE"},
      "description": "The last lines of workflow input are the first lines of the reversed input",
      "encodingFormat": [
        "text/plain",
        {"@id": "http://edamontology.org/format_2330"}
      ],
      "name": "Last lines",
      "workExample": {"@id": "outputs/Select_first_on_data_1_2.txt"}
    },
    {
      "@id": "https://orcid.org/0000-0001-9842-9718",
      "@type": "Person",
      "name": "Stian Soiland-Reyes"
    },
    {
      "@id": "https://w3id.org/workflowhub/workflow-ro-crate#galaxy",
      "@type": "ComputerLanguage",
      "identifier": "https://galaxyproject.org/",
      "name": "Galaxy",
      "url": "https://galaxyproject.org/"
    },
    {
      "@id": "#wfrun-5a5970ab-4375-444d-9a87-a764a66e3a47",
      "@type": "CreateAction",
      "name": "Galaxy workflow run 5a5970ab-4375-444d-9a87-a764a66e3a47",
      "endTime": "2018-09-19T17:01:07+10:00",
      "instrument": {"@id": "Galaxy-Workflow-Hello_World.ga"},
      "subjectOf": {"@id": "https://usegalaxy.eu/u/5dbf7f05329e49c98b31243b5f35045c/p/invocation-report-a3a1d27edb703e5c"},
      "object": [
        {"@id": "inputs/abcdef.txt"},
        {"@id": "#verbose-pv"}
      ],
      "result": [
        {"@id": "outputs/Select_first_on_data_1_2.txt"},
        {"@id": "outputs/tac_on_data_360_1.txt"}
      ]
    },
    {
      "@id": "inputs/abcdef.txt",
      "@type": "File",
      "description": "Example input, a simple text file",
      "encodingFormat": "text/plain",
      "exampleOfWork": {"@id": "#simple_input"}
    },
    {
      "@id": "#verbose-pv",
      "@type": "PropertyValue",
      "exampleOfWork": {"@id": "#verbose-param"},
      "name": "verbose",
      "value": "True"
    },
    {
      "@id": "outputs/Select_first_on_data_1_2.txt",
      "@type": "File",
      "name": "Select_first_on_data_1_2 (output)",
      "description": "Example output of the last (aka first of reversed) lines",
      "encodingFormat": "text/plain",
      "exampleOfWork": {"@id": "#last_lines"}
    },
    {
      "@id": "outputs/tac_on_data_360_1.txt",
      "@type": "File",
      "name": "tac_on_data_360_1 (output)",
      "description": "Example output of the reversed lines",
      "encodingFormat": "text/plain",
      "exampleOfWork": {"@id": "#reversed"}
    },
    {
      "@id": "https://usegalaxy.eu/u/5dbf7f05329e49c98b31243b5f35045c/p/invocation-report-a3a1d27edb703e5c",
      "@type": "CreativeWork",
      "encodingFormat": "text/html",
      "datePublished": "2021-11-18T02:02:00Z",
      "name": "Workflow Execution Summary of Hello World"
    }
  ]
}
Adding engine-specific traces
Some engines are able to generate contextual information about workflow runs in the form of logs, reports, etc. These are not workflow outputs, but rather additional files automatically generated by the engine, either by default or when activated via a configuration parameter or command line flag. It is RECOMMENDED to add any such files to the RO-Crate; the corresponding entities SHOULD refer to the relevant Action instance via about:
{
  "@id": "#action-1",
  "@type": "CreateAction",
  ...
},
{
  "@id": "trace-20230120-40360336.txt",
  "@type": "File",
  "name": "Nextflow trace for action-1",
  "conformsTo": {"@id": "https://www.nextflow.io/docs/latest/tracing.html#trace-report"},
  "encodingFormat": "text/tab-separated-values",
  "about": {"@id": "#action-1"}
},
{
  "@id": "https://www.nextflow.io/docs/latest/tracing.html#trace-report",
  "@type": "CreativeWork",
  "name": "Nextflow trace report CSV profile"
}
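A tool that exports a crate could build such entities with a small helper. The sketch below is only an illustration: `trace_entity` and its parameters are hypothetical names, and references to other entities are written as `{"@id": ...}` objects, which is how RO-Crate expresses links between entities.

```python
import json


def trace_entity(file_id, action_id, name, encoding_format, profile_id=None):
    """Build a File entity for an engine-generated trace, linked to its action."""
    entity = {
        "@id": file_id,
        "@type": "File",
        "name": name,
        "encodingFormat": encoding_format,
        # "about" points back to the Action instance the trace describes
        "about": {"@id": action_id},
    }
    if profile_id is not None:
        # Optional: the format description the trace file follows
        entity["conformsTo"] = {"@id": profile_id}
    return entity


entity = trace_entity(
    "trace-20230120-40360336.txt",
    "#action-1",
    "Nextflow trace for action-1",
    "text/tab-separated-values",
    profile_id="https://www.nextflow.io/docs/latest/tracing.html#trace-report",
)
print(json.dumps(entity, indent=2))
```

The returned dictionary can be appended to the `@graph` array; the trace file would typically also be listed among the parts of the root Dataset like any other data entity.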
Requirements
This profile inherits the requirements of Process Run Crate and Workflow RO-Crate. In particular, the entity acting as the instrument of the CreateAction MUST be the main workflow. This and other additional specifications are listed below.
| Property | Required? | Description |
|---|---|---|
| Dataset (the root data entity, e.g. "@id": "./") | | |
| conformsTo | MUST | Array MUST reference a CreativeWork entity with an @id URI that is consistent with the versioned Permalink of this document, and SHOULD also reference versioned permalinks for Process Run Crate and Workflow RO-Crate. |
| CreateAction | | |
| instrument | MUST | Identifier of the main workflow, as specified in Workflow RO-Crate. |
| FormalParameter | | |
| workExample | MAY | Identifier of the data entity or PropertyValue instance that realizes this parameter. The data entity or PropertyValue instance SHOULD refer to this parameter via exampleOfWork. |
| additionalType | MUST | SHOULD include: File, Dataset or Collection if it maps to a file, directory or multi-file dataset, respectively; PropertyValue if it maps to a dictionary-like structured value (e.g. a CWL record); DataType or one of its subtypes (e.g. Integer) if it maps to a non-structured value. A more specific type MAY be used instead of File when appropriate (see MediaObject subtypes), e.g. ImageObject. Note that multiple types can apply, e.g. ["File", "http://edamontology.org/data_3671"]. |
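
Two of the requirements in the table, the conformsTo entry on the root Dataset and the presence of additionalType on every FormalParameter, lend themselves to a simple automated check. The following is a rough sketch under stated assumptions: `check_requirements` is a hypothetical name, the root data entity is assumed to have the `@id` `./`, and conformsTo values are assumed to be `{"@id": ...}` references.

```python
PROFILE = "https://w3id.org/ro/wfrun/workflow/0.1"


def as_list(value):
    """Normalize a JSON-LD property value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def check_requirements(graph):
    """Check the root conformsTo and FormalParameter additionalType requirements."""
    by_id = {entity["@id"]: entity for entity in graph}
    problems = []
    root = by_id.get("./", {})
    conforms = {ref["@id"] for ref in as_list(root.get("conformsTo"))}
    if PROFILE not in conforms:
        problems.append("root Dataset: conformsTo lacks the profile permalink")
    elif "CreativeWork" not in as_list(by_id.get(PROFILE, {}).get("@type")):
        problems.append("profile permalink is not described as a CreativeWork")
    for entity in graph:
        is_param = "FormalParameter" in as_list(entity.get("@type"))
        if is_param and not as_list(entity.get("additionalType")):
            problems.append(f"{entity['@id']}: FormalParameter without additionalType")
    return problems


graph = [
    {"@id": "./", "@type": "Dataset", "conformsTo": [{"@id": PROFILE}]},
    {"@id": PROFILE, "@type": "CreativeWork", "name": "Workflow Run Crate", "version": "0.1"},
    {"@id": "#simple_input", "@type": "FormalParameter", "additionalType": "File"},
    {"@id": "#verbose-param", "@type": "FormalParameter", "additionalType": "Boolean"},
]
print(check_requirements(graph))  # prints []
```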