Data entities
Overview
Teaching: 2 min
Exercises: 2 min
Questions
How do I describe the files in my RO-Crate?
Objectives
Understand the purpose of data entities
Learn required properties for data entities
Data entities
One of the main types of resource collected in a Research Object is data. Simplifying, we can consider data to be any kind of file that can be opened in other programs. Data entities are aggregated by the Root Dataset with the hasPart property. In this example we have an array with a single value, a reference to the entity describing the file data.csv.
Referencing external resources
RO-Crates can also contain data entities that are folders and Web resources, as well as non-File data like online databases; see the section on data entities.
We should now be able to follow the @id reference to the corresponding data entity JSON block for our CSV file, which we need to add to the @graph of the RO-Crate Metadata Document.
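Following an @id reference can be sketched in Python with only the standard json module. This is a minimal illustration rather than part of any RO-Crate tooling: the helper name resolve_parts is hypothetical, and the cut-down @graph assumes the Root Dataset has the @id "./" and already lists data.csv under hasPart.

```python
import json

# Hypothetical, cut-down RO-Crate Metadata Document for illustration only.
metadata = json.loads("""
{
  "@graph": [
    {"@id": "./", "@type": "Dataset", "hasPart": [{"@id": "data.csv"}]},
    {"@id": "data.csv", "@type": "File", "name": "Rainfall Katoomba 2022-02"}
  ]
}
""")


def resolve_parts(document, root_id="./"):
    """Return the entities referenced by the root's hasPart, in order."""
    # Index every entity in the flat @graph by its @id.
    by_id = {entity["@id"]: entity for entity in document["@graph"]}
    root = by_id[root_id]
    parts = root.get("hasPart", [])
    # JSON-LD allows a single value instead of a list; normalise to a list.
    if isinstance(parts, dict):
        parts = [parts]
    # References without a matching entity are skipped rather than raising.
    return [by_id[ref["@id"]] for ref in parts if ref["@id"] in by_id]


for entity in resolve_parts(metadata):
    print(entity["@id"], "->", entity.get("name"))
```

A reference that resolves to nothing is a sign that the data entity block still has to be added to the @graph.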
Add a data entity
- Add a declaration for the CSV file as a new entity with @type declared as File.
- Give the file a human-readable name and description to detail it as Rainfall data for Katoomba in NSW Australia, captured February 2022.
- To indicate that this is a CSV file, declare the encodingFormat as the appropriate IANA media type string.
Solution
{
  "@id": "data.csv",
  "@type": "File",
  "name": "Rainfall Katoomba 2022-02",
  "description": "Rainfall data for Katoomba, NSW Australia February 2022",
  "encodingFormat": "text/csv"
},
It is recommended that every entity has a human-readable name; as shown in the above example, this does not need to match the filename/identifier. The encodingFormat indicates the media type of the file so that consumers of the crate can open data.csv in an appropriate program. This can be particularly important for less common file extensions frequently encountered in outputs from research software and instruments.
For more information on describing files and folders, including their recommended and required attributes, see the section on data entities.
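As a rough sketch of how a File entity with an encodingFormat might be produced programmatically, the Python standard library's mimetypes module can guess a media type from a file extension. The helper describe_file is hypothetical and not part of any RO-Crate library. The guess also depends on the platform's media type table, so it should be checked by hand, especially for uncommon research formats.

```python
import mimetypes


def describe_file(path, name, description):
    """Build a File data entity, guessing encodingFormat from the extension."""
    media_type, _ = mimetypes.guess_type(path)
    entity = {
        "@id": path,
        "@type": "File",
        "name": name,
        "description": description,
    }
    # Only record a media type when one could be guessed; uncommon research
    # formats usually need to be stated by hand.
    if media_type is not None:
        entity["encodingFormat"] = media_type
    return entity


entity = describe_file(
    "data.csv",
    "Rainfall Katoomba 2022-02",
    "Rainfall data for Katoomba, NSW Australia February 2022",
)
print(entity.get("encodingFormat"))
```

Leaving encodingFormat out when no guess is available keeps the entity honest: a missing property is easier to spot and fix than a wrong one.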
Override the licence
- Suppose the file content of data.csv is not covered by our overall license (CC0), but by Creative Commons BY-NC-SA 4.0 (which only permits non-commercial use).
- To override, add a license cross-reference property on this particular data entity.
Solution
{
  "@id": "data.csv",
  "@type": "File",
  "name": "Rainfall Katoomba 2022-02",
  "description": "Rainfall data for Katoomba, NSW Australia February 2022",
  "encodingFormat": "text/csv",
  "license": { "@id": "https://creativecommons.org/licenses/by-nc-sa/4.0/" }
},
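One way a consumer of the crate might interpret this override is sketched below in Python. The sketch assumes that an entity without its own license falls back to the license of the Root Dataset; the helper effective_license, the second file notes.txt, and the root @id "./" are all hypothetical, introduced only for illustration.

```python
CC0 = "https://creativecommons.org/publicdomain/zero/1.0/"
BY_NC_SA = "https://creativecommons.org/licenses/by-nc-sa/4.0/"

# Hypothetical, cut-down @graph: the root is CC0, data.csv overrides it,
# and notes.txt (an invented second file) declares no license of its own.
graph = [
    {"@id": "./", "@type": "Dataset", "license": {"@id": CC0},
     "hasPart": [{"@id": "data.csv"}, {"@id": "notes.txt"}]},
    {"@id": "data.csv", "@type": "File", "license": {"@id": BY_NC_SA}},
    {"@id": "notes.txt", "@type": "File"},
]


def effective_license(graph, entity_id, root_id="./"):
    """Return the entity's own license if set, else the root's (assumed rule)."""
    by_id = {entity["@id"]: entity for entity in graph}
    own = by_id[entity_id].get("license")
    chosen = own if own is not None else by_id[root_id].get("license")
    # A license may be a cross-reference object or a plain string.
    if isinstance(chosen, dict):
        return chosen["@id"]
    return chosen


for file_id in ("data.csv", "notes.txt"):
    print(file_id, "->", effective_license(graph, file_id))
```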
Key Points
Data entities are files & folders within the root, as well as external Web references
Recommended properties for files include name and encodingFormat
License can be overridden for particular data entities