Workshop on Research Objects 2019

Peer Review of RO-6

Review 1

Quality of Writing

Is the text easy to follow? Are core concepts defined or referenced? Is it clear what is the author’s contribution?

Research Object / Zenodo

URL for a Research Object or Zenodo record provided?   Guidelines followed?   Open format (e.g. HTML)?   Sufficient metadata, e.g. links to software?   Some form of Data Package provided?   Add text below if you need to clarify your score.

Overall evaluation

Please provide a brief review, including a justification for your scores. Both score and review text are required.

This paper explores several key concepts in the area of research objects and their application to the scientific enterprise, namely reproducibility, replicability and transparency. The authors again highlight that different communities and scientific agencies understand these terms differently, and that this can lead to conflicts when applying the concepts to science in general and when leveraging computational means to support science. Particularly interesting is the terminological clash around replicability and reproducibility, illustrated by the recommendations issued by e.g. the FASEB and NAS. This is especially useful for the interested reader who has not yet participated in these discussions.

According to the paper, many of the properties of research objects can already be seen as supporting scientific transparency (understood as the ability to evaluate scientific work without actually having to repeat the steps taken in the reported research). In this regard, the authors claim that “it is possible to achieve computational repeatability without providing research transparency and vice versa. Moreover, exact repeatability is not an essential element of scientific reproducibility in the broadest sense of the term while transparency arguably is.” The key contribution of this paper is to outline ways in which research objects could support such a notion of transparency, rather than exact computational repeatability, hence the first part of the paper title (Reproducibility by Other Means).

The paper makes for an interesting read, and includes several measures that are proposed to achieve the above-mentioned goals:

  1. The first measure addresses the terminology clash between the terms “reproducibility” and “replicability” by adding explicit namespaced metadata inside the research object, allowing the author to declare in what sense their research object complies with such properties, e.g. in the sense of NAS (NAS::reproducible), FASEB or others.

  2. The second adds statements on the different aspects of computational reproducibility (e.g. a Dockerfile pointing to the right image, software packages available in Ubuntu Apt, etc.) so that everyone is aware of the implications for rerunning, reproducing, or exactly repeating the computations described in the research object under different conditions.

  3. The third enables users who are probably unversed in the research object and PROV specifications to run provenance queries that also take the previous annotations into account, hence allowing queries that check for unambiguous declarations of reproducibility or replicability.
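
To make the first and third measures concrete, here is a minimal sketch of how namespaced declarations could be recorded and then checked. It is not the authors' implementation: only the `NAS::reproducible` label comes from the paper as quoted above, while the `Claim` and `ResearchObject` classes, the `example-ro` identifier, and the `FASEB::reproducible` label are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Claim:
    """A reproducibility term qualified by the body whose definition applies."""

    namespace: str  # e.g. "NAS"
    term: str       # e.g. "reproducible"

    @classmethod
    def parse(cls, text: str) -> Claim:
        # Reject bare terms: the point of the measure is that every
        # declaration names the definition it relies on.
        namespace, sep, term = text.partition("::")
        if not sep or not namespace or not term:
            raise ValueError(f"claim must be namespaced, got {text!r}")
        return cls(namespace, term)


@dataclass
class ResearchObject:
    """Hypothetical stand-in for a research object carrying declarations."""

    identifier: str
    claims: set[Claim] = field(default_factory=set)

    def declare(self, text: str) -> None:
        self.claims.add(Claim.parse(text))

    def satisfies(self, text: str) -> bool:
        # A simple query: was this exact namespaced claim declared?
        return Claim.parse(text) in self.claims


ro = ResearchObject("example-ro")  # hypothetical identifier
ro.declare("NAS::reproducible")

print(ro.satisfies("NAS::reproducible"))    # True
print(ro.satisfies("FASEB::reproducible"))  # False: different definition
```

Keeping the namespace as part of the claim means a query for one body's sense of "reproducible" never silently matches another's, which is the ambiguity the first measure is meant to remove.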

All these proposals seem to have been made on the understanding that a perfect solution to these challenges is impossible, now and in the future. Instead, the authors bet on flexible and extensible means of incorporating new needs as they appear and of supporting change, e.g. an eventual future in which Docker is superseded by another option as the dominant containerization solution.

In summary, this paper first emphasizes the differences between (the different understandings of) scientific reproducibility and computational reproducibility. It then raises interesting questions about how we should approach the reproducibility challenge in order to endorse the reuse of third-party scientific work with increasing confidence. Finally, it provides a vision of how research objects can help to address such challenges, hinting at requirements that research objects would need to fulfill in that direction.

I think all these points will raise interesting discussions at the workshop, which will hopefully help the research object community build consensus on common strategies in this regard and on their corresponding action plans.

Review 2

Quality of Writing

Is the text easy to follow? Are core concepts defined or referenced? Is it clear what is the author’s contribution?

Research Object / Zenodo

URL for a Research Object or Zenodo record provided?   Guidelines followed?   Open format (e.g. HTML)?   Sufficient metadata, e.g. links to software?   Some form of Data Package provided?   Add text below if you need to clarify your score.

Overall evaluation

Please provide a brief review, including a justification for your scores. Both score and review text are required.

This article advocates for a focus on transparency in the field of reproducibility, leveraging Research Objects (ROs) to allow alternate vocabularies for terms as well as support the various requirements scientists have for the reproducibility of their work. The article begins by describing the differences in how scientists view reproducibility in different contexts, and suggests exact repeatability as a new area enabled by computational resources. Using the recent FASEB and NAS recommendations and definitions, the article points out specific differences between the use of the terms replicability and reproducibility. Beyond using mediation to help those with different definitions understand each other, the article suggests that tools that create ROs should help users both satisfy and probe the reproducibility of those objects. Finally, the article notes the importance of querying provenance contained in an RO, and suggests such queries should be user-friendly.

In general, I think this is a well-written article that presents some interesting ways reproducibility could be more integrated with Research Objects. However, I think the article would be improved by focusing less on the differences in definitions and more on ideas that seek to promote transparent reproducibility in ROs (e.g. probing existing ROs for potential reproducibility issues).