Peer Review of RO-6

Author: Timothy McPhillips, Bertram Ludäscher
Title: Reproducibility by Other Means: Transparent Research Objects
Submission: https://doi.org/10.5281/zenodo.3270620
- Camera-ready: https://doi.org/10.5281/zenodo.3382423
Submitted as: RO2019 article
Decision: ACCEPT paper (accepted for proceeding and oral presentation)

Review 1

Reviewer: José Manuel Gómez Pérez https://orcid.org/0000-0002-5491-6431

Quality of Writing

Is the text easy to follow? Are core concepts defined or referenced? Is it clear what is the author’s contribution?

good

Research Object / Zenodo

URL for a Research Object or Zenodo record provided? Guidelines followed? Open format (e.g. HTML)? Sufficient metadata, e.g. links to software? Some form of Data Package provided? Add text below if you need to clarify your score.

none (e.g. only abstract in easychair/github)

Overall evaluation

Please provide a brief review, including a justification for your scores. Both score and review text are required.

This paper explores several key concepts in the area of research objects and their application to the scientific enterprise, namely reproducibility, replicability and transparency. The authors raise once again the different understanding of these terms by the different communities and scientific agencies and how this can lead to conflicts when trying to apply these concepts in science in general and while leveraging computational means to support science. Particularly interesting is the terminological clash around replicability and reproducibility illustrated by the recommendations issued e.g. by the FASEB and NAS. This is useful specially for the interested reader who has not participated in these discussions yet.

According to the paper, many of the properties of research objects already can be seen as supporting scientific transparency (understood as the ability to evaluate scientific work without actually having to repeat the steps taken in the reported research). In this regard, the authors claim that “it is possible to achieve computational repeatability without providing research transparency and vice versa. Moreover, exact repeatability is not an essential element of scientific reproducibility in the broadest sense of the term while transparency arguably is.”. The key contribution of this paper is to outline ways in which research object could support such notion of transparency, rather than exact computational repeatability, hence the first part of the paper title (Reproducibility by Other Means).

The paper makes for an interesting read, and includes several measures that are proposed to achieve the above-mentioned goals:

The first measure proposes to address the terminology clash between terms “reproducibility” and “replicability” by adding explicit namespaced metadata inside the research object that allows the author to declare in what sense their research object complies with such properties, e.g. in the sense of NAS (NAS::reproducible), FASEB or others.
The second one is by adding additional statements on the different aspects related to computational reproducibility (e.g. Dockerfile pointing to the right image, software packages available in Ubuntu Apt, etc.) so that everyone is aware of the implications for the possibilities of rerunning, reproducing, or exactly repeating the computations described in the research object under different conditions.
The third one is by enabling users who are probably unversed in the research object and PROV specifications to run provenance queries that also observe the previous annotations, hence allowing queries that check for unambiguous declarations of reproducibility or replicability.

All such proposals seem to have been made in the understanding that it is impossible to achieve a perfect solution to these challenges that will remain now and in the future. On the contrary, the bet is made on flexible and expandable means to incorporate the new needs as they appear, supporting change, e.g. an eventual future in which Docker is superseded by another option as the hegemonic containerization solution.

In summary, this paper first emphasizes the differences between (the different understandings of) scientific reproducibility and computational reproducibility. Then it raises interesting questions related to how we should consider the reproducibility challenge in order to endorse reuse of third party scientific work with increasing confidence. Finally, it provides a vision of how research objects can help to address such challenges, giving hints of requirements that research objects would need to fulfill in that direction.

I think all these points will raise interesting discussions at the workshop which hopefully will contribute to consensuate common strategies in this regard and their corresponding action plans by the research object community.

Review 2

Reviewer: (anonymous)

Quality of Writing

Is the text easy to follow? Are core concepts defined or referenced? Is it clear what is the author’s contribution?

(delete as appropriate)

excellent

Research Object / Zenodo

basic (e.g. Zenodo with PDF and minimal metadata)

Overall evaluation

Please provide a brief review, including a justification for your scores. Both score and review text are required.

weak accept

This article advocates for a focus on transparency in the field of reproducibility, leveraging Research Objects (ROs) to allow alternate vocabularies for terms as well as support the various requirements scientists have for the reproducibility of their work. The article begins by describing the differences in how scientists view reproducibility in different contexts, and suggests exact repeatability as a new area enabled by computational resources. Using the recent FASEB and NAS recommendations and definitions, the article points out specific differences between the use of the terms replicability and reproducibility. Beyond using mediation to help those with different definitions understand each other, the article suggests that tools that create ROs should help users both satisfy and probe the reproducibility of those objects. Finally, the article notes the importance of querying provenance contained in an RO, and suggests such queries should be user-friendly.

I like the idea that tools help inform users about how an RO supports reproducibility, and the limits it may have. Something that warns a user about having a Dockerfile that references a base image without a version seems very useful.
I think the idea of making understandable what is required to reproduce work is important, even if a researcher is not going to re-execute that work.
Is there a good citation for the definition of “transparent”? It may be helpful in the definition in Section 3.
The NAS report, leaning on Barba’s study of the terms, goes into detail about the history of the use of the terms, and describes how different communities (including FASEB) have used the term (pp. 33-36). It also discusses the importance of transparency in line with this article, and states that computational scientists generally define reproducibility as whether the the research “can be checked” (as noted in point 3 in the article). The article spends a decent amount of text exploring the discrepancies between the FASEB and NAS definitions, seeming to argue for more deference to the FASEB perspective. I would suggest more focus on the RO solutions in the text.

From my perspective, the term reproducibility seems to have been historically associated with the computational context so I understand why the NAS report continues that association. I understand that science has long been interested in replicated experiments, but I am unaware of a specific historical definition of reproducibility outside of the computational context. I understand the issues when communities use terms in different ways, and namespaces do offer a technical solution to this problem. However, it’s not clear that keeping both definitions aids in human understanding and communication.
I disagree with the assertion that transparency and re-executability are orthogonal. They can be (e.g. detailed provenance vs. black-box execution), but I would argue that they are more likely aligned (a script that specifies inputs/outputs and intermediate steps can aid both).
It looks like reference [26] is missing some information.
I would recommend that Section 2 be combined into Section 1.

In general, I think this is a well-written article that presents some interesting ways reproducibility could be more integrated with Research Objects. However, I think the article would be improved by focusing less on the differences in definitions and more on ideas that seek to promote transparent reproducibility in ROs (e.g. probing existing ROs for potential reproducibility issues).