This document (V20170414) provides guidelines for reviewing artifacts. It is gradually evolving to define common evaluation criteria based on our past Artifact Evaluations, the ACM reviewing and badging policy that we co-authored in 2016, and your feedback (2017a, 2017b, 2014).
During the rebuttal (technical clarification phase), authors will be able to address the issues raised and respond to reviewers. Finally, reviewers will check whether those issues have been fixed and will provide the final report. Based on all reviews, the AE chairs will make the following final assessment of the submitted artifact:
+1) exceeded expectations
0) met expectations (or inapplicable)
-1) fell below expectations
where a score of "met expectations" or above means that a reviewer managed to evaluate the artifact, possibly with minor problems that the reviewer was able to solve without the authors' assistance. Such an artifact passes evaluation and receives a stamp of approval.
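The scoring scale above can be sketched in a few lines of Python. This is only an illustration: the guidelines do not say how the AE chairs combine individual reviews, so the `final_assessment` helper below assumes, hypothetically, that the low median of the reviewers' scores is used. The names `Score`, `final_assessment`, and `passes_evaluation` are our own and do not come from any AE tooling.

```python
from enum import IntEnum
from statistics import median_low


class Score(IntEnum):
    """Three-level scale used in the final assessment."""
    FELL_BELOW = -1   # fell below expectations
    MET = 0           # met expectations (or inapplicable)
    EXCEEDED = 1      # exceeded expectations


def final_assessment(reviewer_scores):
    """Combine reviewer scores into a single score.

    ASSUMPTION: the guidelines leave the combination rule to the AE
    chairs; the low median is used here purely for illustration.
    """
    return Score(median_low(reviewer_scores))


def passes_evaluation(score):
    """An artifact passes if it met or exceeded expectations."""
    return score >= Score.MET


scores = [Score.MET, Score.EXCEEDED, Score.FELL_BELOW]
final = final_assessment(scores)
print(final.name, passes_evaluation(final))  # prints: MET True
```

Using an integer enumeration keeps the ordering of the scale explicit, so the pass condition is a single comparison against "met expectations".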
Note that our goal is not to fail problematic artifacts but to promote reproducible research via artifact validation and sharing. Therefore, we allow light communication between reviewers and authors whenever there are installation or usage problems. In such cases, AE chairs serve as a proxy to avoid revealing reviewers' identities.
|Criteria||Score||Badges for ACM conferences||Badges for non-ACM conferences
|Artifacts publicly available?||Are all artifacts related to this paper publicly available?
Note that it is not obligatory to make artifacts publicly available!
The author-created artifacts relevant to this paper will receive the ACM "artifact available" badge only if they have been placed in a publicly accessible archival repository. A DOI or link to this repository, along with a unique identifier for the object, must be provided.
Notes: ACM does not mandate the use of specific repositories. Publisher repositories (such as the ACM Digital Library), institutional repositories, or open commercial repositories (e.g., figshare or Dryad) are acceptable. In all cases, repositories used to archive data should have a declared plan to enable permanent accessibility. Personal web pages are not acceptable for this purpose.
Artifacts do not need to have been formally evaluated in order for an article to receive this badge. In addition, they need not be complete in the sense described above. They simply need to be relevant to the study and add value beyond the text in the article. Such artifacts could be something as simple as the data from which the figures are drawn, or as complex as a complete software system under study.
|Artifacts functional?||Package complete?||
Are all components relevant to the evaluation included in the package?
Note that proprietary artifacts need not be included. If they are required to exercise the package then this should be documented, along with instructions on how to obtain them. Proxies for proprietary data should be included so as to demonstrate the analysis.
The artifacts associated with the paper will receive the "Artifacts Evaluated - Functional" badge only if they are found to be documented, consistent, complete, exercisable, and include appropriate evidence of verification and validation.
|Well documented?||Enough to understand, install and evaluate the artifact?|
|Exercisable?||Includes scripts and/or software to perform appropriate experiments and generate results?|
|Consistent?||Artifacts are relevant to the associated paper and contribute in some inherent way to the generation of its main results?|
|Artifacts customizable and reusable?||
Can this artifact and experimental workflow be easily reused and customized? For example, can it be used on a different platform, with different benchmarks, data sets, compilers, tools, under different conditions and parameters, etc.?
Note that this is optional and used only to select a distinguished artifact. For example, we encourage the use of common workflow frameworks with unified APIs and data formats for computer systems research (such as the Collective Knowledge workflow framework) and give a special prize for such artifacts.
The artifacts associated with the paper will receive the "Artifacts Evaluated - Reusable" badge only if they are of a quality that significantly exceeds minimal functionality. That is, they have all the qualities of the Artifacts Evaluated - Functional level, but, in addition, they are very carefully documented and well-structured to the extent that reuse and repurposing are facilitated. In particular, norms and standards of the research community for artifacts of this type are strictly adhered to.
Can all main results from the paper be validated using the provided artifacts?
Report any unexpected artifact behavior. What counts as unexpected depends on the type of artifact; examples include unexpected output, scalability issues, crashes, and performance variation.
The artifacts associated with the paper will receive the "Results Replicated" badge only if the main results of the paper have been obtained in a subsequent study by a person or team other than the authors, using, in part, artifacts provided by the author.
Note that variation of empirical and numerical results is tolerated. In fact, it is often unavoidable in computer systems research - see "how to report and compare empirical results?" in the AE FAQ!
Did artifacts and results match authors' description?
Tell the AE committee if you would like to nominate this artifact for the distinguished artifact award.
|Artifacts that successfully pass evaluation receive a stamp of approval.
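The note in the criteria above, that variation of empirical and numerical results is tolerated, can be made concrete with a small check. The sketch below is not part of the AE procedure: the function name `within_tolerance`, the 10% default tolerance, and the sample timings are all hypothetical, and a reviewer would choose a tolerance suited to the experiment at hand.

```python
from statistics import mean, stdev


def within_tolerance(reported, measurements, rel_tol=0.10):
    """Return True if the mean of repeated measurements lies within
    rel_tol (relative) of the value reported in the paper.

    ASSUMPTION: a relative tolerance on the mean is just one simple
    way to compare results; it is not prescribed by the guidelines.
    """
    if not measurements:
        raise ValueError("need at least one measurement")
    observed = mean(measurements)
    return abs(observed - reported) <= rel_tol * abs(reported)


# Hypothetical execution times (seconds) from five repeated runs.
runs = [10.2, 9.8, 10.5, 10.1, 9.9]
print(f"mean={mean(runs):.2f} stdev={stdev(runs):.2f}")
print(within_tolerance(reported=10.0, measurements=runs))
```

Reporting the spread of repeated runs alongside the mean, rather than a single number, lets the reviewer and the authors judge whether a difference is within normal variation.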