An updated version of this guide is now available here.
This guide (V20161020) was prepared by Grigori Fursin and Bruce Childers, with contributions from Michael Heroux, Michela Taufer and other colleagues, to help you describe and submit your artifacts for evaluation across a range of CS conferences and journals.
It gradually evolves based on our long-term vision (TRUST'14@PLDI'14 and DATE'16) and your feedback after our past Artifact Evaluations (see the AE CGO-PPoPP'17 discussions). It should also help you prepare your artifacts for a possible public release, if you plan to do so (for example, as auxiliary material in a Digital Library or on your personal web page).
We aim to make artifact submission as simple as possible.
You just need to pack your artifact (code and data) using any publicly available tool you prefer. In exceptional cases when rare hardware or proprietary software is used, you can arrange remote access to a machine with the pre-installed software. Then, you need to prepare a small and informal Artifact Evaluation appendix using our AE LaTeX template (common for the PPoPP, CGO and PACT conferences - see below) to explain to evaluators what your artifacts are and how to use them (you will be allowed to add up to 2 pages of this appendix to your final camera-ready paper).
Please see this PPoPP'16 paper for an example of such an AE appendix.
At least two reviewers will follow your guide to replicate your results (for example, an exact output match) or reproduce them (for example, varying performance numbers or scalability on a different machine), and will then send you a report with the following overall assessment of your artifact, based on our reviewing guidelines:
- significantly exceeded expectations
- exceeded expectations
- met expectations
- fell below expectations
- significantly fell below expectations
where a score of "met expectations" or above means that your artifact successfully passed evaluation and will receive a stamp of approval (added to the paper itself).
The highest-ranked artifact (usually not only reproducible but also customizable and reusable) will also receive a "distinguished artifact" award.
This section is also used as a discussion forum with the community
about how to improve AE.
Since our eventual goal is to promote artifact validation and sharing (rather than naming and shaming problematic artifacts), you will be able to address any raised issues during the rebuttal. Furthermore, we allow a small amount of communication between reviewers and authors whenever there are installation or usage problems. In such cases, the AE chairs will serve as a proxy to avoid revealing the reviewers' identities (the review is blind, i.e. your identity is known to the reviewers since your paper is already accepted, but not vice versa).
You just need to perform the following 4 steps to submit your artifact:
- Pack your artifact (code and data) or provide easy access to it using any publicly available and free tool you prefer or strictly require. For example, you can use one of the following:
- VirtualBox to pack all code and data, including the OS (typical images are around 2-3 GB; we strongly recommend avoiding images larger than 10 GB).
- Docker to pack only the code and data touched during the experiment.
- A standard zip or tar archive with all related code and data, particularly when the artifact should be rebuilt on a reviewer's machine (for example, to have non-virtualized access to specific hardware); see the packing sketch after these steps.
- A private or public Git or SVN repository.
- Remote access to a machine with the pre-installed software (an exceptional case when rare hardware or proprietary software is used, or when the VM image is too large) - you will need to send the access information privately to the AE chairs. Also, please avoid making any changes to the remote machine during evaluation (unless explicitly agreed with the AE chairs) - you can make such changes during the rebuttal phase, if needed!
- Check other tools which can be useful for artifact and workflow sharing.
From our past Artifact Evaluation experience, we have noticed that the most challenging part is to automate and customize experimental workflows. It is even harder if you need to validate experiments using the latest software environment and hardware (rather than quickly outdated VM and Docker images). Most of the time, ad-hoc scripts are used to implement these workflows; they are very difficult to change and customize, particularly when an evaluator would like to try other compilers, libraries and data sets.
These problems motivated us to develop the Collective Knowledge framework (CK) - a small, portable and open-source infrastructure and repository that helps researchers quickly prototype and share their experimental workflows, with all related artifacts, as reusable Python components with a unified JSON API and JSON meta descriptions. CK supports Linux, macOS, Windows and Android, and reduces the burden on researchers and evaluators by automatically detecting and resolving all required software dependencies across diverse hardware, unifying autotuning, statistical analysis and predictive analytics (via scikit-learn, R, etc.), and enabling interactive reports.
Please check out how ARM uses CK to crowdsource benchmarking of real workloads, how General Motors uses it to collaboratively optimize the Caffe framework, and how Imperial College London uses it to crowdsource compiler bug detection (see the PLDI'15 artifacts in the CK format).
If you would like to share your artifacts in the reusable CK format, please check the Getting Started Guide, CK portable workflows and the list of already shared CK plugins; a minimal sketch of CK's unified Python API is shown after these steps.
- Write a brief artifact abstract that informally describes your artifact, including minimal hardware and software requirements, how it supports your paper, how it can be validated, and what the expected results are. It will be used to select appropriate reviewers.
- Fill in the AE template (download here) and append it to the PDF of your accepted paper. Though it should be relatively intuitive, you can check out extra notes about this template based on our past AE experience.
- Submit the artifact abstract and the new PDF via the AE EasyChair website.
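If you decide to script the packing step instead of creating the archive by hand, the following minimal Python sketch shows one possible way to do it with the standard library only; the directory name my-artifact and the output name my-artifact-pack are placeholders for illustration.

    import shutil

    # Hypothetical layout: ./my-artifact contains the code, data and README
    # needed to reproduce the experiments from the paper.
    archive = shutil.make_archive(
        base_name="my-artifact-pack",  # output name (extension added automatically)
        format="gztar",                # produces my-artifact-pack.tar.gz
        root_dir="my-artifact"         # directory to pack
    )
    print("Created", archive)

The same call with format="zip" produces a zip archive instead; whichever format you choose, please double-check that the unpacked copy still builds and runs before submission.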
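For those considering the CK route, CK exposes its functionality through a single Python entry point that takes and returns JSON-compatible dictionaries (the unified JSON API mentioned above). The sketch below only illustrates this calling convention; the particular action ('list') and module ('program') are assumptions based on CK's command line (ck list program) and may differ in your CK version, so treat it as a sketch rather than a definitive recipe.

    import ck.kernel as ck

    # Every CK action is invoked via ck.access() with a dictionary
    # describing the request; the result is again a dictionary.
    # 'list' and 'program' are illustrative values - consult the CK
    # documentation for the actions available in your installation.
    r = ck.access({'action': 'list', 'module_uoa': 'program'})

    if r['return'] > 0:
        # A non-zero 'return' code signals an error; 'error' holds the message.
        print('CK error:', r.get('error', ''))
    else:
        for entry in r.get('lst', []):
            print(entry.get('data_uoa', ''))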
If you encounter problems, find any ambiguities or have any questions, do not hesitate to contact the AE chairs of your conference or the AE steering committee!
You can now add the following stamp to the final camera-ready version of your paper. While there are no strict formatting rules for the stamp, please add it anywhere close to the title. For example, see this PPoPP'15 article together with this LaTeX example. You can change the \hspace and \raisebox parameters to better fit the stamp to your paper.
We strongly encourage you to submit your AE appendix (up to 2 pages) as auxiliary material for the Digital Library (while removing all unnecessary or confidential information) along with the final version of your paper. This will help readers better understand what was evaluated.
Though you are not obliged to publicly release your artifacts (in fact, it is sometimes impossible due to various limitations), we also strongly encourage you to share them with the community (even if they are not open-source). You can release them as auxiliary material in Digital Libraries together with your AE appendix, or use your institutional repository and various public services for code and data sharing.
Even accepted artifacts may have some unforeseen behavior and limitations discovered during evaluation. Now you have a chance to add related notes to your paper as future work (if you wish). Examples of papers with evaluated artifacts include:
- "Gunrock: A High-Performance Graph Processing Library on the GPU", PPoPP 2016 (PDF with AE appendix and GitHub)
- "Integrating algorithmic parameters into benchmarking and design space exploration in dense 3D scene understanding", PACT 2016 (example of interactive graphs and artifacts in the Collective Knowledge format)
- "GEMMbench: a framework for reproducible and collaborative benchmarking of matrix multiplication", ADAPT 2016 (example of a CK-powered artifact reviewed and validated by the community via Reddit)
- "Polymer: A NUMA-aware Graph-structured Analytics Framework", PPoPP 2015 (GitHub and personal web page)
- "A graph-based higher-order intermediate representation", CGO 2015 (GitHub)
- "MemorySanitizer: fast detector of uninitialized memory use in C++", CGO 2015 (added to LLVM)
- "Predicate RCU: an RCU for scalable concurrent updates", PPoPP 2015 (BitBucket)
- "Low-Overhead Software Transactional Memory with Progress Guarantees and Strong Semantics", PPoPP 2015 (SourceForge and Jikes RVM)
- "More than You Ever Wanted to Know about Synchronization", PPoPP 2015 (GitHub)
- "Roofline-aware DVFS for GPUs", ADAPT 2014 (ACM DL, Collective Knowledge repository)
- "Many-Core Compiler Fuzzing", PLDI 2015 (example of an artifact with a CK-based experimental workflow and live results)