Reproducibility

Enabling collaborative and reproducible computer systems research with an open publication model

This wiki is maintained by the non-profit cTuning foundation.

News and upcoming events

We have released our new, open-source, BSD-licensed Collective Knowledge Framework (cTuning 4 aka CK) for collaborative and reproducible R&D: see GitHub, the online documentation and the live demo repository.
PPoPP'16 artifact evaluation
CGO'16 artifact evaluation
ADAPT'16 @ HiPEAC'16 - features our open publication model with community-driven reviewing, reddit-based discussions and artifact evaluation
Dagstuhl perspective workshop on artifact evaluation for conferences and journals

Motivation

Since 2006 we have been trying to solve problems with reproducibility of experimental results in computer engineering as a side effect of our MILEPOST, cTuning.org, Collective Mind and Collective Knowledge projects (speeding up optimization, benchmarking and co-design of computer systems using auto-tuning, big data, predictive analytics and crowdsourcing). We focus on the following technological and social aspects to enable collaborative, systematic and reproducible research and experimentation, particularly related to benchmarking, optimization and co-design of faster, smaller, cheaper, more power-efficient and more reliable software and hardware:

developing public and open source Collective Mind repositories of knowledge (see our pilot live repository [CK, cMind] and our vision papers [1,2]);
developing collaborative research and experimentation infrastructure that can share artifacts as reusable components together with the whole experimental setups (see our papers [1,2]);
evangelizing and enabling a new open publication model for online workshops, conferences and journals (see our proposal [arXiv, ACM DL]);
setting up and improving the procedure for sharing and evaluating experimental results and all related material for workshops, conferences and journals (see our proposal [arXiv, ACM DL]);
improving sharing, description of dependencies, and statistical reproducibility of experimental results and related material.

See our manifesto and history here.

Our R&D

Together with the community and the not-for-profit cTuning foundation we are working on the following topics:

developing tools and methodology to capture, preserve, formalize, systematize, exchange and improve knowledge and experimental results including negative ones
describing and cataloging whole experimental setups with all related material including algorithms, benchmarks, codelets, datasets, tools, models and any other artifact
developing a specification to preserve experiments including all software and hardware dependencies
dealing with variability and rising amount of experimental data using statistical analysis, data mining, predictive modeling and other techniques
developing new predictive analytics techniques to explore large design and optimization spaces
validating and verifying experimental results by the community
developing common research interfaces for existing or new tools
developing common experimental frameworks and repositories (enable automation, re-execution and sharing of experiments)
sharing rare hardware and computational resources for experimental validation
implementing previously published experimental scenarios (auto-tuning, run-time adaptation) using common infrastructure
implementing open access to publications and data (particularly discussing intellectual property (IP) and legal issues)
speeding up analysis of "big" experimental data
developing new (interactive) visualization techniques for "big" experimental data
enabling interactive articles

Our events

PPoPP'15 artifact evaluation
CGO'15 artifact evaluation
ADAPT'15 @ HiPEAC'15 - workshop on adaptive self-tuning computer systems
ADAPT'14 @ HiPEAC'14 - workshop on adaptive self-tuning computer systems [ program and publications ]
Special journal issue on Reproducible Research Methodologies at IEEE TETC
ACM SIGPLAN TRUST'14 @ PLDI'14
REPRODUCE'14 @ HPCA'14
ADAPT'14 panel @ HiPEAC'14
HiPEAC'13 CSW thematic session @ ACM ECRC "Making computer engineering a science"
HiPEAC'12 CSW thematic session
ASPLOS/EXADAPT'12 panel @ ASPLOS'12
cTuning lectures (2008-2010)
GCC Summit'09 discussion

Paper and artifact evaluation committee

Rather than pre-selecting a dedicated committee for conferences, we select reviewers for research material (artifacts) and publications from a pool of our supporters based on submitted and publicly available publications, their keywords and public discussions, as described in our proposal [arXiv, ACM DL]. Validated papers receive a "Validated by the community" stamp. Artifacts can be shared along with the publication in the ACM Digital Library, HAL, the Collective Mind Repository or any other public archive.

For workshops, conferences and journals with the traditional publication model (CGO, PPoPP, PLDI), we select an artifact evaluation committee (AEC) as described here.

Packing and sharing research and experimental material

Rather than enforcing a specific procedure for packing, sharing and validating experimental results, we allow authors of accepted papers to include an archive with all related research material (created with any publicly available tool) and a readme.txt file describing how to validate their experiments. The main reason is the lack of a universally acceptable solution for packing and sharing experimental setups. For example, it is not always possible to use virtual machines and similar approaches for our research on performance/energy tuning, or when some new hardware is being co-designed, as we discuss in our proposal [arXiv, ACM DL]. Therefore, our current intention is to gradually and collaboratively find the best procedure for packing, using practical experience from our events such as the ADAPT workshop and from common discussions during the ACM SIGPLAN TRUST'14 workshop. See also the guidelines for packing code and data along with publications here.
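
As an illustration only, packing an artifact directory that contains a readme.txt could look like the following minimal Python sketch. The function names pack_artifact and list_artifact, the tar.gz format and the SHA-256 checksum are assumptions made for this example; they are not a procedure prescribed by our events, and any publicly available tool can be used instead.

```python
import hashlib
import tarfile
from pathlib import Path


def pack_artifact(src_dir, archive_path):
    """Pack src_dir into a .tar.gz archive and return the archive's SHA-256 digest.

    Hypothetical helper for illustration: it refuses to pack a directory
    that has no readme.txt describing how to validate the experiments.
    """
    src = Path(src_dir).resolve()
    if not (src / "readme.txt").is_file():
        raise FileNotFoundError("readme.txt describing the validation steps is required")
    archive = Path(archive_path)
    with tarfile.open(archive, "w:gz") as tar:
        # Keep everything under one top-level folder named after the source directory.
        tar.add(src, arcname=src.name)
    # The digest lets reviewers check that they received the same archive.
    return hashlib.sha256(archive.read_bytes()).hexdigest()


def list_artifact(archive_path):
    """Return the sorted member names of the archive for a quick inspection."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

The archive should be written outside the directory being packed so that it does not end up containing itself. Publishing the returned digest next to the archive is one simple way to let volunteers confirm they are validating the same material.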

Validation

After many years of evangelizing collaborative and reproducible research in computer engineering based on our practical experience, we are finally starting to see a change of mentality in academia, industry and funding agencies. Authors of two of the nine papers accepted at our ADAPT'14 workshop agreed to have the experimental results of their papers validated by volunteers. Note that rather than enforcing specific validation rules, we decided to ask authors to pack all their research artifacts as they wish (for example, using a shared virtual machine or a standard archive) and to describe their own validation procedure. Thanks to our volunteers, experiments from these papers have been validated, the archives shared in our public repository, and the papers marked with a "Validated by the community" stamp.

Resources

Discussions

LinkedIn group on reproducible research
Main mailing list (general collaborative and reproducible R&D in computer engineering)
cTuning foundation mailing list (collaborative and reproducible hardware and software benchmarking, auto-tuning and co-design)

Acknowledgments

We would like to thank our colleagues from the cTuning foundation, dividiti, artifact-eval.org and the OCCAM project for their help, feedback, participation and support.
