Reproducibility

Enabling collaborative, systematic and reproducible research and experimentation with an open publication model in computer engineering

This wiki is maintained by cTuning foundation. If you would like to help or make corrections, please get in touch with Grigori Fursin.

1 Motivation
- 1.1 Community-driven research and developments
2 Our interdisciplinary events
- 2.1 Featuring new open publication model and validation of experimental results
- 2.2 Discussing technical aspects to enable reproducibility and open publication model
3 Reproducible Research Committee
- 3.1 Steering committee
- 3.2 Artifact evaluation / program committee
4 Packing and sharing research and experimental material
5 History and manifesto
6 Validation
7 Archive
8 Links
9 Follow us

Motivation

Since 2006 we have been trying to solve problems with the reproducibility of experimental results in computer engineering as a side effect of our MILEPOST, cTuning.org and Collective Mind projects (speeding up optimization, benchmarking and co-design of computer systems using auto-tuning, big data, predictive analytics and crowdsourcing). We focus on the following technological and social aspects to enable collaborative, systematic and reproducible research and experimentation, particularly related to benchmarking, optimization and co-design of faster, smaller, cheaper, more power-efficient and more reliable software and hardware:

developing public and open source repositories of knowledge including Collective Mind;
developing collaborative research and experimentation infrastructure that can share the whole experimental setups with all software and hardware dependencies;
evangelizing and enabling a new open publication model for online workshops, conferences and journals (see our proposal [arXiv, ACM DL]);
setting up and improving the procedure for sharing and evaluating experimental results and all related material for workshops, conferences and journals (see our proposal [arXiv, ACM DL]);
improving sharing, description of dependencies, and statistical reproducibility of experimental results and related material.

See our manifesto and history here.

Community-driven research and developments

Together with the community and cTuning foundation we are working on the following topics:

developing tools and methodology to capture, preserve, formalize, systematize, exchange and improve knowledge and experimental results including negative ones
describing and cataloging whole experimental setups with all related material including algorithms, benchmarks, codelets, datasets, tools, models and any other artifact
developing a specification to preserve experiments including all software and hardware dependencies
dealing with variability and the rising amount of experimental data using statistical analysis, data mining, predictive modeling and other techniques
developing new predictive analytics techniques to explore large design and optimization spaces
validating and verifying experimental results by the community
developing common research interfaces for existing or new tools
developing common experimental frameworks and repositories (enable automation, re-execution and sharing of experiments)
sharing rare hardware and computational resources for experimental validation
implementing previously published experimental scenarios (auto-tuning, run-time adaptation) using common infrastructure
implementing open access to publications and data (particularly discussing intellectual property (IP) and legal issues)
speeding up analysis of "big" experimental data
developing new (interactive) visualization techniques for "big" experimental data
enabling interactive articles

Our interdisciplinary events

Featuring new open publication model and validation of experimental results

ADAPT'15 - workshop on adaptive self-tuning computer systems. It is currently under submission and will likely be co-located with HiPEAC'15.
ADAPT'14 - workshop on adaptive self-tuning computer systems [ program and publications ]

Discussing technical aspects to enable reproducibility and open publication model

Special journal issue on Reproducible Research Methodologies at IEEE TETC
ACM SIGPLAN TRUST'14 @ PLDI'14
REPRODUCE'14 @ HPCA'14
ADAPT'14 panel @ HiPEAC'14
HiPEAC'13 CSW thematic session @ ACM ECRC "Making computer engineering a science"
HiPEAC'12 CSW thematic session
ASPLOS/EXADAPT'12 panel @ ASPLOS'12
cTuning lectures (2008-2010)
GCC Summit'09 discussion

Reproducible Research Committee

Steering committee

Grigori Fursin, cTuning foundation and INRIA, France (evangelist of collaborative and reproducible research and experimentation in computer engineering)
Christophe Dubach, University of Edinburgh, UK (co-organizer of the ADAPT workshop)
Our colleagues and collaborators from AEC
Our colleagues and collaborators from OCCAM project

Artifact evaluation / program committee

Rather than pre-selecting a dedicated committee for conferences, we select reviewers for research material (artifacts) and publications from a pool of our supporters, based on the submitted publications and their keywords, as discussed in our vision paper on a new publication model [arXiv, ACM DL].
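The keyword-based selection described above could be sketched roughly as follows. This is only an illustration, not the procedure from the vision paper: the function name `rank_reviewers`, the dictionary layout of the reviewer pool, and the simple overlap score are all assumptions made for this example.

```python
def rank_reviewers(paper_keywords, reviewer_pool, top_n=3):
    """Rank reviewers by how many keywords they share with a paper.

    paper_keywords: iterable of keyword strings from the submission.
    reviewer_pool:  hypothetical mapping {reviewer name: list of keywords}.
    """
    paper = {k.lower() for k in paper_keywords}
    scored = []
    for name, keywords in reviewer_pool.items():
        overlap = paper & {k.lower() for k in keywords}
        if overlap:  # skip reviewers with no matching keyword
            scored.append((len(overlap), name))
    # Highest overlap first; ties are broken alphabetically by name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:top_n]]


if __name__ == "__main__":
    # Made-up pool, for illustration only.
    pool = {
        "alice": ["auto-tuning", "compilers"],
        "bob": ["machine learning"],
        "carol": ["Auto-Tuning", "machine learning", "energy"],
    }
    print(rank_reviewers(["auto-tuning", "machine learning"], pool))
```

A real selection would also need to handle conflicts of interest and reviewer load, which this sketch ignores.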

Packing and sharing research and experimental material

Rather than enforcing a specific procedure for packing, sharing and validating experimental results, we allow authors of accepted papers to include an archive with all related research material (using any publicly available tool) and a readme.txt file describing how to validate their experiments. The main reason is the lack of a universally acceptable solution for packing and sharing experimental setups. For example, it is not always possible to use virtual machines and similar approaches for our research on performance/energy tuning, or when new hardware is being co-designed, as we discuss in our proposal [arXiv, ACM DL]. Therefore, our current intention is to gradually and collaboratively find the best procedure for packing, using practical experience from our events such as the ADAPT workshop and from common discussions during the ACM SIGPLAN TRUST'14 workshop.
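Since authors may use any publicly available tool, the following is only one possible way to produce such an archive, sketched with the Python standard library. The function name `pack_artifacts`, the directory layout, and the choice to report a SHA-256 checksum are assumptions made for this example, not requirements of our events.

```python
import hashlib
import tarfile
from pathlib import Path


def pack_artifacts(src_dir, archive_path):
    """Pack a directory of research material into a .tar.gz archive.

    Refuses to pack unless src_dir contains a readme.txt describing how
    to validate the experiments. Returns the SHA-256 hex digest of the
    archive so that reviewers can check they received the same file.
    Note: archive_path should lie outside src_dir, otherwise the archive
    would try to include itself.
    """
    src = Path(src_dir)
    if not (src / "readme.txt").is_file():
        raise FileNotFoundError("readme.txt with validation steps is missing")
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store files under the directory's own name, not its full path.
        tar.add(src, arcname=src.name)
    return hashlib.sha256(Path(archive_path).read_bytes()).hexdigest()
```

For example, `pack_artifacts("my-paper-artifacts", "my-paper-artifacts.tar.gz")` (both names hypothetical) would create the archive next to the directory and return its checksum.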

History and manifesto

In the MILEPOST project we attempted to build a practical machine-learning-based self-tuning compiler, combining a plugin-based auto-tuning framework with a public cTuning repository of knowledge, crowdsourcing and predictive analytics, but faced numerous problems including:

Lack of common, large and diverse benchmarks and data sets needed to build statistically meaningful predictive models;
Lack of common experimental methodology and unified ways to preserve, systematize and share our growing optimization knowledge and research material including benchmarks, data sets, tools, tuning plugins, predictive models and optimization results;
Problem with continuously changing, "black box" and complex software and hardware stack with many hardwired and hidden optimization choices and heuristics not well suited for auto-tuning and machine learning;
Difficulty reproducing performance results submitted by users to the cTuning.org database, due to a lack of full software and hardware dependencies;
Difficulty validating related auto-tuning and machine learning techniques from existing publications, due to the lack of a culture in computer engineering of sharing research artifacts with full experiment specifications along with publications.

Our new proposal to crowdsource reviewing of publications and artifacts

Validation

After many years of evangelizing collaborative and reproducible research in computer engineering based on our practical experience, we are finally starting to see some change in mentality in academia, industry and funding agencies. Authors of two papers from our ADAPT'14 workshop (out of nine accepted) agreed to have the experimental results of their papers validated by volunteers. Note that rather than enforcing specific validation rules, we decided to ask authors to pack all their research artifacts as they wish (for example, using a shared virtual machine or as a standard archive) and describe their own validation procedure. Thanks to our volunteers, experiments from these papers have been validated, archives shared in our public repository, and papers marked with a "validated by the community" stamp:
