Notes from the thematic session on "making computer engineering a science", held at HiPEAC Computing Week / ACM ECRC in Paris on 2-3 May 2013
Summary of discussions
Many thanks to Bruce Childers, Alex Jones, Vittorio Zaccaria, Christian Bertin, Christophe Guillon, Christoph Reichenbach, Markus Puschel, Olivier Zendra, Daniel Gracia Perez, Hans Vandierendonck, Gert Jervan, Zbigniew Chamski, and all others for active participation in discussions.
1) Bruce Childers and Alex Jones presented OCCAM (Open Curation for Computer Architecture Modeling), a long-term initiative recently started in the USA:
- http://www.hipeac.net/system/files/childers_occam_open_curation_for_computer_architecture_modeling.pdf
- http://csa.cs.pitt.edu
The main, ambitious goal of this effort is to build a common collaborative infrastructure and repository for reproducible research and development in computer architecture modeling.
The current focus is to:
- determine governance structures
- develop evaluation methodology
- build a pilot repository
- demonstrate the value of an open repository to the community
A low-volume mailing list is available (contact Bruce for details).
Main concerns:
- the infrastructure and repository should be developed by professional engineers with long-term support
- fair governance of the initiative (policies and procedures)
- persuading the community to use such a repository and share results will not be easy (the value must be demonstrated)
Related initiatives: nanoHUB (nanotechnology), arXiv (physics and other sciences), EarthCube (geosciences)
2) Vittorio Zaccaria discussed:
- the possibility of speeding up peer-reviewed publishing by using social networks and ranking (Reddit-style) as a filter later used by endorsers and peer reviewers. There are many potential issues (trust, security, flaming, group interests, lobbying, etc.), but it is worth discussing further, since it could be a useful community instrument that helps reviewers focus their effort and provide a few high-quality reviews
- the need to validate models against common datasets, and the possibility of presenting negative results
Related initiatives: Kaggle, GigaScience, figshare
- http://www.hipeac.net/system/files/zaccaria_towards_european_research_v2.pdf
- http://www.vittoriozaccaria.net/data/20130502_hipeac_paris
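A Reddit-style ranking filter, as mentioned above, could work roughly along these lines (a minimal sketch; the scoring constants and function names are assumptions for illustration, not part of any design discussed at the session):

```python
import math
import time

EPOCH = 1134028003  # arbitrary reference time on a seconds scale

def hot_score(upvotes: int, downvotes: int, submitted: float) -> float:
    """Reddit-style 'hot' ranking: vote balance on a log scale,
    plus a time bonus so newer submissions can surface for reviewers."""
    score = upvotes - downvotes
    order = math.log10(max(abs(score), 1))
    sign = 1 if score > 0 else -1 if score < 0 else 0
    return round(sign * order + (submitted - EPOCH) / 45000, 7)

# A new submission with modest votes can outrank an older, popular one,
# so reviewer attention is not locked onto last week's papers.
now = time.time()
old_popular = hot_score(200, 10, now - 7 * 86400)  # a week old
new_modest = hot_score(15, 2, now)                 # just submitted
```

Because vote counts enter logarithmically while age enters linearly, early community interest is enough to surface a submission to endorsers without requiring it to win an absolute vote count.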
3) Christian Bertin and Christophe Guillon presented CARE (Comprehensive Archiving for Reproducible Execution), a tool being developed at STMicroelectronics that tackles:
- the difficulty of reproducing bug reports
- the difficulty of reproducing someone else's experiment (missing or different architecture, host system, libraries, compilers, etc.)
The tool relies on PRoot (http://proot.me) and provides two capabilities:
- transparent archiving of only the files necessary to reproduce an experiment (space savings of up to 100x compared with archiving a whole disk or virtual-machine state)
- transparent sandboxing to reproduce execution
The tool is in a working/testing stage, and there is hope that it will be released to the community as open source. It could be a useful instrument for submitting and reproducing experimental setups.
- http://www.hipeac.net/system/files/bertin_care_comprehensive_archiving_for_reproducible_execution.pdf
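The space savings come from archiving only what the experiment touches. CARE captures that file set transparently via PRoot; the sketch below assumes the accessed-file list has already been captured (e.g. by system-call tracing) and only illustrates the selective-packing step:

```python
import os
import tarfile
import tempfile

def archive_accessed_files(accessed, archive_path):
    """Pack only the files an experiment actually read or wrote,
    instead of a whole disk image or VM snapshot."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in accessed:
            if os.path.isfile(path):  # skip paths that no longer exist
                tar.add(path)
    return archive_path

# Demo: pretend tracing reported that the run touched two small files.
workdir = tempfile.mkdtemp()
inputs = []
for name in ("config.txt", "data.txt"):
    p = os.path.join(workdir, name)
    with open(p, "w") as f:
        f.write("example\n")
    inputs.append(p)

archive = archive_accessed_files(inputs, os.path.join(workdir, "exp.tar.gz"))
with tarfile.open(archive) as tar:
    members = tar.getnames()
```

Unpacking such an archive into a sandbox (CARE's second capability) then reproduces the execution environment without shipping the whole machine state.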
4) Christoph Reichenbach presented the STEP project (Software Tools Evaluation Platform), which enables continuous and systematic tool reviews by the community.
It can help to:
- reward practical approaches and guide further developments
- find useful and robust tools that can be used in reproducible research
- validate usage scenarios not described in the paper
- identify tools that deserve long-term support or commercialization
Current discussions are about:
- reviewing criteria / minimization of workload
- motivation for the community to evaluate others' tools
- avoiding fake meta-reviews
- collaborating with DataMill project: http://datamill.uwaterloo.ca
- automation
It is a recent, ongoing project; feedback, platform contributors, and potential reviewers and tool authors are welcome:
- http://www-staff.informatik.uni-frankfurt.de/~creichen/step.en.html
- http://www.hipeac.net/system/files/reichenbach_towards_continuous_evaluation_of_software_tools.pdf
5) I presented the Collective Mind project: an open-source, plugin-based repository and infrastructure for collaborative and reproducible R&D in the design, optimization and run-time adaptation of computer systems.
- http://www.hipeac.net/system/files/fursin_collective_mind_framework_and_repository.pdf
It allows researchers and developers to:
- gradually decompose a complex system into plugins with unified, exposed tuning choices, properties and characteristics at multiple granularity levels (compiler flags, CUDA/OpenCL/OpenMP pragmas, fine-grain optimizations, etc.)
- share any research artifact under a UID, such as benchmarks, codelets, data sets, tools, predictive models, and auto-tuning and run-time adaptation strategies
- cross-link external public or end-user repositories
- prepare any research scenario through customized and reproducible experimental pipelines
- evaluate various machine learning and data mining techniques on public experimental data
- validate and rank experimental results by the community
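Sharing artifacts under UIDs, as listed above, might look roughly like this (a minimal sketch with an invented metadata layout; it does not reflect the actual Collective Mind format):

```python
import json
import os
import tempfile
import uuid

def share_artifact(repo, kind, name, payload):
    """Store an artifact under a fresh UID with minimal JSON metadata,
    so other repositories can cross-link it by that UID alone."""
    uid = uuid.uuid4().hex
    entry = os.path.join(repo, uid)
    os.makedirs(entry)
    with open(os.path.join(entry, "meta.json"), "w") as f:
        json.dump({"uid": uid, "kind": kind, "name": name}, f)
    with open(os.path.join(entry, "artifact"), "wb") as f:
        f.write(payload)
    return uid

# Demo: publish a small data set and read its metadata back by UID.
repo = tempfile.mkdtemp()
uid = share_artifact(repo, "dataset", "matmul-inputs", b"1 2 3\n")
with open(os.path.join(repo, uid, "meta.json")) as f:
    meta = json.load(f)
```

Because the UID, not the file path, identifies the artifact, external public or end-user repositories can reference each other's entries without sharing a directory layout.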
Current status of the framework:
- the prototype is working and includes the original cTuning1 framework and the machine-learning-based compiler MILEPOST GCC, which can continuously improve its own optimization heuristic using experimental data collected in the public repository during crowd-tuning on Android mobiles, data servers and any other system
- the prototype is being tested and enhanced in several projects this summer, with a pre-release planned for autumn 2013
Other comments
- reward not only the reproducibility of new results, but also the public implementation and validation of older techniques, to help systematize computer engineering
- avoid the vicious circle in computer engineering where the number of publications matters more than the significance and reproducibility of results; this inflates citations (or enables H-index boosting by groups of researchers) and pushes researchers toward "low-hanging fruit" for quick publication rather than deep, long-term, reproducible research
- allow the publication of negative experimental results, which would help the community avoid many pitfalls but are practically impossible to publish right now
- continue developing a common R&D infrastructure for new workshops, conferences and journals; unlike online journals in machine learning and mathematics, where submitting and validating experimental results is relatively straightforward, we have to deal with very complex systems with varying behavior, myriad ever-changing tools, and heterogeneous data
- discuss a methodology and criteria for the statistical comparison of experimental results (program/system behavior varies across users). There are many publications on this topic, but we also need to understand and minimize the causes of variation within our collaborative infrastructure rather than simply reporting speedups
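A common-sense baseline for such comparisons (a sketch, not a methodology agreed on at the session) is to repeat each measurement and compare medians together with a spread estimate, instead of quoting a single speedup number:

```python
import statistics

def summarize(times):
    """Median and interquartile range of repeated run times."""
    q = statistics.quantiles(times, n=4)
    return statistics.median(times), q[2] - q[0]

def speedup(baseline_times, optimized_times):
    """Median-based speedup; also return both spreads so readers
    can judge whether the distributions actually separate."""
    b_med, _ = summarize(baseline_times)
    o_med, _ = summarize(optimized_times)
    return b_med / o_med, (summarize(baseline_times)[1],
                           summarize(optimized_times)[1])

# Repeated measurements of the same binary vary run to run.
baseline = [10.2, 10.4, 10.1, 10.6, 10.3, 10.2, 10.5]
optimized = [8.0, 8.3, 8.1, 8.2, 7.9, 8.1, 8.4]
ratio, spreads = speedup(baseline, optimized)
```

Medians resist outliers from OS noise, and reporting the spread makes it visible when a claimed speedup is smaller than the run-to-run variation, which is exactly the case a single-number speedup hides.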
- do not just discuss the above ideas and wait for someone else to implement them; continue developing, sharing and discussing pilot infrastructure
- make governments aware of our initiative, so that dedicated funding is available for the development and support of public repositories and infrastructure for collaborative and reproducible R&D
- we need professional developers: in our past experience, interns and PhD students do not have enough experience or time to develop and support long-term infrastructure
Conclusions and future actions
- HiPEAC-OCCAM collaboration: since many OCCAM goals overlap with the cTuning.org and Collective Mind initiatives, there is strong interest in collaborating, exchanging ideas and coordinating development, while jointly pushing a new publication model for workshops, conferences and journals.
- Pre-release of the Collective Mind repository and infrastructure for the next event.
- Test/pre-release the new plugin-based OpenME interface connected to Collective Mind (successor of the Interactive Compilation Interface now included in mainline GCC) to expose any tool or application to universal auto-tuning and run-time adaptation.
- Prepare a pilot workshop/conference/journal where experimental results will be validated; prepare a methodology for validation and ranking of results.
- Address variation in experimental results across users during validation and ranking.
- Arrange the next thematic session to discuss further practical aspects of a common infrastructure and of methodologies for validation and ranking of results.