Making computer engineering a science; Systematizing program and system analysis and optimization using auto-tuning, machine learning and crowdsourcing; Enabling self-tuning computer systems

If you are interested in a preview, tutorial or presentation on cTuning/Collective Mind crowd-tuning and machine learning technology, its current industrial uses or the new publication model, do not hesitate to get in touch with Grigori Fursin!

cTuning-related recent and upcoming events and news:

Education

New publication model

With our background in physics, we found it extremely disappointing that reproducibility, sharing and statistical mindfulness of results are rarely considered in computer engineering. In fact, they are often simply impossible due to a lack of common tools and data repositories. Therefore, rather than complaining or just talking about it, we spent many years developing cTuning technology that may help to systematize computer engineering and create a new publication model which favors sharing of data, models, tools and interfaces for validation and reproducibility by the community.

Since 2008 we have released the benchmarks (cBench), datasets (cDatasets/KDataSets), tools and experimental data used in most of our publications in the cTuning repository. Since 2005, we have also made our cTuning-related lectures available online. This model has resulted in multiple collaborative projects to improve predictive models and tools for designing and optimizing computer systems together with IBM (Israel), Google (USA), ICT (China), University of Edinburgh (UK), UPC (Spain), CAPS Entreprise (France), ISP RAS (Russia), Intel (Illinois), Ghent University (Belgium), UVSQ (France), NCAR (USA), ARC/Synopsys (UK) and others. This topic has been accepted for the HiPEAC3 network of excellence (2012-2015) and we are currently building a community around this model. If you are interested in joining this collaborative effort and our cTuning virtual Lab, please contact Grigori Fursin.

Public discussions

Online lectures

University lectures

We teach an M2R course (with an exam) at Paris South University on "Future Computing Systems".

All slides and demos are available here.

Workshops

Thematic sessions/panels/tutorials/BOFs

Publications

  1. [CFHP2012] Yang Chen, Shuangde Fang, Yuanjie Huang, Lieven Eeckhout, Grigori Fursin, Liang Peng, Olivier Temam, Chengyong Wu. Deconstructing Iterative Optimization.

    To appear in the next issue of the ACM TACO journal.

  2. [FKMP2011] Grigori Fursin, Yuriy Kashnikov, Abdul Wahid Memon, Zbigniew Chamski, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Bilha Mendelson, Ayal Zaks, Eric Courtois, Francois Bodin, Phil Barnard, Elton Ashton, Edwin Bonilla, John Thomson, Chris Williams, Michael O'Boyle. MILEPOST GCC: machine learning enabled self-tuning compiler.
    International Journal of Parallel Programming (IJPP), June 2011, Volume 39, Issue 3, pages 296-327

    Concept is included in the HiPEAC 2012-2020 research roadmap.

    [bib] [Springer online final version] [author's pdf]

  3. [FHMP2011] Grigori Fursin, Robert Hundt, Jason Mars, Yuriy Kashnikov. Introducing ACM SIGPLAN International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era (http://exadapt.org).
    ACM International Conference Proceeding Series, co-located with PLDI, June 2011, San Jose, USA

    [ACM DL online version]

  4. [FT2010] Grigori Fursin and Olivier Temam. Collective Optimization: A Practical Collaborative Approach.
    ACM Transactions on Architecture and Code Optimization (TACO), December 2010, Volume 7, Number 4, pages 20-49

    Concept is included in the HiPEAC 2012-2020 research roadmap.

    [bib] [ACM DL final version] [author's pdf]

  5. [MCFP2010] Mircea Namolaru, Albert Cohen, Grigori Fursin, Ayal Zaks and Ari Freund. Practical Aggregation of Semantical Program Properties for Machine Learning Based Optimization.
    Proceedings of the International Conference on Compilers, Architecture, And Synthesis For Embedded Systems (CASES 2010), October 2010, Scottsdale, AZ, USA

    [bib] [pdf]

  6. [YYLP2010] Yang Chen, Yuanjie Huang, Lieven Eeckhout, Grigori Fursin, Liang Peng, Olivier Temam, Chengyong Wu. Evaluating Iterative Optimization across 1000 Data Sets.
    Proceedings of the ACM SIGPLAN 2010 Conference on Programming Language Design and Implementation (PLDI 2010), June 2010, Toronto, Canada (acceptance rate: 20%, 41/204)

    HiPEAC paper award.

    [bib] [pdf]

  7. [HPWP2010] Yuanjie Huang, Liang Peng, Chengyong Wu, Yuriy Kashnikov, Jörn Renneke, and Grigori Fursin. Transforming GCC into a research-friendly environment: plugins for optimization tuning and reordering, function cloning and program instrumentation.
    2nd International Workshop on GCC Research Opportunities (GROW’10) co-located with HiPEAC'10, Pisa, Italy, January 2010 (acceptance rate: 57%, 8/14)

    [bib] [pdf] [pdf backup]

  8. [FT2009] Grigori Fursin and Olivier Temam. Collective optimization.
    Proceedings of the International Conference on High Performance Embedded Architectures & Compilers (HiPEAC 2009), Paphos, Cyprus, January 2009 (acceptance rate: 28%, 27/97)

    Extended version is now published in ACM TACO (FT2010).

    [bib] [pdf]

  9. [DJBP2009] Christophe Dubach, Timothy M. Jones, Edwin V. Bonilla, Grigori Fursin, and Michael F.P. O'Boyle. Portable Compiler Optimization Across Embedded Programs and Microarchitectures using Machine Learning.
    Proceedings of the IEEE/ACM International Symposium on Microarchitecture (MICRO), New York, USA, December 2009 (acceptance rate: 25%, 52/209)

    HiPEAC paper award.

    Christophe Dubach received BCS/CPHC Distinguished Dissertation Award’09 for his related thesis "Using Machine-Learning to Efficiently Explore the Architecture/Compiler Co-Design Space" supervised by Prof. Michael O'Boyle.

    [bib] [pdf]

  10. [Fur2009] Grigori Fursin. Collective Tuning Initiative: automating and accelerating development and optimization of computing systems.
    Proceedings of the GCC Summit'09, Montreal, Canada, June 2009

    This paper introduces the collective tuning infrastructure (http://cTuning.org) and repository (http://cTuning.org/cdatabase) to start continuous parameterization of all computing systems and to automate, simplify and systematize code and architecture design, characterization and optimization. Collecting enough data about various architectures, compilers, programs, benchmarks, kernels and datasets will help to quickly predict better program optimizations or architecture designs using machine learning techniques, thus considerably reducing time to market and enabling self-tuning, adaptive computing systems.

    [bib] [pdf] [pdf backup]

  11. [TOFP2009] John Thomson, Michael O'Boyle, Grigori Fursin and Björn Franke. Reducing Training Time in a One-shot Machine Learning-based Compiler.
    Proceedings of the 22nd International Workshop on Languages and Compilers for Parallel Computing (LCPC'09), Newark, Delaware, USA, October 2009

    [bib] [pdf]

  12. [LCWP2009] Lianjie Luo, Yang Chen, Chengyong Wu, Shun Long and Grigori Fursin. Finding representative sets of optimizations for adaptive multiversioning applications.
    3rd International Workshop on Statistical and Machine Learning Approaches Applied to Architectures and Compilation (SMART'09) co-located with HiPEAC'09, Paphos, Cyprus, January 2009 (acceptance rate=62%, 8/13)

    [bib] [pdf] [pdf backup] [presentation]

  13. [JGVP2009] Victor Jimenez, Isaac Gelado, Lluis Vilanova, Marisa Gil, Grigori Fursin and Nacho Navarro. Predictive runtime code scheduling for heterogeneous architectures.
    Proceedings of the International Conference on High Performance Embedded Architectures & Compilers (HiPEAC 2009), Paphos, Cyprus, January 2009 (acceptance rate: 28%, 27/97)

    Grigori Fursin prepared and obtained a HiPEAC collaborative grant to develop this technique with Victor Jimenez.

    Similar approaches for gluing/adapting applications for heterogeneous architectures are used in Intel’s Qilin and in CAPS Entreprise’s HMPP.

    [bib] [pdf] [presentation]

  14. [LF2009] Shun Long and Grigori Fursin. Systematic search within an optimisation space based on Unified Transformation Framework.
    International Journal of Computational Science and Engineering (IJCSE), Vol.4, No.2, pages 102-111, 2009 (submitted in 2005)

    [bib] [pdf]

  15. [FMTP2008] Grigori Fursin, Cupertino Miranda, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Ayal Zaks, Bilha Mendelson, Phil Barnard, Elton Ashton, Eric Courtois, Francois Bodin, Edwin Bonilla, John Thomson, Hugh Leather, Chris Williams, Michael O'Boyle. MILEPOST GCC: machine learning based research compiler.
    Proceedings of the GCC Developers' Summit, Ottawa, Canada, June 2008

    Extended version is now published in IJPP (FKMP2011).

    [bib] [pdf] [pdf backup]

  16. [DFGP2007] Veerle Desmet, Grigori Fursin, Sylvain Girbal and Olivier Temam. Leveraging Modular Simulation for Systematic Design Space Exploration.
    4th HiPEAC Industrial Workshop on Compilers and Architectures organized by ARM Ltd., Cambridge, UK, November 2007

    [bib]

  17. [LCFP2007] Piotr Lesnicki, Albert Cohen, Grigori Fursin, Marco Cornero, Andrea Ornstein and Erven Rohou. Split Compilation: an Application to Just-in-Time Vectorization.
    International Workshop on GCC for Research in Embedded and Parallel Systems (GREPS'07) in conjunction with PACT'07, Brasov, Romania, September 2007

    [bib] [pdf] [pdf backup]

  18. [LFF2007] Shun Long, Grigori Fursin, Björn Franke. A Cost-Aware Parallel Workload Allocation Approach based on Machine Learning Techniques.
    Proceedings of the IFIP International Conference on Network and Parallel Computing (NPC 2007), LNCS-4672, pages 506-515, Dalian, China, September 2007

    [bib] [pdf]

  19. [FMPP2007] Grigori Fursin, Cupertino Miranda, Sebastian Pop, Albert Cohen and Olivier Temam. Practical Run-time Adaptation with Procedure Cloning to Enable Continuous Collective Compilation.
    Proceedings of the GCC Developers' Summit, Ottawa, Canada, July 2007

    [bib] [pdf]

  20. [DCFP2007] Christophe Dubach, John Cavazos, Björn Franke, Grigori Fursin, Michael O'Boyle and Olivier Temam. Enabling fast compiler optimization evaluation via code-features based performance predictor.
    Proceedings of the ACM International Conference on Computing Frontiers, Ischia, Italy, May 2007 (acceptance rate=50%, 28/56)

    [bib] [pdf]

  21. [CFAP2007] John Cavazos, Grigori Fursin, Felix Agakov, Edwin Bonilla, Michael F.P. O'Boyle and Olivier Temam. Rapidly Selecting Good Compiler Optimizations using Performance Counters.
    Proceedings of the 5th Annual International Symposium on Code Generation and Optimization (CGO), San Jose, USA, March 2007

    [bib] [pdf]

  22. [FC2007] Grigori Fursin and Albert Cohen. Building a Practical Iterative Interactive Compiler.
    1st International Workshop on Statistical and Machine Learning Approaches Applied to Architectures and Compilation (SMART'07) co-located with HiPEAC'07, Ghent, Belgium, January 2007 (acceptance rate=58%, 7/12)

    More info is now available in the extended version published in IJPP (FKMP2011).

    [bib] [pdf]

  23. [FCOP2007] Grigori Fursin, John Cavazos, Michael O'Boyle and Olivier Temam. MiDataSets: Creating The Conditions For A More Realistic Evaluation of Iterative Optimization.
    Proceedings of the International Conference on High Performance Embedded Architectures & Compilers (HiPEAC 2007), Ghent, Belgium, January 2007 (acceptance rate=29%)

    [bib] [pdf]

  24. [FCOP2006] Grigori Fursin, Albert Cohen, Michael O'Boyle and Olivier Temam. Quick and practical run-time evaluation of multiple program optimizations.
    Transactions on High-Performance Embedded Architectures and Compilers, 1(1), pages 13-31, 2006

    [bib] [pdf] [pdf backup]

  25. [CDAP2006] John Cavazos, Christophe Dubach, Felix Agakov, Edwin Bonilla, Michael F.P. O'Boyle, Grigori Fursin and Olivier Temam. Automatic Performance Model Construction for the Fast Software Exploration of New Hardware Designs.
    Proceedings of the International Conference on Compilers, Architecture, And Synthesis For Embedded Systems (CASES 2006), Seoul, Korea, October 2006 (acceptance rate=41%, 41/100)

    Best paper award finalist.

    [bib] [pdf]

  26. [ABCP2006] F. Agakov, E. Bonilla, J. Cavazos, B. Franke, G. Fursin, M.F.P. O'Boyle, J. Thomson, M. Toussaint and C.K.I. Williams. Using Machine Learning to Focus Iterative Optimization.
    Proceedings of the 4th Annual International Symposium on Code Generation and Optimization (CGO), New York, NY, USA, March 2006 (acceptance rate=36%, 29/80)

    Best presentation award.

    See also our publication on MILEPOST GCC in IJPP (FKMP2011).

    [bib] [pdf]

  27. [FCOP2005] Grigori Fursin, Albert Cohen, Michael O'Boyle and Olivier Temam. A Practical Method For Quickly Evaluating Program Optimizations.
    Proceedings of the 1st International Conference on High Performance Embedded Architectures & Compilers (HiPEAC 2005), number 3793 in LNCS, pages 29-46, Barcelona, Spain, November 2005

    Highest ranked paper (acceptance rate=20%, 17/84).

    This paper presents a novel concept to statically enable run-time optimizations and self-tuning binaries through function cloning and integrated low-overhead program/system behaviour monitoring routines. It has been referenced in patents and extended in academia and industry. For the first time, we used the Interactive Compilation Interface for the PathScale compiler with loop vectorization, tiling, unrolling, interchange, fission/fusion, pipelining, prefetching and array padding to make static binaries adaptable and reactive to various environments and run-time behaviour, which is important for improving the efficiency and cost of various HPC systems. Since 2007 it has been extended by Google for data centers (cloud computing) and by Intel Exascale Lab (France) to build adaptive applications and reconfigure processors at run-time to save power.

    Extensions for transparent collective optimization are available in our ACM TACO publication (FT2010).

    [bib] [pdf] [pdf backup]

  28. [FOTP2005] B. Franke, M. O'Boyle, J. Thomson and G. Fursin. Probabilistic Source-Level Optimisation of Embedded Systems Software.
    Proceedings of the Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'05), pages 78-86, Chicago, IL, USA, June 2005 (acceptance rate=26%, 25/95)

    [bib] [pdf]

  29. [LF2005] Shun Long and Grigori Fursin. A heuristic search algorithm based on Unified Transformation Framework.
    Proceedings of the 7th International Workshop on High Performance Scientific and Engineering Computing (HPSEC-05), pages 137-144, Oslo, Norway, June 2005

    [bib] [pdf]

  30. [FOK2005] Grigori Fursin, Michael O'Boyle and Peter Knijnenburg. Evaluating Iterative Compilation.
    Lecture Notes in Computer Science, Volume 2481, pages 362-376, 2005

    [bib] [Springer Online version] [author's pdf]

  31. [FUR2004] Grigori Fursin. Iterative Compilation and Performance Prediction for Numerical Applications.
    Ph.D. thesis, University of Edinburgh, Edinburgh, UK, January 2004

    Based on FOTP2001, FOK2002, FOTP2004.

    In this thesis, Grigori Fursin introduced a novel and simple approach to quickly detect whether a program is CPU or memory bound by breaking program semantics: we add or remove various assembler instructions to convert array accesses to scalars in various ways, without preserving the semantics of the code but avoiding crashes, so that the original and transformed programs can be compared directly. This technique does not need any slow simulation and proved to be realistic, particularly on out-of-order processors where hardware counters can be totally misleading. It also advises how to optimize code: if the code is CPU bound, we should focus on ILP optimizations, while if the code is memory bound, we should focus on polyhedral transformations or reduce processor frequency to save power.

    These techniques are actively used and extended in academia and industry, including at Intel Exascale Lab.

    [bib] [pdf] [pdf backup] [EOS software]

  32. [FOTP2004] Grigori Fursin, Mike O'Boyle, Olivier Temam, and Gregory Watts. Fast and Accurate Method for Determining a Lower Bound on Execution Time.
    Concurrency Practice and Experience, 16(2-3), pages 271-292, 2004

    In this paper, Grigori Fursin introduced a novel and simple approach to quickly detect whether a program is CPU or memory bound by breaking program semantics: we add or remove various assembler instructions to convert array accesses to scalars in various ways, without preserving the semantics of the code but avoiding crashes, so that the original and transformed programs can be compared directly. This technique does not need any slow simulation and proved to be realistic, particularly on out-of-order processors where hardware counters can be totally misleading. It also advises how to optimize code: if the code is CPU bound, we should focus on ILP optimizations, while if the code is memory bound, we should focus on polyhedral transformations or reduce processor frequency to save power.

    [bib] [pdf]

  33. [FOK2002] G.G. Fursin, M.F.P. O'Boyle, and P.M.W. Knijnenburg. Evaluating Iterative Compilation.
    Proceedings of the 15th Workshop on Languages and Compilers for Parallel Computing (LCPC'02), College Park, MD, USA, pages 305-315, 2002

    This paper introduces the concept of empirical optimization (iterative compilation or auto-tuning) of large applications to automatically adapt them to given hardware using several basic search strategies. Our approach considerably outperformed state-of-the-art compilers on Intel, Alpha and several other popular architectures for several large SPEC applications. This technique also laid the foundations for further research on focused optimizations using statistical techniques, machine learning, run-time adaptation and collective tuning.

    [bib] [pdf]

  34. [FOTP2001] Grigori Fursin, Mike O'Boyle, Olivier Temam, and Gregory Watts. Fast and Accurate Method for Determining a Lower Bound on Execution Time.
    Proceedings of the International Workshop on Compilers for Parallel Computers (CPC'01), pages 163-172, Edinburgh, 2001

    [bib] [pdf]

  35. [ATAP2000] Abella, J., S. A. Ali Touati, A. Anderson, C. Ciuraneta, J. M. Codina, Min Dai, C. Eisenbeis, G. Fursin, A. Gonzalez, J. Llosa, M. O'Boyle, A. Randrianatoavina, J. Sanchez, O. Temam, X. Vera, and G. Watts. MHAOTEU Tools for Memory Hierarchy Management.
    Proceedings of the 16th IMACS World Congress on Scientific Computation, Applied Mathematics and Simulation (IMACS'2000), Lausanne, Switzerland, August 2000.

Technical reports, national conferences and miscellaneous

  1. [FOTP2001] Grigori Fursin, Mike O’Boyle, Olivier Temam, and Gregory Watts. A Fast and Accurate Evaluation of a Memory Performance Upper-Bound.
    Report for the MHAOTEU ESPRIT project No 24942, February 2001
  2. [ABBP2001] Jaume Abella, Cédric Bastoul, Jean-Luc Béchennec, Nathalie Drach, Christine Eisenbeis, Paul Feautrier, Björn Franke, Grigori Fursin, Antonio Gonzalez, Toru Kisku, Peter Knijnenburg, Josep Llosa, Michael O'Boyle, Julien Sébot, and Xavier Vera. Guided Transformations.
    Report M3.D2 for the MHAOTEU ESPRIT project No 24942, February 2001
  3. [AFGP2001] Jaume Abella, Grigori Fursin, Antonio Gonzalez, Josep Llosa, Michael O'Boyle, Abhishek Prabhat, Olivier Temam, Sid Ahmed Ali Touati, Xavier Vera, and Gregory Watts. Advanced Performance Analysis.
    Report M3.D2 for the MHAOTEU ESPRIT project No 24942, February 2001
  4. [FUR1997] Grigori Fursin. Simulation of processes of learning and recognition in modified neural network.
    Proceedings of the national conference on physical processes in devices of electronic and laser engineering, Moscow Institute of Physics & Technology, pages 102-111, Moscow, Russia, 1997
  5. [FUR1997] Grigori Fursin. Measurement of characteristics of neural elements with the aid of personal computer.
    Proceedings of the national conference on devices of electronic and laser engineering, Moscow Institute of Physics & Technology, pages 20-28, Moscow, Russia, 1997
  6. [FUR1995] Grigori Fursin. Restoration of symbols with noise by neural network.
    Proceedings of the national conference on physical processes in devices of electronic and laser engineering, Moscow Institute of Physics & Technology, pages 112-117, Moscow, Russia, 1995