CTools:MilepostFramework

From cTuning.org

NEWS: Since 2015, all related developments, benchmarks, data sets and tools have moved to our new Collective Knowledge Framework!


During 2009, the MILEPOST framework was fully integrated with the cTuning infrastructure and has since been discontinued!

NOTES:

Reference publications about the cTuning.org long-term vision: GCC Summit'09, ACM TACO'10 journal and IJPP'11 journal.

cTuning Google discussions list

Old page:

Download MILEPOST GCC 4.4.0

Google Summer of Code'09 MILEPOST GCC extensions and plugins (XML representation of the compilation flow, fine-grain optimizations and instrumentation, polyhedral transformations, run-time adaptation)
- Development page for fine-grain tuning from GSOC'09
  - documentation
- Development page for function cloning from GSOC'09
  - documentation

The development of MILEPOST GCC and the MILEPOST Framework was coordinated by Dr. Grigori Fursin (UNIDAPT group, INRIA, France) within the EU FP6 MILEPOST project (2006-2009). The MILEPOST consortium includes INRIA, IBM Haifa, University of Edinburgh, ARC International Ltd. and CAPS Entreprise. Don't hesitate to contact Grigori if you have any questions or comments about the MILEPOST framework or the cTuning initiative.

The MILEPOST infrastructure helps MILEPOST GCC replace the default optimization heuristics of GCC (optimization levels such as -O1, -O2, -O3, -Os, etc.) with optimization prediction plugins. These plugins are continuously trained on multiple benchmarks by automatically correlating good combinations of optimizations with so-called static program features (general aspects of program structure) or dynamic program features (hardware counters capturing run-time program/dataset behavior).

MILEPOST GCC is the first machine-learning-enabled, open-source, self-tuning research compiler that can adapt to any architecture using iterative feedback-directed compilation, machine learning and collective optimization. It combines the strength of production-quality GCC, which supports more than 30 families of architectures and can compile real, large applications including Linux, with the flexibility of the Interactive Compilation Interface, which transforms GCC into a research compiler. It is currently based on predictive modeling using program and machine-specific features, execution time, hardware counters and off-line training. MILEPOST GCC includes a static program feature extractor developed by IBM Haifa. MILEPOST/cTuning technology is orthogonal to GCC and can be used in any future adaptive self-tuning compiler through the common Interactive Compilation Interface.

MILEPOST GCC was released in June 2009, and all further developments have been integrated with the cTuning tools (Collective Optimization Database, cTuning optimization prediction web services, Interactive Compilation Interface for GCC, Continuous Collective Compilation Framework) to enable collaborative community-driven development after the end of the MILEPOST project (August 2009). You are warmly welcome to join the cTuning community and to follow or participate in developments and discussions through the cTuning Wiki-based portal and two mailing lists: a high-volume development list and a low-volume announcement list.

We don't claim that MILEPOST GCC, the MILEPOST Framework and the cTuning tools can solve all optimization problems ;) but we believe that an open, research-friendly, extensible compiler with machine learning and adaptive plugins, based on production-quality GCC that supports multiple languages and architectures, opens up many research opportunities for the community. It is the first practical step towards our long-term objective of enabling adaptive self-tuning computing systems. With the help of the community, we hope to provide better validation of code correctness when applying complex combinations of optimizations; plugins for XML representation of the compilation flow; tuning of fine-grain optimizations, polyhedral GRAPHITE transformations and link-time optimizations; and code instrumentation and run-time adaptation capabilities for statically compiled programs (see the Google Summer of Code'09 program). We would also like to extend MILEPOST GCC and the cTuning tools so that they can optimize a whole Linux system (Gentoo-like) or optimize programs for mobile systems on the fly (for example, using Android, Moblin, etc.), and to extend this technology to enable realistic adaptive parallelization, data partitioning and scheduling for heterogeneous multi-core systems using statistical and machine learning techniques.

We are very grateful to all our colleagues and users for providing valuable feedback or contributing to the cTuning/MILEPOST projects.

Note: cTuning is an ongoing, evolving project - please be patient with and tolerant of the community, and help us with this collaborative effort!

More details about our current and future developments can be found in the following publications:

Brief info:

The MILEPOST framework transforms GCC into a powerful machine-learning-enabled research infrastructure suitable for adaptive computing. It uses a number of components, including (i) a machine-learning-enabled MILEPOST GCC with the Interactive Compilation Interface (ICI) and a program feature extractor to modify internal optimization decisions, (ii) a Continuous Collective Compilation Framework (CCC) to search for good combinations of optimizations, and (iii) a Collective Optimization Database (COD) to record compilation and execution statistics. This information is later used as training data for the machine learning models. ICI controls the internal optimization decisions and their parameters through external plugins. It now allows the default internal optimization heuristics, as well as the order of transformations, to be substituted completely. The CCC Framework produces a training set from which machine learning models learn how to optimize programs for the best performance, code size, power consumption or any other objective function needed by the end user. This framework allows knowledge of the optimization space to be reused among different programs, architectures and data sets.
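As a rough illustration of the role COD plays among these components, the sketch below records compilation and execution statistics in an in-memory SQLite table and looks up the best recorded flag combination for a chosen objective. The table layout, the function names (open_cod, record_run, best_flags) and all numeric values are hypothetical placeholders for illustration; they are not the actual COD schema, API or measured results.

```python
import sqlite3


def open_cod():
    """Create an in-memory stand-in for a COD-like store (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE runs ("
        " program TEXT, architecture TEXT, flags TEXT,"
        " exec_time REAL, code_size INTEGER)"
    )
    return db


def record_run(db, program, architecture, flags, exec_time, code_size):
    """Store the statistics of one compile-and-run experiment."""
    db.execute(
        "INSERT INTO runs VALUES (?, ?, ?, ?, ?)",
        (program, architecture, " ".join(flags), exec_time, code_size),
    )


def best_flags(db, program, objective="exec_time"):
    """Return the recorded flag list that minimizes the given objective."""
    # The objective is interpolated into the SQL text, so only allow known columns.
    if objective not in ("exec_time", "code_size"):
        raise ValueError(f"unknown objective: {objective}")
    row = db.execute(
        f"SELECT flags FROM runs WHERE program = ? ORDER BY {objective} ASC LIMIT 1",
        (program,),
    ).fetchone()
    return row[0].split() if row else None


# Placeholder values for illustration only, not real measurements.
db = open_cod()
record_run(db, "susan_corners", "athlon64", ["-O3"], 1.00, 52000)
record_run(db, "susan_corners", "athlon64", ["-O2", "-funroll-loops"], 0.85, 61000)
record_run(db, "susan_corners", "athlon64", ["-Os"], 1.20, 40000)
```

Selecting the objective at query time mirrors the idea that the same recorded statistics can serve different end-user goals, such as execution time or code size.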

The MILEPOST Framework currently proceeds in two distinct phases, in accordance with typical machine learning practice: training and deployment.

Training: During the training phase we need to gather information about the structure of programs and record how they behave when compiled under different optimization settings. Such information allows machine learning tools to correlate aspects of program structure, or features, with optimizations, building a strategy that predicts a good combination of optimizations.

Training a useful model requires a large number of compilations and executions as training examples. These training examples are generated by the CCC Framework, which evaluates different compilation optimizations and stores execution time, code size and other metrics in a database. The features of the program are extracted by MILEPOST GCC and stored in COD. ICI plugins allow fine-grained control and examination of the compiler, driven externally through shared libraries.
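The training phase described above can be sketched as a loop that evaluates each candidate flag combination and stores one training example per run. This is a minimal sketch under stated assumptions: collect_training_examples and toy_evaluate are hypothetical names, the feature vector is an arbitrary placeholder, and toy_evaluate returns invented numbers in place of a real compile-and-run step with MILEPOST GCC and the CCC Framework.

```python
def collect_training_examples(program, features, candidate_flag_sets, evaluate):
    """Evaluate each flag combination and return one training example per run."""
    examples = []
    for flags in candidate_flag_sets:
        exec_time, code_size = evaluate(program, flags)
        examples.append({
            "program": program,
            "features": list(features),  # static or dynamic program features
            "flags": list(flags),
            "exec_time": exec_time,
            "code_size": code_size,
        })
    return examples


def toy_evaluate(program, flags):
    """Stand-in for compiling and running a program; values are invented."""
    exec_time = 1.0 - 0.1 * ("-funroll-loops" in flags) - 0.2 * ("-O2" in flags)
    code_size = 1000 + 300 * ("-funroll-loops" in flags)
    return exec_time, code_size


examples = collect_training_examples(
    "demo",
    [12, 3, 0.25],  # placeholder feature vector
    [["-O2"], ["-O2", "-funroll-loops"], []],
    toy_evaluate,
)
# The fastest recorded combination becomes the label for this program's features.
best = min(examples, key=lambda e: e["exec_time"])
```

Passing the evaluation step in as a function keeps the loop independent of how a run is actually measured, so a real compile-and-run routine could replace the toy one.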

Deployment: Once sufficient training data is gathered, a machine learning model is created. This model is able to predict good optimization strategies for a given set of program features and is built as a plugin so that it can be re-inserted into MILEPOST GCC. On encountering a new program, the plugin determines the program's features and passes them to the model, which determines the optimizations to be applied.
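To make the deployment step concrete, the sketch below predicts flags for a new program by reusing the best known flags of the training program with the most similar feature vector. The nearest-neighbour rule is a deliberately simple stand-in chosen for illustration; this page does not specify which model the MILEPOST plugin uses, and predict_flags, the feature values and the flag lists are hypothetical.

```python
import math


def predict_flags(training_set, new_features):
    """Return the best known flags of the closest training program."""
    def distance(a, b):
        # Euclidean distance between two feature vectors of equal length.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    nearest = min(
        training_set,
        key=lambda example: distance(example["features"], new_features),
    )
    return nearest["best_flags"]


# Placeholder training data: feature vectors paired with their best known flags.
training_set = [
    {"features": [10.0, 2.0], "best_flags": ["-O2", "-funroll-loops"]},
    {"features": [200.0, 40.0], "best_flags": ["-Os"]},
]
predicted = predict_flags(training_set, [12.0, 3.0])
```

In practice, features would usually be normalized before computing distances so that no single feature dominates the comparison.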

Typical non-trivial distribution of optimization points in the 2D space of speedup vs. code size for the susan_corners program on an AMD Athlon64 3700+ architecture with GCC 4.2.2, obtained during automatic program optimization using the ccc-run-glob-flags-rnd-uniform plugin from the CCC framework with uniform random combinations of more than 100 global compiler flags (each flag has a 50% probability of being selected for a given combination of optimizations). Similar data for other benchmarks, datasets and architectures is available in the Collective Optimization Database. cTuning technology helps users find or predict optimal optimization points in such complex spaces using a "one-button" approach.
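The sampling scheme described in the caption, where each flag is selected independently with 50% probability, can be sketched as follows, together with a filter that keeps only the points not dominated in the speedup vs. code size space. This is an illustrative sketch, not the code of the ccc-run-glob-flags-rnd-uniform plugin: the function names are hypothetical, the flag pool is a small sample of GCC flags, and the data points are invented.

```python
import random


def sample_flag_combination(flag_pool, rng, p=0.5):
    """Select each flag independently with probability p."""
    return [flag for flag in flag_pool if rng.random() < p]


def pareto_front(points):
    """Keep points that no other point beats on both speedup and code size."""
    front = []
    for p in points:
        dominated = any(
            q["speedup"] >= p["speedup"]
            and q["code_size"] <= p["code_size"]
            and (q["speedup"] > p["speedup"] or q["code_size"] < p["code_size"])
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


flag_pool = ["-funroll-loops", "-fomit-frame-pointer",
             "-finline-functions", "-ftree-vectorize"]
rng = random.Random(0)  # fixed seed so that the sampling is repeatable
combination = sample_flag_combination(flag_pool, rng)

# Invented (speedup, code size) points for illustration only.
points = [
    {"speedup": 1.0, "code_size": 100},
    {"speedup": 1.2, "code_size": 120},
    {"speedup": 1.1, "code_size": 130},
    {"speedup": 0.9, "code_size": 90},
]
front = pareto_front(points)
```

With more than 100 flags there are over 2^100 possible combinations, so random sampling explores only a tiny fraction of the space; the non-dominated points are the candidates worth presenting to a user who cares about both speed and size.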

Original MILEPOST partners: