From cTuning.org

Revision as of 11:00, 29 June 2009 by Gfursin (Talk | contribs)

[Figure: overview of the MILEPOST Framework (milepost_framework.gif)]

Original development of MILEPOST GCC and the MILEPOST Framework was coordinated by Grigori Fursin and the UNIDAPT group. In June 2009, the MILEPOST Framework and MILEPOST GCC were integrated with the cTuning tools (the Collective Optimization Database, the cTuning optimization prediction web services, the Interactive Compilation Interface for GCC, and the Continuous Collective Compilation Framework) and released to enable further collaborative, community-driven development after the end of the MILEPOST project (August 2009).

The MILEPOST infrastructure has been used to tune the default compiler optimization heuristic, and to find "good" program optimizations or architectural configurations for reconfigurable processors, entirely automatically using statistical and machine-learning techniques.

You are warmly welcome to join the cTuning community and to follow or participate in development and discussions via the cTuning Wiki-based portal and two mailing lists: a high-volume development list and a low-volume announcement list.

More details about our current and future developments can be found in the following publications:

The MILEPOST framework transforms GCC into a powerful machine-learning-enabled research tool suitable for adaptive computing. It combines several components: (i) a machine-learning-enabled MILEPOST GCC with the Interactive Compilation Interface (ICI) and a program feature extractor to modify internal optimization decisions; (ii) the Continuous Collective Compilation Framework (CCC) to search for good combinations of optimizations; and (iii) the Collective Optimization Database (COD) to record compilation and execution statistics. This information is later used as training data for the machine-learning models. ICI controls internal optimization decisions and their parameters through external plugins; it now allows the complete substitution of the default internal optimization heuristics, as well as of the order of transformations. The CCC Framework produces a training set from which machine-learning models learn how to optimize programs for the best performance, code size, power consumption, or any other objective function needed by the end user. This framework allows knowledge of the optimization space to be reused across different programs, architectures and data sets.
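As a rough illustration of the data these components exchange, the sketch below models a COD-style training record linking a program's static features, the flag combination applied, and the measured metrics. The field names, feature names and values are invented for the example and are not the actual COD schema:

```python
# Hypothetical stand-in for one Collective Optimization Database record:
# program features + flag combination + measured metrics.
from dataclasses import dataclass

@dataclass
class TrainingRecord:
    program: str        # benchmark name
    features: dict      # static program features extracted by the compiler
    flags: list         # optimization flags applied for this run
    exec_time: float    # measured execution time in seconds
    code_size: int      # binary size in bytes

db = []  # in-memory stand-in for the database

db.append(TrainingRecord(
    program="susan_corners",
    features={"num_bb": 42, "num_insns": 310},
    flags=["-O2", "-funroll-loops"],
    exec_time=1.23,
    code_size=20480,
))
```

Accumulating many such records across programs, flag combinations and data sets is what makes the later machine-learning step possible.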

The MILEPOST Framework currently proceeds in two distinct phases, in accordance with typical machine-learning practice: training and deployment.

Training: During the training phase we need to gather information about the structure of programs and record how they behave when compiled under different optimization settings. Such information allows machine learning tools to correlate aspects of program structure, or features, with optimizations, building a strategy that predicts a good combination of optimizations.

In order to train a useful model, a large number of compilations and executions are needed as training examples. These training examples are generated by the CCC Framework, which evaluates different compilation optimizations and stores execution time, code size and other metrics in a database. The features of each program are extracted by MILEPOST GCC and stored in the COD. ICI plugins allow fine-grained control and examination of the compiler, driven externally through shared libraries.
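A CCC-style search over optimization settings can be sketched as below. Here `evaluate` is a placeholder cost function invented for the example; the real framework would instead compile the program with MILEPOST GCC under the candidate flags, run it, and record execution time and code size:

```python
# Hedged sketch of a CCC-style random search: try random flag
# combinations and keep the one with the lowest cost.
import random

FLAGS = ["-funroll-loops", "-finline-functions",
         "-ftree-vectorize", "-fomit-frame-pointer"]

def evaluate(flags):
    # Placeholder cost model, deterministic per flag set.
    # In practice: compile with these flags, run, and time the binary.
    r = random.Random(hash(tuple(sorted(flags))) % 2**32)
    return r.uniform(0.5, 2.0)

def random_search(trials=20):
    rng = random.Random(0)               # reproducible candidate sampling
    best_flags, best_time = [], evaluate([])  # baseline: no extra flags
    for _ in range(trials):
        candidate = [f for f in FLAGS if rng.random() < 0.5]
        t = evaluate(candidate)
        if t < best_time:
            best_flags, best_time = candidate, t
    return best_flags, best_time
```

Each evaluated (features, flags, metrics) triple would be stored in the database as one training example for the learning step.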

Deployment: Once sufficient training data has been gathered, a machine-learning model is created. This model predicts good optimization strategies for a given set of program features, and is built as a plugin so that it can be re-inserted into MILEPOST GCC. On encountering a new program, the plugin determines the program's features and passes them to the model, which determines the optimizations to apply.
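A minimal stand-in for such a model is a nearest-neighbour predictor: given a new program's feature vector, reuse the flag set that worked best on the most similar training program. The feature names and flag choices below are hypothetical, not the actual model used by MILEPOST:

```python
# Illustrative 1-nearest-neighbour "model": map static program
# features to a previously good flag combination.
import math

# Toy training data: (features, best-known flags) pairs.
training = [
    ({"num_bb": 40, "num_insns": 300}, ["-O3", "-funroll-loops"]),
    ({"num_bb": 5,  "num_insns": 50},  ["-Os"]),
]

def distance(a, b):
    # Euclidean distance over the union of feature keys.
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0) - b.get(k, 0)) ** 2 for k in keys))

def predict_flags(features):
    # Return the flags of the closest training program.
    _, flags = min(training, key=lambda rec: distance(rec[0], features))
    return flags
```

Real deployments typically use richer feature vectors and stronger models, but the interface is the same: features in, a predicted combination of optimizations out.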

Original MILEPOST partners:

[Partner logos: INRIA, EU, IBM, CAPS, ARC]
