From cTuning.org
Line 12:
For the first time, Grigori utilized his [[CTools:ICI|Interactive Compilation Interface]] for [http://www.pathscale.com PathScale compiler] with loop vectorization, tiling, unrolling, interchange, fission/fusion, pipelining, prefetching and array padding to make static self-tuning binaries that can automatically learn from the past experience and adapt/react to various environments, run-time behavior and contentions that is important to improve efficiency and cost of both embedded systems and HPC data centers (cloud computing).
- | This technique opened up many research possibilities, has been used in multiple research projects in collaboration with UPC, ICT, IBM, CAPS Enterprise, STMicro, has been supported by [http://cTuning.org/project-milepost MILEPOST], [http://www.hipeac.net HiPEAC] and [http://code.google.com/soc Google Summer of Code] grants, has been referenced in patents and has been extended to speed up iterative compilation ({{Ref|FCOP2005}}, {{Ref|FCOP2006}}), enable transparent continuous collective optimization ({{Ref|FT2009}},{{Ref|FMPP2007}}), enable portable program characterization techniques based on reactions to optimizations ({{Ref|FT2009}}), enable predictive scheduling for heterogeneous multicore systems ({{Ref|JGVP2009}}), enable adaptive libraries based on dataset characterization using machine learning and decision trees ({{Ref|LCWP2009}}) among many other usages based on continuous transparent run-time program optimization and adaptation as a reaction to dynamic changes in program behavior and environment. Since 2007 it is being actively extended by [http://research.google.com Google Inc.] for data centers. 
| + | This technique opened up many research possibilities, has been used in multiple research projects in collaboration with UPC, ICT, IBM, CAPS Enterprise, STMicro, has been supported by [http://cTuning.org/project-milepost MILEPOST], [http://www.hipeac.net HiPEAC] and [http://code.google.com/soc Google Summer of Code] grants, has been referenced in patents and has been extended to speed up iterative compilation ({{Ref|FCOP2005}}, {{Ref|FCOP2006}}), enable transparent continuous collective optimization ({{Ref|FT2010}}, {{Ref|FT2009}},{{Ref|FMPP2007}}), enable portable program characterization techniques based on reactions to optimizations ({{Ref|FT2010}}, {{Ref|FT2009}}), enable predictive scheduling for heterogeneous multicore systems ({{Ref|JGVP2009}}), enable adaptive libraries based on dataset characterization using machine learning and decision trees ({{Ref|LCWP2009}}) among many other usages based on continuous transparent run-time program optimization and adaptation as a reaction to dynamic changes in program behavior and environment. Since 2007 it is being actively extended by [http://research.google.com Google Inc.] for data centers. |
We are gradually working to move this framework to mainline GCC combined with [[CTools:ICI|ICI]] (though source-to-source adaptation framework can still be useful). We are working to provide a unified view of heterogeneous architectures and optimizations with a high-level abstraction layer (architectures, compilers, run-time systems) to automate and simplify program development and optimization for heterogeneous multi-core systems. We would like to use this framework to automatically detect contentions in computing systems and react to such changes. We also hope to provide a unified view of heterogeneous architectures (CPU/GPU, CELL-like, FPGA, accelerators), optimizations and data movement/partitioning with a high-level abstraction layer (architectures, compilers, run-time systems) to automate and simplify program development and optimization for heterogeneous multi-core systems.
Line 84:
*'''2009.September.1''' - The documentation of [http://ctuning.org/milepost-gcc MILEPOST GCC]/[http://ctuning.org/ici GCC ICI] extensions by Yuanjie and Liang during GSOC'09 program is now available: [http://ctuning.org/wiki/index.php/CTools:ICI:Projects:GSOC09:Function_cloning_and_program_instrumentation Function cloning and program instrumentation] and [http://ctuning.org/wiki/index.php/CTools:ICI:Projects:GSOC09:Fine_grain_tuning Fine grain program tuning]. We would like to fully test and sync these developments with mainline GCC within next month or two.
- | *'''2009.August.05''' - The colleagues from the [http://unidapt.org UNIDAPT Group] started investigating the use of [http://ctuning.org cTuning]/[http://cTuning.org/project-milepost MILEPOST] technology and the [http://ctuning.org/unidapt UNIDAPT framework] to predict good optimization and parallelization schemes for hybrid heterogeneous CPU/GPU-like architectures together with [http://www.caps-entreprise.com CAPS Entreprise] based on run-time adaptation and profiling, empirical iterative compilation, statistical analysis, machine learning, program and dataset features and run-time decision trees ({{Ref|FT2009}}, {{Ref|LCWP2009}}, {{Ref|Fur2009}}, {{Ref|JGVP2009}}, {{Ref1|TWFP2009}}, {{Ref|FMTP2008}}, {{Ref|LFF2007}}, {{Ref|FCOP2005}}). They plan to add new optimization cases to the [http://ctuning.org/cdatabase Collective Optimization Database] in Autumn, 2009. | + | *'''2009.August.05''' - The colleagues from the [http://unidapt.org UNIDAPT Group] started investigating the use of [http://ctuning.org cTuning]/[http://cTuning.org/project-milepost MILEPOST] technology and the [http://ctuning.org/unidapt UNIDAPT framework] to predict good optimization and parallelization schemes for hybrid heterogeneous CPU/GPU-like architectures together with [http://www.caps-entreprise.com CAPS Entreprise] based on run-time adaptation and profiling, empirical iterative compilation, statistical analysis, machine learning, program and dataset features and run-time decision trees ({{Ref|FT2010}}, {{Ref|FT2009}}, {{Ref|LCWP2009}}, {{Ref|Fur2009}}, {{Ref|JGVP2009}}, {{Ref1|TWFP2009}}, {{Ref|FMTP2008}}, {{Ref|LFF2007}}, {{Ref|FCOP2005}}). They plan to add new optimization cases to the [http://ctuning.org/cdatabase Collective Optimization Database] in Autumn, 2009. |
*'''2009.July.27''' - The paper "Portable Compiler Optimization Across Embedded Programs and Microarchitectures using Machine Learning" ({{Ref|DJBP2009}}) has been accepted for the [http://www.microarch.org/micro42 42nd IEEE/ACM International Symposium on Microarchitecture (MICRO)]. The research has been led by the colleagues from the University of Edinburgh - congratulations!
Line 98:
*'''2009.June.26''' - The pdf of the paper that describes Collective Tuning Infrastructure and cTuning concept (presented at the GCC Summit'09) will be available in a few weeks [http://unidapt.org/index.php/Dissemination#Fur2009 here].
- | *'''2009.June.10''' - Extended version of the "Collective Optimization" paper ({{Ref|FT2009|}}) describing collective tuning concept has been accepted for ACM Transactions on Architecture and Code Optimization (TACO). | + | *'''2009.June.10''' - Extended version of the "Collective Optimization" paper ({{Ref|FT2010}}, {{Ref|FT2009|}}) describing collective tuning concept has been accepted for ACM Transactions on Architecture and Code Optimization (TACO). |
* '''2009.June.01''' - After nearly 1 year of developments we released/updated all our open-source collaborative [[CTools|R&D tools]]:
Revision as of 00:06, 31 December 2010
Universal Adaptation Framework
Statically enabling run-time optimization and adaptation
Web shortcut: http://cTuning.org/unidapt

The UNIDAPT concept was developed during 2004-2006 by Grigori Fursin in collaboration with Olivier Temam to statically enable run-time optimizations and self-tuning binaries. It works by cloning program hot spots, applying various aggressive optimizations to the clones for different optimization cases (that may improve performance, power consumption, fault tolerance, etc.), statically integrating low-overhead program/system behaviour monitoring routines (using hardware counters), and selecting appropriate versions at run-time as a reaction to different program behavior, architectural changes or contentions.

For the first time, Grigori utilized his Interactive Compilation Interface for the PathScale compiler with loop vectorization, tiling, unrolling, interchange, fission/fusion, pipelining, prefetching and array padding to make static self-tuning binaries that can automatically learn from past experience and adapt/react to various environments, run-time behavior and contentions, which is important for improving the efficiency and cost of both embedded systems and HPC data centers (cloud computing).
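The clone-monitor-select idea described above can be illustrated with a small sketch. This is not the UNIDAPT implementation, which works on statically compiled binaries and hardware counters; it is a language-neutral illustration written in Python under several assumptions: the class name `AdaptiveHotSpot`, the two hand-written clones, the wall-clock timer standing in for hardware counters, and the `resample_every` parameter are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Sequence

Kernel = Callable[[Sequence[float]], float]


def sum_baseline(data: Sequence[float]) -> float:
    """Original version of the hot spot: a plain reduction loop."""
    total = 0.0
    for x in data:
        total += x
    return total


def sum_unrolled(data: Sequence[float]) -> float:
    """Clone of the hot spot with the loop unrolled by a factor of 4."""
    total = 0.0
    n = len(data)
    limit = n - n % 4
    for i in range(0, limit, 4):
        total += data[i] + data[i + 1] + data[i + 2] + data[i + 3]
    for i in range(limit, n):  # remainder iterations
        total += data[i]
    return total


class AdaptiveHotSpot:
    """Hypothetical dispatcher: times each clone once, then reuses the cheapest.

    The wall-clock timer is an assumption made for portability; the text
    describes monitoring routines based on hardware counters instead.
    """

    def __init__(self, clones: Dict[str, Kernel],
                 clock: Callable[[], float] = time.perf_counter,
                 resample_every: int = 100) -> None:
        self.clones = clones
        self.clock = clock
        self.resample_every = resample_every
        self.costs: Dict[str, float] = {}
        self.selected: Optional[str] = None
        self.calls = 0

    def __call__(self, data: Sequence[float]) -> float:
        # Periodically forget old measurements so the choice can react
        # to changes in program behavior or environment.
        if self.calls % self.resample_every == 0:
            self.costs.clear()
            self.selected = None
        self.calls += 1

        untried = [name for name in self.clones if name not in self.costs]
        if untried:
            # Sampling phase: run and measure one clone that has no cost yet.
            name = untried[0]
            start = self.clock()
            result = self.clones[name](data)
            self.costs[name] = self.clock() - start
            if len(self.costs) == len(self.clones):
                self.selected = min(self.costs, key=lambda k: self.costs[k])
            return result

        # Stable phase: call the version that was cheapest when sampled.
        assert self.selected is not None
        return self.clones[self.selected](data)


if __name__ == "__main__":
    hot_spot = AdaptiveHotSpot({"baseline": sum_baseline,
                                "unrolled": sum_unrolled})
    values = [float(i) for i in range(100_000)]
    for _ in range(5):
        hot_spot(values)
    print("selected version:", hot_spot.selected)
```

Every clone must compute the same result, so callers can treat `hot_spot` as a drop-in replacement for the original routine. Which version gets selected depends on the machine and the input, so the sketch does not state an expected winner.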
This technique opened up many research possibilities and has been used in multiple research projects in collaboration with UPC, ICT, IBM, CAPS Entreprise and STMicro. It has been supported by MILEPOST, HiPEAC and Google Summer of Code grants and has been referenced in patents. It has been extended to:
* speed up iterative compilation (FCOP2005, FCOP2006);
* enable transparent continuous collective optimization (FT2010, FT2009, FMPP2007);
* enable portable program characterization techniques based on reactions to optimizations (FT2010, FT2009);
* enable predictive scheduling for heterogeneous multicore systems (JGVP2009);
* enable adaptive libraries based on dataset characterization using machine learning and decision trees (LCWP2009).
These are among many other usages based on continuous transparent run-time program optimization and adaptation as a reaction to dynamic changes in program behavior and environment. Since 2007 it has been actively extended by Google Inc. for data centers.

We are gradually working to move this framework to mainline GCC combined with ICI (though the source-to-source adaptation framework can still be useful). We are working to provide a unified view of heterogeneous architectures (CPU/GPU, CELL-like, FPGA, accelerators), optimizations and data movement/partitioning with a high-level abstraction layer (architectures, compilers, run-time systems) to automate and simplify program development and optimization for heterogeneous multi-core systems. We would also like to use this framework to automatically detect contentions in computing systems and react to such changes.

You are welcome to join the project, provide feedback and help with developments.
You are welcome to join us, participate in discussions and developments, or provide feedback and suggestions to extend the UNIDAPT Framework.