|
|
| (4 intermediate revisions not shown.) |
| Line 3: |
Line 3: |
| | {{CMenu:CTools|MilepostGCC}} | | {{CMenu:CTools|MilepostGCC}} |
| | | | |
| - | = MILEPOST 1.5 GCC 4.4.0 =
| + | * [[CTools:MilepostGCC:Documentation:MILEPOST_V2.1|MILEPOST GCC V2.1 (GCC 4.4.0, 4.4.1, 4.4.2, 4.4.3)]] |
| - | | + | * MILEPOST GCC V1.5 & V2.0 - unreleased, internal versions |
| - | === License ===
| + | * [[CTools:MilepostGCC:Documentation:MILEPOST_V1.0|MILEPOST GCC V1.0 (GCC 4.4.0)]] |
| - | | + | |
| - | This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 3 as published by the Free Software Foundation.
| + | |
| - | | + | |
| - | This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the [http://www.gnu.org/copyleft/gpl.html GNU General Public License for more details].
| + | |
| - | | + | |
| - | If you find this software useful, you are welcome to reference the http://cTuning.org website and these publications {{Ref|FMTP2008}},{{Ref|Fur2009}},{{Ref|FT2009}} in your derivative works.
| + | |
| - | | + | |
| - | === Authors ===
| + | |
| - | | + | |
| - | * [http://fursin.net/research Grigori Fursin] (INRIA, France) - original design of the MILEPOST/ICI/cTuning framework
| + | |
| - | * Mircea Namolaru (IBM Research Lab, Israel) - feature extractor pass
| + | |
| - | | + | |
| - | === Framework high-level overview ===
| + | |
| - | | + | |
| - | <div align="left">http://ctuning.org/wiki/images/img-milepost-gcc-structure1.gif</div>
| + | |
| - | | + | |
| - | === History ===
| + | |
| - | MILEPOST GCC V1.5 (4.4.0) - TBA - Fully updated compiler that includes
| + | |
| - | parts of CCC framework and can communicate
| + | |
| - | with cTuning web-services to predict good optimization
| + | |
| - | cases to improve execution time, code size and compilation time
| + | |
| - | using correlation between program features and optimizations.
| + | |
| - | | + | |
| - | MILEPOST GCC 4.4.0 - 20090629 - New official version of MILEPOST GCC with new ICI v2.0
| + | |
| - | and updated static feature extractor.
| + | |
| - | | + | |
| - | MILEPOST GCC 4.2.2 - 20080613 - Stable MILEPOST GCC version used in most MILEPOST Year 3 experiments.
| + | |
| - | | + | |
| - | === Requirements ===
| + | |
| - | | + | |
| - | In order to install MILEPOST GCC, you will need:
| + | |
| - | | + | |
| - | * C compiler that can compile [http://gcc.gnu.org GCC 4.x].
| + | |
| - | * uuid or uuidgen tool to generate unique identifiers.
| + | |
| - | * [http://www.php.net PHP] ''(needed to communicate with cTuning web-services)''.
| + | |
| - | | + | |
| - | === Directory structure ===
| + | |
| - | | + | |
| - | gcc-4.4.0 - MILEPOST GCC 4.4.0 source directory (core + gfortran)
| + | |
| - | | + | |
| - | ccc-framework - MILEPOST V1.5 wrapper and necessary tools to communicate
| + | |
| - | with cTuning web-services (standard part of CCC framework)
| + | |
| - | | + | |
| - | src-third-party - Third party support tools
| + | |
| - | |
| + | |
| - | +-- gmp-4.3.0 - GMP library
| + | |
| - | +-- mpfr-2.4.1 - MPFR library
| + | |
| - | +-- ppl-0.10.2 - PPL library (for GRAPHITE)
| + | |
| - | +-- cloog - CLOOG library (for GRAPHITE)
| + | |
| - | +-- XSB - Prolog for machine learning tools (MILEPOST, UNIDAPT, cTuning)
| + | |
| - | | + | |
| - | plugins-ici-2.0 - Plugins for GCC 4.4.0 ICI 2.0 (see README inside this directory)
| + | |
| - | | + | |
| - | demo - Demo files for MILEPOST GCC V1.5
| + | |
| - | | + | |
| - | install - Directory with installed binaries
| + | |
| - | | + | |
| - | === Installation ===
| + | |
| - | | + | |
| - | First, check that all scripts use the same BUILD_EXT variable,
| + |
| - | which points to the install directory! You may use different names
| + |
| - | if you install MILEPOST GCC for several architectures on a shared
| + |
| - | file system.
| + | |
| - | | + | |
| - | Invoke:
| + | |
| - | ./_build_gcc.sh to build GCC with all the third-party tools.
| + | |
| - | ./_build_ccc.sh to build CCC framework with MILEPOST GCC wrapper.
| + | |
| - | | + | |
| - | ./_build_plugins.sh will build all non-machine learning plugins.
| + | |
| - | ./_build_plugins_ml.sh will build all machine learning plugins.
| + | |
| - | | + | |
| - | === General configuration ===
| + | |
| - | | + | |
| - | Check ./_set_environment_for_milepost_gcc.sh - normally all environment
| + | |
| - | variables should be already properly set (check variable CCC_UUID -
| + | |
| - | the uuid tool). You have to source this file before using MILEPOST GCC.
| + | |
| - | | + | |
| - | File ./_set_environment_for_milepost_gcc.sh sets up environment
| + | |
| - | variables for low-level ICI tests and should also be already properly
| + | |
| - | set. If you plan to use only high-level MILEPOST GCC, you can skip it.
| + | |
| - | | + | |
| - | === Configuration for demos ===
| + | |
| - | | + | |
| - | * The bitcount benchmark in the demo directory (/demo/bitcount) shows how to use MILEPOST GCC. | + |
| - | | + | |
| - | First, configure the user-dependent environment variables
| + |
| - | in ___common_environment.sh:
| + | |
| - | | + | |
| - | CCC_CTS_USER and CCC_CTS_PASS should be set to the username and password you chose when
| + | |
| - | self-registering at http://ctuning.org/wiki/index.php/Special:UserLogin
| + | |
| - | | + | |
| - | NOW YOU CAN TEST MILEPOST GCC wrapper and communication with the cTuning database
| + | |
| - | by invoking __test_milepost_gcc.sh. If everything is installed correctly, you
| + | |
| - | should get a response from the cTuning web-service: "Test passed successfully".
| + | |
| - | | + | |
| - | To continue using MILEPOST GCC, you can check the following variables.
| + |
| - | Note that they already have default values, so you do not have to change them
| + |
| - | unless you want to tune MILEPOST GCC:
| + | |
| - | | + | |
| - | CCC_CTS_URL=cTuning.org/wiki/index.php/Special:CDatabase?request=
| + | |
| - | - points to the cTuning web-service.
| + | |
| - | | + | |
| - | CCC_CTS_DB=cod_opt_cases - points to the database with optimization cases
| + | |
| - | from the community.
| + | |
| - | | + | |
| - | ICI_PLUGIN_VERBOSE=1 - if set to 1, enables additional diagnostic information from ICI plugins.
| + |
| - | ICI_VERBOSE=1 - if set to 1, enables additional diagnostic information from ICI.
| + | |
| - | | + | |
| - | | + | |
| - | ICI_PROG_FEAT_PASS=fre - sets pass after which to extract static program features.
| + | |
| - | | + | |
| - | CCC_COMPILER_FEATURES_ID=129504539516446542 - sets compiler ID which was used
| + | |
| - | to extract static program features for all programs
| + | |
| - | at cTuning.org. Do not change it unless you really
| + | |
| - | understand what you are doing ;) !..
| + | |
| - | | + | |
| - | CCC_OPTS="-O3" - sets combination of flags to be used if cTuning prediction web-service
| + | |
| - | did not return optimization flags.
| + | |
| - | | + | |
| - | CCC_OPT_ARCH_USE=1 - if set to 1, MILEPOST GCC will also use architecture-dependent flags
| + | |
| - | (such as -march=athlon64) from cTuning.org. If set to 0, architecture
| + | |
| - | dependent flags will be ignored.
| + | |
| - | | + | |
| - | TIME_THRESHOLD=0.3 - when calculating speedups at cTuning.org, only optimization cases
| + | |
| - | with EXECUTION TIME more than this threshold are considered.
| + | |
| - | | + | |
| - | NOTES= - when not empty (<>""), only those optimization cases are returned that have this NOTES value.
| + | |
| - | | + | |
| - | PG_USE=0 - if set to 1, only those optimization cases are returned that have function and other
| + | |
| - | level profiling. If unset or set to 0, use only those cases that do not have profiling
| + | |
| - | to avoid speedup skewing due to profiling.
| + | |
| - | | + | |
| - | OUTPUT_CORRECT=1 - if set to 1, only those optimization cases are returned that have been
| + | |
| - | checked for correctness by comparing benchmark outputs for the original
| + | |
| - | and transformed program (note that it still does not guarantee that
| + | |
| - | the combination of optimizations is correct, but it helps to reduce
| + | |
| - | obviously wrong cases).
| + | |
| - | | + | |
| - | RUN_TIME=RUN_TIME - sets which execution time to use when calculating speedups
| + | |
| - | (RUN_TIME - overall program execution time,
| + | |
| - | while RUN_TIME USER - only user execution time)
| + | |
| - | | + | |
| - | SORT=012 - when predicting optimizations, the best combinations of optimizations
| + | |
| - | are selected from the most similar program. Naturally, that program
| + | |
| - | can have flags that improve not only execution time, but also code
| + | |
| - | size and compilation time among other parameters. Hence a user can
| + | |
| - | suggest an order of sorting speedups by:
| + | |
| - | 0 - execution time
| + | |
| - | 1 - code size,
| + | |
| - | 2 - compilation time
| + | |
| - | before returning the top optimization. For example, when setting this variable to
| + | |
| - | 012 - cTuning returns the optimization case with the highest execution time speedup
| + | |
| - | and only then sorts them by code size improvement and compilation time speedup;
| + | |
| - | 102 - cTuning returns the optimization case with the highest code size improvement,
| + | |
| - | then execution time speedup and then compilation time;
| + | |
| - | 201 - cTuning returns the optimization case with the highest compilation time speedup,
| + | |
| - | then execution time speedup and only then code size.
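The SORT priority described above can be pictured as a lexicographic sort. The following is only a minimal sketch under stated assumptions: the helper `sort_cases` is hypothetical and is not the cTuning implementation, the tuple layout is chosen for illustration, and the four rows are copied from the sample optimization report shown later on this page.

```python
# Hypothetical helper illustrating the SORT variable; not the cTuning code.
# Each case: (execution-time speedup, code-size improvement,
#             compilation-time speedup, cTuning optimization case ID)
CASES = [
    (1.18, 0.80, 1.00, "15423655473087225"),
    (1.07, 1.02, 1.00, "3258730975700728"),
    (1.24, 0.71, 1.00, "4699569679776380"),
    (1.26, 0.68, 0.83, "15492934568598271"),
]


def sort_cases(cases, sort="012"):
    """Order cases best-first, comparing metrics in the priority given by `sort`.

    Each digit of `sort` is a metric index:
    0 = execution time, 1 = code size, 2 = compilation time.
    """
    order = [int(ch) for ch in sort]
    return sorted(cases, key=lambda c: tuple(c[i] for i in order), reverse=True)


for setting in ("012", "102", "201"):
    best = sort_cases(CASES, setting)[0]
    print(setting, "->", best[3])
```

With these four rows, each SORT setting picks a different top case, which is why the order matters in multi-objective scenarios. How the real web-service breaks ties or averages speedups is not specified here.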
| + | |
| - | | + | |
| - | CT_OPT_REPORT=1 - when set to 1, cTuning returns all optimization cases sorted according to SORT
| + | |
| - | environment variable together with the associated optimization ID so that the user
| + |
| - | can later force a different optimization case, particularly in multi-objective
| + |
| - | optimization scenarios.
| + | |
| - | | + | |
| - | Here is an example of such output:
| + | |
| - | | + | |
| - | ****************************************************************************
| + | |
| - | MILEPOST GCC V1.5 (wrapper for GCC to communicate with cTuning web services)
| + | |
| - | <BR>
| + | |
| - | Invoking collective tuning and machine learning mode ...
| + | |
| - | <BR>
| + | |
| - | Extracting program static features (-O1) ...
| + | |
| - | <BR>
| + | |
| - | Aggregating features ...
| + | |
| - | <BR>
| + | |
| - | Static program features:
| + | |
| - | ft1=9, ft2=2, ft3=1, ft4=0, ft5=4, ft6=1, ft7=0, ft8=2, ft9=1, ft10=0, ft11=0,
| + | |
| - | ft12=0, ft13=5, ft14=0, ft15=0, ft16=8, ft17=0, ft18=0, ft24=27, ft25=13.50,
| + | |
| - | ft19=0, ft39=0, ft20=1, ft21=0, ft33=0, ft21=24, ft35=2, ft22=11, ft23=0, ft34=6,
| + | |
| - | ft36=3, ft37=0, ft38=0, ft40=0, ft41=8, ft42=0, ft43=0, ft44=0, ft45=0, ft46=1,
| + | |
| - | ft48=3, ft47=9, ft49=0, ft51=0, ft50=55, ft52=21, ft53=0, ft54=2, ft55=0, ft26=0,
| + | |
| - | ft27=0, ft28=0, ft29=0, ft30=5, ft31=0, ft32=0
| + | |
| - | <BR>
| + | |
| - | Submitting features to the cTuning web-service to predict good optimizations ...
| + | |
| - | <BR>
| + | |
| - | cTuning Optimization Report (optimal optimization cases):
| + | |
| - | <BR>
| + | |
| - | Distance from most close program (462.libquantum) = 0.639
| + | |
| - | <BR>
| + | |
| - | Selected opt. case = 23011215880571251
| + | |
| - | <BR>
| + | |
| - | Optimal cases on frontier (averaged speedups):
| + | |
| - | Ex.time: Code size: Comp. time: cTuning opt. case:
| + | |
| - | <BR>
| + | |
| - | 1.18 0.80 1.00 15423655473087225
| + | |
| - | 1.21 0.80 0.80 29686176401405
| + | |
| - | 1.25 0.70 0.80 4614589283098526
| + | |
| - | 1.29 0.67 0.80 23011215880571251
| + | |
| - | 1.25 0.70 0.80 15721270875126789
| + | |
| - | 1.26 0.69 0.80 15128754576807000
| + | |
| - | 1.29 0.67 1.00 19230939973657069
| + | |
| - | 1.07 1.02 1.00 3258730975700728
| + | |
| - | 1.21 0.80 1.00 23810155474721838
| + | |
| - | 1.24 0.71 1.00 4699569679776380
| + | |
| - | 1.26 0.68 0.83 15492934568598271
| + | |
| - | <BR>
| + | |
| - | Predicted flags:
| + | |
| - | -O2 -fdelete-null-pointer-checks -fno-tree-pre -funroll-all-loops
| + | |
| - | <BR>
| + | |
| - | Invoking command:
| + | |
| - | gcc -O2 -fdelete-null-pointer-checks -fno-tree-pre -funroll-all-loops
| + | |
| - | bitarray.c bitcnt_1.c bitcnt_2.c bitcnt_3.c bitcnt_4.c
| + | |
| - | bitcnts.c bitfiles.c bitstrng.c bstr_i.c loop-wrap.c
| + | |
| - | ****************************************************************************
| + | |
| - | | + | |
| - | Multi-objective optimizations:
| + | |
| - | When there are many optimization cases that improve execution time, code size
| + |
| - | and compilation time at the same time, the selection of an optimal optimization case depends on end-user
| + | |
| - | usage scenarios: improving both execution time and code size is often required for embedded applications,
| + | |
| - | improving both compilation and execution time is important for data centers and real-time systems,
| + | |
| - | while improving only execution time is common for desktops and supercomputers. Hence, we provided several
| + | |
| - | other environment variables to select optimization cases on the frontier of the optimization space:
| + | |
| - | | + | |
| - | DIM=012 - returns optimization cases only on the frontier of all optimization cases.
| + | |
| - | For example DIM=01 produces 2D frontier for execution time speedup and code size improvement,
| + | |
| - | DIM=02 produces 2D frontier for execution time and compilation time speedups,
| + | |
| - | DIM=12 produces 2D frontier for code size improvement and compilation time speedup,
| + | |
| - | DIM=012 produces 3D frontier for all constraints.
| + | |
| - | | + | |
| - | CUT=0,0,0 - cuts optimization cases frontier on each dimension, i.e. if CUT=0,0,1.2
| + | |
| - | the frontier optimization cases should have compilation time speedup > 1.2,
| + | |
| - | if CUT=1,1,1, all optimization cases on frontier should have execution time
| + | |
| - | speedup > 1, code size improvement > 1 and compilation time speedup > 1.
| + | |
| - | | + | |
| - | When using this mode with DIM=012 and CUT=1,1,1, only one optimization case will be returned
| + | |
| - | (when using CT_OPT_REPORT=1):<BR><BR> 1.07 1.02 1.00 3258730975700728<BR><BR>
| + | |
| - | Note that you have to select such cases manually, because MILEPOST GCC will still use
| + |
| - | the top optimization case found before building the frontier, since the choice on the frontier depends on
| + |
| - | the user scenario.
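The DIM and CUT selection can be sketched as a Pareto-frontier filter followed by a per-dimension threshold. This is an illustration under assumptions, not the cTuning implementation: the helper names `dominates` and `frontier` are hypothetical, higher values are treated as better on every dimension, and the cut is applied inclusively (`>=`) because the single row quoted above has a compilation time speedup of exactly 1.00. The rows are the ones listed in the sample report.

```python
# Hypothetical sketch of DIM/CUT selection; not the cTuning implementation.
# Each case: (execution-time speedup, code-size improvement,
#             compilation-time speedup, cTuning optimization case ID)
CASES = [
    (1.18, 0.80, 1.00, "15423655473087225"),
    (1.21, 0.80, 0.80, "29686176401405"),
    (1.25, 0.70, 0.80, "4614589283098526"),
    (1.29, 0.67, 0.80, "23011215880571251"),
    (1.25, 0.70, 0.80, "15721270875126789"),
    (1.26, 0.69, 0.80, "15128754576807000"),
    (1.29, 0.67, 1.00, "19230939973657069"),
    (1.07, 1.02, 1.00, "3258730975700728"),
    (1.21, 0.80, 1.00, "23810155474721838"),
    (1.24, 0.71, 1.00, "4699569679776380"),
    (1.26, 0.68, 0.83, "15492934568598271"),
]


def dominates(p, q, dims):
    """True if p is at least as good as q on all dims and better on one."""
    return all(p[d] >= q[d] for d in dims) and any(p[d] > q[d] for d in dims)


def frontier(cases, dim="012", cut=(0, 0, 0)):
    """Keep non-dominated cases on the dimensions in `dim`, then apply `cut`."""
    dims = [int(ch) for ch in dim]
    front = [c for c in cases if not any(dominates(o, c, dims) for o in cases)]
    # Assumption: inclusive threshold, applied only to the selected dimensions.
    return [c for c in front if all(c[d] >= cut[d] for d in dims)]


for case in frontier(CASES, dim="012", cut=(1, 1, 1)):
    print(*case)
```

Under these assumptions, DIM=012 with CUT=1,1,1 leaves only the case 3258730975700728, which matches the row quoted above. The real service may order, average or round the speedups differently.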
| + | |
| - | | + | |
| - | The following IDs are very important for finding optimization cases from similar programs
| + |
| - | for a given architecture (you can find the architecture most similar to yours,
| + |
| - | together with optimization cases, at http://cTuning.org/cdatabase):
| + | |
| - | | + | |
| - | CCC_PLATFORM_ID=2111574609159278179 (example for AMD Athlon 64 3700+)
| + | |
| - | CCC_ENVIRONMENT_ID=2781195477254972989 (example for Linux Mandriva 2.6.17-10alchemy)
| + | |
| - | CCC_COMPILER_ID=331350613878705696 (example for GCC 4.4.0)
| + | |
| - | | + | |
| - | When compiling large applications, feature extraction can take a very long time
| + | |
| - | (and this is part of the future work to speed it up), so a user may want to
| + | |
| - | extract features only of a few functions. In this case, a user should add
| + | |
| - | the file _ctuning_select_functions.txt to the compilation directory where
| + | |
| - | only those functions should be listed that need to be processed
| + | |
| - | (one function per line).
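As a small illustration of the file format described above, the selection file can be generated with a few lines of Python. This is only a sketch: the helper `write_function_selection` and the function names `main` and `bit_count` are hypothetical placeholders; only the file name and the one-function-per-line layout come from the text.

```python
import tempfile
from pathlib import Path


def write_function_selection(directory, functions):
    """Write one function name per line to _ctuning_select_functions.txt."""
    path = Path(directory) / "_ctuning_select_functions.txt"
    path.write_text("".join(f"{name}\n" for name in functions))
    return path


# Demonstration in a temporary directory standing in for the compilation directory.
with tempfile.TemporaryDirectory() as build_dir:
    # Hypothetical function names; list the functions you want processed.
    selection = write_function_selection(build_dir, ["main", "bit_count"])
    print(selection.read_text(), end="")
```

In practice the file would be written into the directory where the compiler is invoked, before starting the build.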
| + | |
| - | | + | |
| - | * If you want to test low-level plugins, you can find self-explanatory tests in plugins-ici-2.0/tests directory.
| + | |
| - | | + | |
| - | === Usage ===
| + | |
| - | | + | |
| - | * MILEPOST GCC / cTuning web-services test:
| + | |
| - | | + | |
| - | milepost-gcc --ct-test *.c
| + | |
| - | | + | |
| - | You can also use test script ./__test_ctuning_web_service_for_milepost_gcc
| + | |
| - | | + | |
| - | * Using optimization cases directly from the Collective Optimization Database (referenced by unique ID) - it is useful for multi-objective optimization, to share optimization cases within the community or when publishing papers and results on program optimization:
| + | |
| - | | + | |
| - | milepost-gcc --ct-opt=11475790782770590 *.c
| + | |
| - | | + | |
| - | You can also use demo script ./__compile_using_milepost_gcc_with_fixed_optimization to understand how to configure your own system.
| + | |
| - | | + | |
| - | * Predict good optimizations (execution time, code size, compilation time) based on correlation of program features and optimizations using collective optimization knowledge (empirical iterative feedback-directed compilation performed by multiple users and shared in the Collective Optimization Database):
| + | |
| - | | + | |
| - | milepost-gcc -Oml *.c
| + | |
| - | | + | |
| - | You can also use demo script ./__compile_using_milepost_gcc_with_prediction_optimization
| + | |
| - | to understand how to configure your own system.
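The three invocation modes above differ only in one flag, so a build script can assemble them in one place. The sketch below is hypothetical: the helper `milepost_command` and its mode names are made up for illustration, and it only builds the argument list (it does not run the compiler). The flags `--ct-test`, `--ct-opt=` and `-Oml` are the ones listed in this section.

```python
import glob


def milepost_command(mode, sources, opt_id=None):
    """Build the milepost-gcc argument list for one of the three usage modes."""
    if mode == "test":
        flag = "--ct-test"
    elif mode == "fixed":
        if opt_id is None:
            raise ValueError("mode 'fixed' needs an optimization case ID")
        flag = f"--ct-opt={opt_id}"
    elif mode == "predict":
        flag = "-Oml"
    else:
        raise ValueError(f"unknown mode: {mode}")
    return ["milepost-gcc", flag, *sources]


# Python does not expand *.c by itself, so the sources are globbed explicitly.
sources = sorted(glob.glob("*.c"))
print(" ".join(milepost_command("predict", sources)))
```

The resulting list can be passed to `subprocess.run` once the environment file has been sourced in the calling shell.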
| + | |
| - | | + | |