From cTuning.org

(Difference between revisions)
Current revision (22:46, 16 March 2010)
 
(7 intermediate revisions not shown.)
Line 3: Line 3:
{{CMenu:CTools|MilepostGCC}}
-
= MILEPOST 1.5 GCC 4.4.0 =
+
* [[CTools:MilepostGCC:Documentation:MILEPOST_V2.1|MILEPOST GCC V2.1 (GCC 4.4.0, 4.4.1, 4.4.2, 4.4.3)]]
-
 
+
* MILEPOST GCC V1.5 & V2.0 - unreleased, internal versions
-
=== License ===
+
* [[CTools:MilepostGCC:Documentation:MILEPOST_V1.0|MILEPOST GCC V1.0 (GCC 4.4.0)]]
-
 
+
-
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation.
+
-
 
+
-
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the [http://www.gnu.org/copyleft/gpl.html GNU General Public License for more details]. 
+
-
 
+
-
If you find this software useful, you are welcome to reference the http://cTuning.org website and these publications {{Ref|FMTP2008}},{{Ref|Fur2009}},{{Ref|FT2009}} in your derivative works.
+
-
 
+
-
=== Authors ===
+
-
 
+
-
* [http://fursin.net/research Grigori Fursin] (INRIA, France) - original design of the MILEPOST/ICI/cTuning framework
+
-
* Mircea Namolaru (IBM Research Lab, Israel) - feature extractor pass
+
-
 
+
-
=== Framework high-level overview ===
+
-
 
+
-
<div align="left">http://ctuning.org/wiki/images/img-milepost-gcc-structure1.gif</div>
+
-
 
+
-
=== History ===
+
-
MILEPOST GCC V1.5 (4.4.0) - TBA - Fully updated compiler that includes
+
-
                                        parts of CCC framework and can communicate
+
-
                                        with cTuning web-services to predict good optimization
+
-
                                        cases to improve execution time, code size and compilation time
+
-
                                        using correlation between program features and optimizations.
+
-
 
+
-
MILEPOST GCC 4.4.0 - 20090629 - New official version of MILEPOST GCC with new ICI v2.0
+
-
                                and updated static feature extractor.
+
-
 
+
-
MILEPOST GCC 4.2.2 - 20080613 - Stable MILEPOST GCC version used in most MILEPOST Year 3 experiments.
+
-
 
+
-
=== Requirements ===
+
-
 
+
-
In order to install MILEPOST GCC, you will need:
+
-
 
+
-
* C compiler that can compile [http://gcc.gnu.org GCC 4.x].
+
-
* uuid or uuidgen tool to generate unique identifiers.
+
-
* [http://www.php.net PHP] ''(needed to communicate with cTuning web-services)''.
+
-
 
+
-
=== Directory structure ===
+
-
 
+
-
gcc-4.4.0                      - MILEPOST GCC 4.4.0 source directory (core + gfortran)
+
-
 
+
-
ccc-framework                  - MILEPOST V1.5 wrapper and necessary tools to communicate
+
-
                                  with cTuning web-services (standard part of CCC framework)
+
-
 
+
-
src-third-party                - Third party support tools
+
-
  |   
+
-
  +-- gmp-4.3.0                  - GMP library
+
-
  +-- mpfr-2.4.1                - MPFR library
+
-
  +-- ppl-0.10.2                - PPL library (for GRAPHITE)
+
-
  +-- cloog                      - CLOOG library (for GRAPHITE)
+
-
  +-- XSB                        - Prolog for machine learning tools (MILEPOST, UNIDAPT, cTuning)
+
-
 
+
-
plugins-ici-2.0                - Plugins for GCC 4.4.0 ICI 2.0 (see README inside this directory)
+
-
 
+
-
demo                            - Demo files for MILEPOST GCC V1.5
+
-
 
+
-
install                        - Directory with installed binaries
+
-
 
+
-
=== Installation ===
+
-
 
+
-
First, check that all scripts set the same BUILD_EXT variable,
+
-
which points to the install directory. You may use different names
+
-
if you install MILEPOST GCC for several architectures on a shared
+
-
file system.
+
-
 
+
-
Invoke:
+
-
./build_gcc.sh to build GCC with all the third-party tools.
+
-
./build_ccc.sh to build CCC framework with MILEPOST GCC wrapper.
+
-
 
+
-
./build_plugins.sh will build all non-machine learning plugins.
+
-
./build_plugins_ml.sh will build all machine learning plugins.
+
-
 
+
-
=== General configuration ===
+
-
 
+
-
Check ./_set_environment_for_milepost_gcc.sh - normally all environment
+
-
variables should be already properly set (check variable CCC_UUID -
+
-
the uuid tool). You have to source this file before using MILEPOST GCC.
+
-
 
+
-
File ./_set_environment_for_milepost_gcc.sh sets up environment
+
-
variables for low-level ICI tests and should also be already properly
+
-
set. If you plan to use only high-level MILEPOST GCC, you can skip it.
+
-
 
+
-
=== Configuration for demos ===
+
-
 
+
-
* The bitcount benchmark in the demo directory (/demo/bitcount) shows how to use MILEPOST GCC.
+
-
 
+
-
    You first need to configure the user-dependent environment variables in
+
-
    ___common_environment.sh:
+
-
 
+
-
    CCC_CTS_USER and CCC_CTS_PASS should be set to the username and password you chose when
+
-
    self-registering at http://ctuning.org/wiki/index.php/Special:UserLogin
+
-
 
+
-
    NOW YOU CAN TEST MILEPOST GCC wrapper and communication with the cTuning database
+
-
    by invoking __test_milepost_gcc.sh. If everything is installed correctly, you
+
-
    should get a response from the cTuning web-service: "Test passed successfully".
+
-
 
+
-
    To continue using MILEPOST GCC, you can check the following variables.
+
-
    Note that they already have default values, so you do not have to change them
+
-
    unless you want to tune MILEPOST GCC:
+
-
 
+
-
    CCC_CTS_URL=cTuning.org/wiki/index.php/Special:CDatabase?request=
+
-
                - points to the cTuning web-service.
+
-
 
+
-
    CCC_CTS_DB=cod_opt_cases - points to the database with optimization cases
+
-
              from the community.
+
-
 
+
-
    ICI_PLUGIN_VERBOSE=1 - if set to 1, prints additional diagnostic information from ICI plugins.
+
-
    ICI_VERBOSE=1 - if set to 1, prints additional diagnostic information from ICI.
+
-
 
+
-
 
+
-
    ICI_PROG_FEAT_PASS=fre - sets pass after which to extract static program features.
+
-
 
+
-
    CCC_COMPILER_FEATURES_ID=129504539516446542 - sets compiler ID which was used
+
-
                            to extract static program features for all programs
+
-
                            at cTuning.org. Do not change it unless you really
+
-
                            understand what you are doing ;) !..
+
-
 
+
-
    CCC_OPTS="-O3" - sets combination of flags to be used if cTuning prediction web-service
+
-
                    did not return optimization flags.
+
-
 
+
-
    CCC_OPT_ARCH_USE=1 - if set to 1, MILEPOST GCC will also use architecture-dependent flags
+
-
                        (such as -march=athlon64) from cTuning.org. If set to 0, architecture
+
-
                        dependent flags will be ignored.
+
-
 
+
-
    TIME_THRESHOLD=0.3 - when calculating speedups at cTuning.org, only optimization cases
+
-
                        with EXECUTION TIME more than this threshold are considered.
+
-
 
+
-
    NOTES= - when not empty, only those optimization cases are returned that have this NOTES value.
+
-
 
+
-
    PG_USE=0 - if set to 1, only those optimization cases are returned that have function and other
+
-
              level profiling. If unset or set to 0, use only those cases that do not have profiling
+
-
              to avoid speedup skewing due to profiling.
+
-
 
+
-
    OUTPUT_CORRECT=1 - if set to 1, only those optimization cases are returned that have been
+
-
                      checked for correctness by comparing benchmark outputs for the original
+
-
                      and transformed program (note that it still does not guarantee that
+
-
                      the combination of optimizations is correct, but it helps to reduce
+
-
                      obvious wrong cases).
+
-
 
+
-
    RUN_TIME=RUN_TIME - sets which execution time to use when calculating speedups
+
-
                        (RUN_TIME - overall program execution time,
+
-
                        while RUN_TIME USER - only user execution time)
+
-
 
+
-
    SORT=012 - when predicting optimizations, the best combinations of optimizations
+
-
              are selected from the most similar program. Naturally, that program
+
-
              can have flags that improve not only execution time, but also code
+
-
              size and compilation time among other parameters. Hence a user can
+
-
              suggest an order of sorting speedups by:
+
-
                0 - execution time
+
-
                1 - code size,
+
-
                2 - compilation time
+
-
              before returning the top optimization. For example, when setting this variable to
+
-
              012 - cTuning returns the optimization case with the highest execution time speedup
+
-
              and only then sorts them by code size improvement and compilation time speedup;
+
-
              102 - cTuning returns the optimization case with the highest code size improvement,
+
-
              then execution time speedup and then compilation time;
+
-
              201 - cTuning returns the optimization case with the highest compilation time speedup,
+
-
              then execution time speedup and only then code size.
+
-
 
+
-
    CT_OPT_REPORT=1 - when set to 1, cTuning returns all optimization cases sorted according to SORT
+
-
                      environment variable together with the associated optimization ID so that user
+
-
                      could later force different optimization case, particularly when having multi-objective
+
-
                      optimization scenarios.
+
-
 
+
-
                      Here is an example of such output:
+
-
 
+
-
                        ****************************************************************************
+
-
                        MILEPOST GCC V1.5 (wrapper for GCC to communicate with cTuning web services)
+
-
                        <BR>
+
-
                        Invoking collective tuning and machine learning mode ...
+
-
                        <BR>
+
-
                        Extracting program static features (-O1) ...
+
-
                        <BR>
+
-
                        Aggregating features ...
+
-
                        <BR>
+
-
                        Static program features:
+
-
                        ft1=9, ft2=2, ft3=1, ft4=0, ft5=4, ft6=1, ft7=0, ft8=2, ft9=1, ft10=0, ft11=0,
+
-
                        ft12=0, ft13=5, ft14=0, ft15=0, ft16=8, ft17=0, ft18=0, ft24=27, ft25=13.50,
+
-
                        ft19=0, ft39=0, ft20=1, ft21=0, ft33=0, ft21=24, ft35=2, ft22=11, ft23=0, ft34=6,
+
-
                        ft36=3, ft37=0, ft38=0, ft40=0, ft41=8, ft42=0, ft43=0, ft44=0, ft45=0, ft46=1,
+
-
                        ft48=3, ft47=9, ft49=0, ft51=0, ft50=55, ft52=21, ft53=0, ft54=2, ft55=0, ft26=0,
+
-
                        ft27=0, ft28=0, ft29=0, ft30=5, ft31=0, ft32=0
+
-
                        <BR>
+
-
                        Submitting features to the cTuning web-service to predict good optimizations ...
+
-
                        <BR>
+
-
                        cTuning Optimization Report (optimal optimization cases):
+
-
                        <BR>
+
-
                        Distance from most close program (462.libquantum) = 0.639
+
-
                        <BR>
+
-
                        Selected opt. case = 23011215880571251
+
-
                        <BR>
+
-
                        Optimal cases on frontier (averaged speedups):
+
-
                        Ex.time:  Code size:  Comp. time:        cTuning opt. case:
+
-
                        <BR>
+
-
                            1.18        0.80          1.00        15423655473087225
+
-
                            1.21        0.80          0.80            29686176401405
+
-
                            1.25        0.70          0.80          4614589283098526
+
-
                            1.29        0.67          0.80        23011215880571251
+
-
                            1.25        0.70          0.80        15721270875126789
+
-
                            1.26        0.69          0.80        15128754576807000
+
-
                            1.29        0.67          1.00        19230939973657069
+
-
                            1.07        1.02          1.00          3258730975700728
+
-
                            1.21        0.80          1.00        23810155474721838
+
-
                            1.24        0.71          1.00          4699569679776380
+
-
                            1.26        0.68          0.83        15492934568598271
+
-
                        <BR>
+
-
                        Predicted flags:
+
-
                        -O2 -fdelete-null-pointer-checks -fno-tree-pre -funroll-all-loops
+
-
                        <BR>
+
-
                        Invoking command:
+
-
                        gcc -O2 -fdelete-null-pointer-checks -fno-tree-pre -funroll-all-loops 
+
-
                                bitarray.c bitcnt_1.c bitcnt_2.c bitcnt_3.c bitcnt_4.c
+
-
                                bitcnts.c bitfiles.c bitstrng.c bstr_i.c loop-wrap.c
+
-
                        ****************************************************************************
+
-
 
+
-
    Multi-objective optimizations:
+
-
    When many optimization cases improve execution time, code size and compilation time at the same time,
+
-
    the selection of an optimal optimization case depends on end-user
+
-
    usage scenarios: improving both execution time and code size is often required for embedded applications,
+
-
    improving both compilation and execution time is important for data centers and real-time systems,
+
-
    while improving only execution time is common for desktops and supercomputers. Hence, we provided several
+
-
    other environment variables to select optimization cases on the frontier of the optimization space:
+
-
 
+
-
    DIM=012 - returns optimization cases only on the frontier of all optimization cases.
+
-
              For example DIM=01 produces 2D frontier for execution time speedup and code size improvement,
+
-
              DIM=02 produces 2D frontier for execution time and compilation time speedups,
+
-
              DIM=12 produces 2D frontier for code size improvement and compilation time speedup,
+
-
              DIM=012 produces 3D frontier for all constraints.
+
-
 
+
-
    CUT=0,0,0 - cuts optimization cases frontier on each dimension, i.e. if CUT=0,0,1.2
+
-
                the frontier optimization cases should have compilation time speedup > 1.2,
+
-
                if CUT=1,1,1, all optimization cases on frontier should have execution time
+
-
                speedup > 1, code size improvement > 1 and compilation time speedup > 1.
+
-
 
+
-
    When using this mode with DIM=012 and CUT=1,1,1, only one optimization case will be returned
+
-
    (when using CT_OPT_REPORT=1):<BR><BR>                            1.07        1.02          1.00          3258730975700728<BR><BR>
+
-
    Note that you have to select such cases manually: MILEPOST GCC will still use
+
-
    the top optimization case found before the frontier is built, since the choice on the frontier depends on
+
-
    the user scenario.
+
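The DIM and CUT selection described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used by the cTuning web-service: the function names are hypothetical, the cases are copied from the example report, higher values are treated as better in every dimension, the cut is applied after the frontier is built, and the CUT thresholds are compared inclusively (>=) because that reproduces the single case quoted above for DIM=012 and CUT=1,1,1.

```python
from typing import List, Sequence, Tuple

# (execution time speedup, code size improvement,
#  compilation time speedup, cTuning optimization case ID)
Case = Tuple[float, float, float, int]

# Cases copied from the example optimization report.
CASES: List[Case] = [
    (1.18, 0.80, 1.00, 15423655473087225),
    (1.21, 0.80, 0.80, 29686176401405),
    (1.25, 0.70, 0.80, 4614589283098526),
    (1.29, 0.67, 0.80, 23011215880571251),
    (1.25, 0.70, 0.80, 15721270875126789),
    (1.26, 0.69, 0.80, 15128754576807000),
    (1.29, 0.67, 1.00, 19230939973657069),
    (1.07, 1.02, 1.00, 3258730975700728),
    (1.21, 0.80, 1.00, 23810155474721838),
    (1.24, 0.71, 1.00, 4699569679776380),
    (1.26, 0.68, 0.83, 15492934568598271),
]


def dominates(a: Case, b: Case, dims: Sequence[int]) -> bool:
    """True if a is at least as good as b in every dimension and better in one."""
    return all(a[d] >= b[d] for d in dims) and any(a[d] > b[d] for d in dims)


def frontier(cases: Sequence[Case], dim: str = "012") -> List[Case]:
    """Keep the cases that no other case dominates in the dimensions named by DIM."""
    dims = [int(ch) for ch in dim]
    return [c for c in cases if not any(dominates(o, c, dims) for o in cases)]


def cut(cases: Sequence[Case], thresholds: Sequence[float]) -> List[Case]:
    """Keep the cases that reach every CUT threshold (inclusive: an assumption)."""
    return [c for c in cases if all(c[i] >= t for i, t in enumerate(thresholds))]


selected = cut(frontier(CASES, "012"), (1, 1, 1))
```

With the report data, `selected` holds only the case 3258730975700728, which a user could then pass to `--ct-opt`.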
-
 
+
-
    The following information is important for finding optimization cases from a similar program
+
-
    for the following architecture (you can find the architecture most similar to yours,
+
-
    together with its optimization cases, at http://cTuning.org/cdatabase):
+
-
 
+
-
    CCC_PLATFORM_ID=2111574609159278179    (example for AMD Athlon 64 3700+)
+
-
    CCC_ENVIRONMENT_ID=2781195477254972989 (example for Linux Mandriva 2.6.17-10alchemy)
+
-
    CCC_COMPILER_ID=331350613878705696    (example for GCC 4.4.0)
+
-
 
+
-
    When compiling large applications, feature extraction can take a very long time
+
-
    (and this is part of the future work to speed it up), so a user may want to
+
-
    extract features only of a few functions. In this case, a user should add
+
-
    the file _ctuning_select_functions.txt to the compilation directory where
+
-
    only those functions should be listed that need to be processed
+
-
    (one function per line).
+
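As a minimal sketch, the selection file could be generated from a build script like this. Only the file name and the one-function-per-line layout come from the text above; the helper name, the example function names and the exact whitespace handling expected by MILEPOST GCC are assumptions.

```python
from pathlib import Path
from typing import Iterable

# File name taken from the documentation above.
SELECT_FILE = "_ctuning_select_functions.txt"


def write_function_selection(build_dir: str, functions: Iterable[str]) -> Path:
    """Write one function name per line into the compilation directory.

    Hypothetical helper: blank entries and duplicates are dropped.
    """
    names = []
    for name in functions:
        name = name.strip()
        if name and name not in names:
            names.append(name)
    path = Path(build_dir) / SELECT_FILE
    path.write_text("".join(f"{n}\n" for n in names))
    return path


# Example (function names are placeholders, not taken from the benchmark):
# write_function_selection(".", ["hot_loop", "compute_kernel"])
```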
-
 
+
-
* If you want to test low-level plugins, you can find self-explanatory tests in plugins-ici-2.0/tests directory.
+
-
 
+
-
=== Usage ===
+
-
 
+
-
* MILEPOST GCC / cTuning web-services test:
+
-
 
+
-
    milepost-gcc --ct-test *.c
+
-
 
+
-
    You can also use test script ./__test_milepost_gcc
+
-
 
+
-
* Using optimization cases directly from the Collective Optimization Database (referenced by unique ID) - it is useful for multi-objective optimization, to share optimization cases within the community or when publishing papers and results on program optimization:
+
-
 
+
-
    milepost-gcc --ct-opt=11475790782770590 *.c
+
-
 
+
-
    You can also use demo script ./__compile_using_milepost_gcc_with_fixed_optimization to understand how to configure your own system.
+
-
 
+
-
* Predict good optimizations (execution time, code size, compilation time) based on correlation of program features and optimizations using collective optimization knowledge (empirical iterative feedback-directed compilation performed by multiple users and shared in the Collective Optimization Database):
+
-
 
+
-
    milepost-gcc -Oml *.c
+
-
 
+
-
    You can also use demo script ./__compile_using_milepost_gcc_with_prediction_optimization
+
-
    to understand how to configure your own system.
+
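The three invocation modes above can be wrapped in a small Python helper, for example when driving MILEPOST GCC from an experiment script. This is a sketch under assumptions: the helper and its mode names are hypothetical, and only the command-line flags and the environment variable names are taken from this page. It builds the command and environment without running anything.

```python
import os
from typing import Dict, List, Optional, Tuple


def milepost_invocation(
    mode: str,
    sources: List[str],
    opt_case: Optional[int] = None,
    overrides: Optional[Dict[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for one of the documented milepost-gcc modes.

    mode is a hypothetical label: "test", "fixed" or "predict".
    """
    if mode == "test":
        flag = "--ct-test"
    elif mode == "fixed":
        if opt_case is None:
            raise ValueError("mode 'fixed' needs an optimization case ID")
        flag = f"--ct-opt={opt_case}"
    elif mode == "predict":
        flag = "-Oml"
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Start from the current environment (after sourcing the setup script)
    # and apply per-run settings such as SORT, DIM, CUT or CT_OPT_REPORT.
    env = dict(os.environ)
    env.update(overrides or {})
    return ["milepost-gcc", flag, *sources], env


# Example: request a full optimization report sorted by code size first.
# argv, env = milepost_invocation(
#     "predict", ["bitcnts.c"], overrides={"CT_OPT_REPORT": "1", "SORT": "102"})
# subprocess.run(argv, env=env, check=True)  # needs an installed MILEPOST GCC
```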
-
 
+

Current revision

MILEPOST GCC documentation
