CCC V2.0x documentation

This is the first draft of the documentation. It should be updated. Any help is appreciated.

License

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

If you find this software useful, you are welcome (but not obliged) to reference the http://ctuning.org website and the publications Fur2009, FT2009, FMTP2008 in your derivative works.

Framework high-level overview

Requirements

To successfully install CCC Framework, you will need:

C compiler (tested with GCC).
uuid or uuidgen tool to generate unique identifiers.

MySQL client and development headers /include/mysql/* (not strictly required but functionality will be reduced).
PHP (not strictly required but functionality will be reduced - used for iterative compilation and analysis plugins).
PAPI library (not strictly required - used in auxiliary tools).
OProfile (not strictly required - needed for transparent profiling).

Directory structure

/
- README.txt - brief README and GPL license info
- INSTALL.sh - main installation script that calls ccc-configure* scripts one by one
- ccc-configure* - distributed configuration scripts for different tasks (see the Installation section)
- ccc-build.cfg - CCC Framework build number
- ccc-build-db.cfg - lower and upper version number of the Collective Optimization Database which this version is intended to work with (to avoid incompatibility when both CCC Framework and COD are evolving).

/src-plat-dep - platform dependent tools and plugins
- /include/ccc - CCC Framework header files
- /lib - auxiliary functions
- /plugins - plugins sources
  - /compilation - iterative feedback-directed compilation plugins
  - /ml-prediction - machine learning prediction plugins
- /tools - low-level tools
  - /ccc-time - replacement for the standard time program that collects profiling information about a program, including hardware counter support
  - /ccc-comp - basic tool dealing with program compilation
  - /ccc-run - basic tool dealing with program execution and profiling collection
  - /ccc-db-send-stats-comp - send compilation statistics to COD
  - /ccc-db-send-stats-comp-passes - send info about compiler optimizations (at function-level) to COD
  - /ccc-db-send-stats-prog-feat - send info about program static features (machine learning) to COD
  - /ccc-db-send-stats-run - send execution statistics to COD
  - /milepost-gcc - MILEPOST GCC wrapper to support -ml-e, -ml-c, -ml machine learning optimization flags that invoke ML ICI plugins to extract static program features and query COD web-services to predict good optimizations to improve execution time, code size or both respectively.
- /tools-aux - auxiliary tools, used if the system supports them
  - /hardware-counters-papi - collecting hardware counters statistics (dynamic program features for machine learning or statistical analysis)

/src-plat-indep - platform independent tools and plugins
- /include/ccc_script_functions.php - library that supports platform independent plugins and deals mostly with COD
- /plugins - plugins and scripts

/cfg - CCC configuration directories for different architectures/environments/compilers
- /default - default configurations including optimizations for several compilers (GCC,Open64,PathScale)

/install - installation directory for (platform-dependent) tools and scripts

/apps - applications converted to work with CCC and scripts to automate iterative compilation

Installation

Installation is performed using the INSTALL.sh script for each platform (architecture/environment) on which iterative compilation experiments will be performed. This script calls the individual ccc-configure-* scripts to configure the following:

ccc-configure-platform-name.sh - Select local name of the platform (the directory with this name will be created in cfg and install directories).

ccc-configure-uuid.sh - Select uuid generator (uuid, uuidgen, etc) to generate unique ID for all data.

ccc-configure-database.sh - Configure Collective Optimization Database access if it is used for experiments. If it is not used, all compilation and execution statistics are recorded in local files and can later be sent to the database if you would like to share your optimization cases.

ccc-configure-database-test.sh - Test COD access using parameters from the previous step.

ccc-configure-platform.sh - Provide architecture info for experiments. You can check whether a similar architecture already exists (view) and use its unique ID, or add new architecture info directly (add). Each architecture has a unique ID so that optimization cases can be shared.

ccc-configure-environment.sh - Provide environment info for experiments. You can check whether a similar environment already exists (view) and use its unique ID, or add new environment info directly (add). Each environment has a unique ID so that optimization cases can be shared.

ccc-configure-compiler.sh - Provide compiler info for experiments. You can check whether a similar compiler already exists (view) and use its unique ID, or add new compiler info directly (add). Each compiler has a unique ID so that optimization cases can be shared.

During compiler installation, the user can configure compiler paths for binaries, libraries, plugins, etc. This information is recorded in the cfg/<platform name>/ccc-env.c.<short compiler name> script, which is invoked during compilation (ccc-comp). This allows multiple versions of the same compiler to coexist on the system. Information about all compilers with their unique IDs and short names is recorded in the cfg/<platform_name>/ccc-compilers.cfg file.

ccc-configure-runtime-environment.sh - Provide runtime environment info for experiments (such as a VM, an architecture simulator, etc.). You can check whether a similar runtime environment already exists (view) and use its unique ID, or add new runtime environment info directly (add). Each runtime environment has a unique ID so that optimization cases can be shared.

During runtime environment installation, the user can configure runtime environment paths for binaries, libraries, plugins, etc. This information is recorded in the cfg/<platform name>/ccc-env.re.<short runtime environment name> script, which is invoked during execution (ccc-run). This allows multiple versions of the same runtime environment to coexist on the system. Information about all runtime environments with their unique IDs and short names is recorded in the cfg/<platform_name>/ccc-re.cfg file.

ccc-configure-compile-all-tools.sh - Compile all low-level tools

ccc-configure-compile-all-plugins.sh - Compile and configure all plugins

ccc-configure-compile-all-tools-aux.sh - Compile all auxiliary tools if platform supports them

ccc-configure-update.sh - Check for updates

ccc-configure-set-environment.sh - Set environment based on the information entered in all previous steps. The environment files ccc-env.sh and ccc-env.csh for your platform will be created in the directory cfg/<platform_name>/. You can edit them to correct paths to specific compilers such as GCC with ICI, MILEPOST GCC, LLVM, Open64, PathScale, Testarossa, Intel, etc.

The INSTALL.sh script has to be invoked for a given platform before performing any experiments. The ccc-configure* scripts can later be invoked individually if needed.

Compiler optimization file format

Compiler optimization files (ccc-glob-flags.<local compiler name>.cfg) are located in the cfg directory and have the following format:

The first parameter on each line is the optimization type:

1 - optimization flag that takes a parameter
1, <start_parameter>, <end_parameter>, flag
2 - optimization flag that is either on or off
2, flag
3 - select one flag from a list of flags
3, number of flags in the list, flags separated by commas

Example from GCC:

1, 0, 3, -O
1, 1, 64, -fsched-stalled-insns-dep=
2, -m32
2, -m3dnow
3, 2, -fbranch-count-reg, -fno-branch-count-reg
3, 2, -fbranch-target-load-optimize, -fno-branch-target-load-optimize
3, 2, -fbtr-bb-exclusive, -fno-btr-bb-exclusive
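
To illustrate the format, the following sketch expands one line of an optimization file into the candidate flags it describes. The function name expand_flag_line is hypothetical and not part of CCC. The sketch assumes that a parameter value is appended directly to the flag text (as the -O and -fsched-stalled-insns-dep= examples suggest) and that flags contain no whitespace.

```bash
#!/bin/bash
# Hypothetical helper, not part of CCC: print the candidate flags described
# by one line of a ccc-glob-flags.<local compiler name>.cfg file.
expand_flag_line() {
  local -a f
  local v start end count
  # Drop all whitespace (assumes flags contain none), then split on commas.
  IFS=',' read -r -a f <<< "${1//[[:space:]]/}"
  case "${f[0]}" in
    1) # flag with a parameter: 1, <start_parameter>, <end_parameter>, flag
       start="${f[1]}"; end="${f[2]}"
       for ((v = start; v <= end; v++)); do
         printf '%s%s\n' "${f[3]}" "$v"
       done ;;
    2) # on/off flag: the only alternative is to omit it
       printf '%s\n' "${f[1]}" ;;
    3) # one flag from a list: 3, <number of flags>, flag, flag, ...
       count="${f[1]}"
       printf '%s\n' "${f[@]:2:count}" ;;
    *) echo "unknown optimization type: ${f[0]}" >&2
       return 1 ;;
  esac
}

expand_flag_line '1, 0, 3, -O'
expand_flag_line '3, 2, -fbranch-count-reg, -fno-branch-count-reg'
```

The first call prints -O0 through -O3, one per line; the second prints the two branch-count-reg alternatives.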

We are extending the framework to handle optimization passes and fine-grained optimizations, similar to the now outdated FCO framework.

Collective Optimization Database

COD has recently been split into two parts: common and experimental. The common database keeps information about architectures, environments, compilers, programs, compiler flags, architecture configurations, etc., that is, information that can be common to many users. The experimental databases keep local information about optimization cases obtained by iterative feedback-directed compilation. They can contain user-sensitive information and hence should not always be shared. Users can later select interesting optimization cases to share.

Some more info about COD web-services/API is here.
COD structure

Applications

CCC Framework is intended to automate a large number of iterative compilation experiments. An application has to be slightly modified to work with CCC. For example, Collective Benchmark is already prepared to be used directly with the latest CCC Framework. Here you can find more info about benchmark format. Eventually, we plan to add full support for optimizing applications transparently, without any Makefile modifications.

Low-level tools

There are three main low-level tools that abstract the platform away from iterative compilation experiments:

ccc-time

Command line: ccc-time -fe <name_of_executable> -fp <command_line_for_executable> -ft <file to save time>

This program replaces the native time command to profile a program and can potentially support architectural features such as hardware counters. Users who work with applications already ported to CCC Framework, such as CBench, will normally use only ccc-comp, ccc-run or the high-level plugins and will not call this tool directly.

ccc-comp

Command line: ccc-comp <compiler extension> "Compiler optimization flags" "Additional flags that should not be recorded (not optimization related)"

This tool sources the cfg/<platform name>/ccc-env.c.<compiler extension> script with compiler paths, invokes the Makefile associated with the compiler extension and compiles the program with the specified optimization flags. It then invokes scripts to send statistics to COD. ccc-comp is controlled by multiple environment variables that are described in the following section.

ccc-run

Command line: ccc-run <Dataset> <Base_line_run_param>

If the CCC_RE environment variable is set, this tool first sources the cfg/<platform name>/ccc-env.re.$CCC_RE script with runtime environment paths. It then executes the application with the given dataset number. For the first (baseline) run, <Base_line_run_param> should be set to 1; otherwise it should be set to 0. The baseline run is the reference for comparing execution time, code size and compilation time improvements, and for checking the output of the subsequent iterative feedback-directed runs for correctness. ccc-run is controlled by multiple environment variables that are described in the following section.

Iterative feedback-directed compilation example

The apps directory contains one test application directory, CCC-TEST-APP. You can download the whole CBench and its datasets using ccc-admin--get-cbench-from-svn.sh and ccc-admin--get-cbench-datasets-from-svn.sh.

The list of all benchmarks set up for 1 dataset is in the file ccc--bench-list.dataset1.txt. The list of all benchmarks with all datasets is in the file ccc--bench-list.dataset_all.txt. One of those files should be copied to ccc--bench-list.txt, which is the working file with the list of benchmarks to be processed automatically by the scripts. If you want to use the test benchmark, just leave it in ccc--bench-list.txt.

Before performing any experiments, you should create temporary source directories using ccc-admin--create-work-dirs.sh, which copies all src directories to src-tmp directories. Those directories can later be deleted using the ccc-admin--delete-work-dirs.sh script.

You can then invoke the test compile/run script ccc-test--compile-run.sh in one of the tmp directories. This script compiles the application with the -O3 flag and makes a baseline run, then compiles the program with the -O2 flag and makes an experimental run. The script also shows the environment variables that influence ccc-comp and ccc-run:

#!/bin/bash

# Copyright (C) 2004-2009 by Grigori G.Fursin
#
# http://fursin.net/research
# 
# UNIDAPT Group
# http://unidapt.org

##############################################################
#Record compiler passes (through ICI)
#export CCC_ICI_PASSES_RECORD=1

#Load compiler passes from files or environment (through ICI)
#export CCC_ICI_PASSES_USE=1
#export CCC_ICI_PASSES_OPT_BASE=-O3
#export ICI_PASSES_ALL=...

#Produce verbose output from the ICI plugins
#export ICI_PLUGIN_VERBOSE=1
#export ICI_VERBOSE=1

#Extract program static features (through ICI)
#export CCC_ICI_FEATURES_STATIC_EXTRACT=1
#export ICI_PROG_FEAT_PASS=fre

#Record run-time background info when working in realistic environments 
#to know how other applications interfere with optimizations
#export CCC_RUN_TIME_BACKGROUND="matmul 16Mb array, etc"

#Profile application using hardware counters and PAPI library
#export CCC_HC_PAPI_USE=$CCC_HC_PAPI_LIST
#export CCC_HC_PAPI_USE=PAPI_L1_DCMx,PAPI_L2_DCMx,PAPI_TLB_DMx,PAPI_L1_LDMx,PAPI_L1_STMx,PAPI_L2_LDMx,PAPI_L2_STMx,PAPI_BR_TKNx,PAPI_BR_MSPx,PAPI_TOT_INSx,PAPI_FP_INSx,PAPI_BR_INSx,PAPI_VEC_INSx,PAPI_TOT_CYCx,PAPI_L1_DCHx,PAPI_FP_OPSx

#Profile application using gprof
#export CCC_GPROF=1

#Profile application using OProfile
#export CCC_OPROF=1
#export CCC_OPROF_PARAM="--event=CPU_CLK_UNHALTED:6000"

#Perform compilation only (no run).
#export CCC_NO_RUN=1

#Repeat execution a number of times with the same dataset to check execution time variation on the system.
export CCC_RUNS=1

#Use timed-run to kill the application if it runs for too long.
#The reason is that during iterative compilation some produced binaries
#are corrupt and have infinite loops.
export CCC_TIMED_RUN="timed-run 3000"

#Architecture specific optimization flags
#export CCC_OPT_PLATFORM="-mA7 -ffixed-r12 -ffixed-r16 -ffixed-r17 -ffixed-r18 -ffixed-r19 -ffixed-r20 -ffixed-r21 -ffixed-r22 -ffixed-r23 -ffixed-r24 -ffixed-r25"
#export CCC_OPT_PLATFORM="-mA7"
#export CCC_OPT_PLATFORM="-mtune=itanium2"
#export CCC_OPT_PLATFORM="-march=athlon64"

#Some compilation info that should be standardized and automated 
#(if you use ARCH_CFG and/or ARCH_SIZE, you should set CCC_OPT_PLATFORM to "" or other platform related flag)
export CCC_OPT_PLATFORM="-msse2"
#export CCC_ARCH_CFG="l1_cache=203; l2_cache=35;"
#export CCC_ARCH_SIZE=132

#Some compilation info that should be standardized and automated
#export CCC_OPT_FINE="loop_tiling=10;"
#export CCC_OPT_PAR_STATIC="all_loops=parallelizable;"

#Some run-time info that eventually should be standardized and automated
#export CCC_RUN_POWER=10
#export CCC_RUN_ENERGY=20
#export CCC_PAR_DYNAMIC="no deps"

#HERE YOU CAN SUBSTITUTE PLATFORM/ENVIRONMENT IDS IF YOU WANT TO DO CROSS-COMPILATION/ANALYSIS
#export CCC_PLATFORM_ID=
#export CCC_ENVIRONMENT_ID=

#Select which processor to run application on, in case of multiprocessor system
#export CCC_PROCESSOR_NUM=

#Select runtime environment (VM or simulator)
#export CCC_RUN_RE=llvm25

#For SPEC2006 and ICI ...
export ICI_WORK_DIR=$PWD/

#Baseline run
#export CCC_NOTES="baseline compilation"
ccc-comp gcc422 -O3
#export CCC_NOTES="baseline run"
ccc-run 1 1

#Optimization run
#export CCC_NOTES="opt compilation"
ccc-comp gcc422 -O2
#export CCC_NOTES="opt run"
ccc-run 1 0
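
The baseline/optimization pattern used at the end of the script can be generalized to several flag sets. The following is only a sketch: sweep_flags is a hypothetical wrapper that is not shipped with CCC, and it assumes that ccc-comp and ccc-run are on the PATH with the CCC environment already set.

```bash
#!/bin/bash
# Hypothetical wrapper, not part of CCC.
# Usage: sweep_flags <compiler extension> <dataset> <baseline flags> <flags>...
sweep_flags() {
  local compiler="$1" dataset="$2" baseline="$3" flags
  shift 3
  # Baseline compilation and run (second ccc-run argument is 1).
  ccc-comp "$compiler" "$baseline"
  ccc-run "$dataset" 1
  # One experimental compilation and run per remaining flag set
  # (second ccc-run argument is 0).
  for flags in "$@"; do
    ccc-comp "$compiler" "$flags"
    ccc-run "$dataset" 0
  done
  return 0
}

# Example: the same two steps as the script above, plus one more flag set.
# sweep_flags gcc422 1 -O3 -O2 "-O2 -funroll-loops"
```

Each flag set is passed to ccc-comp as one quoted argument, matching the ccc-comp command line described earlier.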

Finally, for automatic iterative compilation experiments, you can use the ccc-run--glob-flags.sh script, which has several modes for selecting benchmarks and datasets. It should probably be simplified. To be described; any help is appreciated.

Plugins

Iterative compilation plugins

ccc-run-glob-flags-rnd-uniform

Command-line: <Number of runs> <Compiler name> <Baseline opt> <Rnd seed> <Dataset>

Generate a random combination of compiler flags (each individual optimization is selected with 50% probability).
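
A minimal sketch of this selection strategy, reading lines in the compiler optimization file format from standard input, is given below. It is not the plugin's actual code: pick_uniform is a hypothetical name, and the handling of type 1 lines (a random value within the range) and type 3 lines (a random flag from the list) is an assumption, since this page only states the 50% selection probability.

```bash
#!/bin/bash
# Sketch only; the real ccc-run-glob-flags-rnd-uniform plugin may differ.
# Usage: pick_uniform <seed> < ccc-glob-flags.<local compiler name>.cfg
pick_uniform() {
  RANDOM="$1"                      # seed bash's generator for repeatability
  local line combo="" idx
  local -a f
  while IFS= read -r line; do
    line="${line//[[:space:]]/}"   # assumes flags contain no whitespace
    [ -z "$line" ] && continue
    IFS=',' read -r -a f <<< "$line"
    (( RANDOM % 2 )) || continue   # keep each optimization with ~50% probability
    case "${f[0]}" in
      1) combo+=" ${f[3]}$(( f[1] + RANDOM % (f[2] - f[1] + 1) ))" ;;
      2) combo+=" ${f[1]}" ;;
      3) idx=$(( 2 + RANDOM % f[1] ))
         combo+=" ${f[idx]}" ;;
    esac
  done
  printf '%s\n' "${combo# }"
}

pick_uniform 42 <<'EOF'
1, 1, 64, -fsched-stalled-insns-dep=
2, -m32
2, -m3dnow
3, 2, -fbranch-count-reg, -fno-branch-count-reg
EOF
```

Re-running with the same seed reproduces the same combination within one bash version, which mirrors the purpose of the <Rnd seed> argument.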

ccc-run-glob-flags-rnd-fixed

Command-line: <Number of runs> <Sequence length> <Compiler name> <Baseline opt> <Rnd seed> <Dataset>

Randomly generate a combination of compiler flags of the specified length when performing iterative compilation.
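
One way to draw a fixed number of distinct flags is a partial Fisher-Yates shuffle, sketched below. This is an illustration rather than the plugin's implementation: pick_fixed is a hypothetical name, and for brevity the candidates are passed as plain flags instead of being read from the optimization file.

```bash
#!/bin/bash
# Sketch only; the real ccc-run-glob-flags-rnd-fixed plugin may differ.
# Usage: pick_fixed <seed> <sequence length> flag...
pick_fixed() {
  RANDOM="$1"                      # seed bash's generator for repeatability
  local len="$2"
  shift 2
  local -a pool=("$@")
  local i j tmp
  if (( len > ${#pool[@]} )); then
    len=${#pool[@]}                # cannot pick more flags than are available
  fi
  # Partial Fisher-Yates shuffle: settle only the first $len positions.
  for ((i = 0; i < len; i++)); do
    j=$(( i + RANDOM % (${#pool[@]} - i) ))
    tmp="${pool[i]}"; pool[i]="${pool[j]}"; pool[j]="$tmp"
  done
  printf '%s\n' "${pool[*]:0:len}"
}

pick_fixed 7 2 -m32 -m3dnow -fbranch-count-reg -fbtr-bb-exclusive
```

The call prints two distinct flags from the four candidates on one line. RANDOM % n has a slight modulo bias, which is acceptable for a sketch.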

ccc-run-glob-flags-one-by-one

Command-line: <Ignore first option> <Compiler name> <Baseline opt> <Dataset>

Select each optimization from the compiler optimization list in turn, one by one.

ccc-run-glob-flags-one-off-rnd

Command-line: "Compiler flags" <Compiler name> <Baseline opt> <Rnd seed> <Time diff tolerance> <Dataset>

Remove flags from the given combination of "compiler flags" one by one in random order, one flag per iterative step, and put a flag back if removing it makes performance drop (execution time increases by more than the given tolerance). This script is needed to find influential flags.
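
The idea can be sketched as follows. This is not the plugin's code: prune_flags and fake_time are hypothetical names, times are assumed to be integers (for example milliseconds), and every trial is compared against the time of the original full combination, which is one possible reading of <Time diff tolerance>. In a real experiment the measurement command would compile with ccc-comp and run with ccc-run.

```bash
#!/bin/bash
# Sketch only; the real ccc-run-glob-flags-one-off-rnd plugin may differ.
# Usage: prune_flags <measure_cmd> <tolerance> <seed> flag...
# <measure_cmd> is a hypothetical command that prints an integer execution
# time for the flags passed to it.
prune_flags() {
  local measure="$1" tolerance="$2"
  RANDOM="$3"                       # seed for a repeatable removal order
  shift 3
  local -a kept=("$@") order=("$@") trial
  local i j tmp flag f ref cur
  ref="$("$measure" "${kept[@]}")"  # reference time: full combination
  # Fisher-Yates shuffle to obtain a random removal order.
  for ((i = ${#order[@]} - 1; i > 0; i--)); do
    j=$(( RANDOM % (i + 1) ))
    tmp="${order[i]}"; order[i]="${order[j]}"; order[j]="$tmp"
  done
  for flag in "${order[@]}"; do
    trial=()
    for f in "${kept[@]}"; do
      [ "$f" = "$flag" ] || trial+=("$f")
    done
    cur="$("$measure" "${trial[@]}")"
    # Accept the removal only if the program did not get noticeably slower;
    # otherwise the flag is influential and stays in the combination.
    if (( cur <= ref + tolerance )); then
      kept=("${trial[@]}")
    fi
  done
  printf '%s\n' "${kept[*]}"
}

# Made-up cost model for demonstration only: -funroll-loops is the sole
# flag that matters (100 with it, 150 without it).
fake_time() {
  local t=150 f
  for f in "$@"; do
    if [ "$f" = "-funroll-loops" ]; then t=100; fi
  done
  echo "$t"
}

prune_flags fake_time 5 1 -fgcse -funroll-loops -ftree-vectorize
```

With this made-up cost model the call prints -funroll-loops, the only flag whose removal slows the program down by more than the tolerance.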

Machine learning plugins

ccc-ml-accumulate-features

This plugin accumulates the per-function static program features of a given program into a single feature vector using MILEPOST GCC. To be updated

ccc-ml-predict-best-flag

This plugin queries the ML server to obtain a combination of flags or passes at the global or function level that improves execution time or code size. To be updated

Data Analysis plugins

To be updated

get-all-best-flags-time

Report all best optimization cases from the experimental database. To be updated

get-all-best-flags-time-size-paretto

Report optimization cases that improve both execution time and code size, based on the Pareto frontier. To be updated
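
A common way to select cases that are good in both execution time and code size is to keep only the non-dominated (Pareto-optimal) ones. The sketch below illustrates that filter on plain text input; it does not query the experimental database, and the function name pareto_front and the "<time> <size> <flags>" input layout are assumptions made for the example.

```bash
#!/bin/bash
# Sketch only; not the get-all-best-flags-time-size-paretto plugin itself.
# stdin:  one case per line, "<time> <size> <flags...>" (lower is better)
# stdout: the cases that no other case beats in both time and size
pareto_front() {
  awk '
    { t[NR] = $1 + 0; s[NR] = $2 + 0; line[NR] = $0 }
    END {
      for (i = 1; i <= NR; i++) {
        dominated = 0
        for (j = 1; j <= NR; j++) {
          # j dominates i: no worse in both metrics, strictly better in one
          if (t[j] <= t[i] && s[j] <= s[i] && (t[j] < t[i] || s[j] < s[i])) {
            dominated = 1
            break
          }
        }
        if (!dominated) print line[i]
      }
    }'
}

printf '%s\n' '10 500 -O3' '12 400 -O2' '13 450 -O1' '15 300 -Os' | pareto_front
```

In this made-up data set the -O1 case is dropped because the -O2 case is both faster and smaller; the other three cases are printed.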

Misc

Here you can find projects to extend CCC Framework and plugins. You are welcome to participate or you can submit your projects.