From cTuning.org
Google Summer of Code 2009: Generic function cloning.
Liang Peng (ICT, China)
Building on Run-time Function Adaptation for Statically-Compiled Programs based on function multiversioning and FunctionSpecificOpt, we enabled generic function cloning with low-overhead program behavior monitoring routines. This enables fine-grain self-tuning binaries and libraries, and should improve the performance and portability of statically compiled code, which is particularly important for rapidly evolving hardware and virtual environments.
Function cloning
Description
SIMPLE_IPA_PASS "generic_cloning"
This pass performs generic function cloning. It can create any number of clones of a given function on demand, apply different fine-grain optimizations to each clone, and provide a mechanism to select among the clones at run time based on optimization scenarios. It is placed before pass_early_local_passes, and we are trying to move it even earlier.
It can be triggered by the GCC option -fapi-clone-function and by the Adapt plugin. The information comes from XML files through ICI.
Key data structures
Information read from XML files for function cloning and instrumentation through the Interactive Compilation Interface (ICI).
Defined in both gcc/highlev-plugin-internal.h and gcc/highlev-plugin.h:
typedef struct {
  /* number of function to clone. */
  int numofclonefun;
  /* function name list. */
  char **clone_function_list;
  /* corresponding filename of function. */
  char **function_filename_list;
  /* corresponding number of clones. */
  int *clones;
  /* corresponding function name extension for clone. */
  char **clone_extension;
  /* corresponding adaptation function name for current function. */
  char **adaptation_function;
  /* option list for clone. */
  char **clone_option;
  /* corresponding external libraries. */
  char *external_libraries;
} cloneinfo;
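To illustrate how the parallel arrays in cloneinfo line up, the sketch below fills the structure for a single function and looks the entry up by function name and file name. The sample values (susan.c, the "_clone" extension, select_clone, the option string) and the helper find_clone_entry are hypothetical and are not taken from the plugin sources; in particular, storing one option string per function entry is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Same layout as the cloneinfo structure described above. */
typedef struct {
  int numofclonefun;
  char **clone_function_list;
  char **function_filename_list;
  int *clones;
  char **clone_extension;
  char **adaptation_function;
  char **clone_option;
  char *external_libraries;
} cloneinfo;

/* Hypothetical helper: index of the entry for (func, file), or -1 if absent. */
static int find_clone_entry(const cloneinfo *ci, const char *func,
                            const char *file)
{
  for (int i = 0; i < ci->numofclonefun; i++)
    if (strcmp(ci->clone_function_list[i], func) == 0
        && strcmp(ci->function_filename_list[i], file) == 0)
      return i;
  return -1;
}

/* Hypothetical sample data: one function with two clones. */
static char *ex_funcs[] = { "susan_edges" };
static char *ex_files[] = { "susan.c" };
static int ex_counts[] = { 2 };
static char *ex_exts[] = { "_clone" };
static char *ex_adapt[] = { "select_clone" };
static char *ex_opts[] = { "-O3" };   /* assumed: one option string per entry */

static cloneinfo example_info = {
  1, ex_funcs, ex_files, ex_counts, ex_exts, ex_adapt, ex_opts, "-lselect"
};
```

Index i selects the same function across all arrays, so example_info.clones[0] is the number of clones requested for susan_edges.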
New ICI event
ICI event "load_clone_config"
This event is called at the very beginning of the function cloning pass.
- Callback function: load_clone_config (), defined in the ICI plugin adapt.c
- It first gets the current main input filename using get_feature ("main_input_filename"), then reads the information from the corresponding XML file into the cloneinfo data structure using the Mini-XML library.
- ICI event parameter: clone_info, a pointer to the cloneinfo data structure
Work flow
For each cgraph node, we check whether it needs to be cloned.
If so, we clone it using cgraph_function_versioning, apply the options for each clone, and insert the selection mechanism into the original function.
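The effect of the inserted selection mechanism can be pictured at source level as follows. This is only a sketch of the resulting shape, not output of the pass: the names work, work_clone_0, work_clone_1 and select_clone are hypothetical, the adaptation function is assumed to take no arguments and return a clone index, and falling back to the first clone for unexpected indices is also an assumption.

```c
/* Records which clone ran last; present only to make the dispatch observable. */
static int last_clone = -1;

/* Hypothetical clones: same semantics, compiled with different options. */
static int work_clone_0(int x) { last_clone = 0; return x + 1; }
static int work_clone_1(int x) { last_clone = 1; return x + 1; }

/* Hypothetical adaptation function: alternates between the two clones. */
static int select_calls = 0;
static int select_clone(void) { return select_calls++ % 2; }

/* Shape of the original function once the selection mechanism is inserted:
   call the adaptation function, then switch to the chosen clone. */
int work(int x)
{
  switch (select_clone()) {
  case 1:
    return work_clone_1(x);
  case 0:
  default:                      /* assumed fallback for unexpected indices */
    return work_clone_0(x);
  }
}
```

Callers keep calling work() unchanged; only the body of the original function is replaced by the dispatch.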
Applying different optimizations to clones
To avoid triggering GCC bugs, the following flags may not be changed:
flag_strict_aliasing;
flag_omit_frame_pointer;
flag_pcc_struct_return;
flag_asynchronous_unwind_tables;
Selection mechanism and overhead
Overhead
Type | Average time: -O3 | Average time: susan_edges() cloned ten times, -O3 applied to all clones | Overhead |
---|---|---|---|
"real" | 8.1588s | 8.282778s | 1.5195617% |
"user" | 7.6258s | 7.759111s | 1.7481576% |
When dataset 1 is run on automotive_susan_e, susan_edges() is a hot function that accounts for about 80% of the total execution time.
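The overhead column in the table above is consistent with the relative slowdown of the cloned build over the plain -O3 build, (cloned - baseline) / baseline * 100. The helper below is a hypothetical illustration for recomputing that figure from the two averages; it is not part of the measurement scripts.

```c
#include <assert.h>

/* Relative overhead, in percent, of the cloned build over the baseline. */
static double overhead_percent(double baseline_s, double cloned_s)
{
  assert(baseline_s > 0.0);
  return (cloned_s - baseline_s) / baseline_s * 100.0;
}
```

For the "real" row, overhead_percent(8.1588, 8.282778) gives about 1.52, and for the "user" row, overhead_percent(7.6258, 7.759111) gives about 1.75.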
Primary functions
- exec_clone_functions (void)
- This is the execute function for gimple_opt_pass pass_clone_functions. It calls the event "load_clone_config" to load the information needed for cloning, clones the functions that are in the clone list using cgraph_function_versioning, and finally inserts the selection mechanism.
- add_call_to_clones (struct cgraph_node *orig, int nid)
- This function inserts the selection mechanism: it builds the call to the adaptation function, the calls to the clones, and the switch statement.
- get_arguments (tree tree_list)
- This function collects the arguments of the original function into the tree argv.
- parse_arguments (char *text, unsigned int *argc)
- This function parses an option string and returns the number of arguments and the argument vector.
- For example, it parses the string "-O3 -fici -fapi-clone-functions" into int argc=3, char **argv={"-O3","-fici","-fapi-clone-functions"}
- find_clone_options (char *funcname, int *nid)
- This function tries to find the options for a clone by looking up the clone's name in the clone_option list, and records the id in nid.
- is_in_clone_list (const char *func_name, const char *file_name, int *nid)
- This function checks whether function func_name in file file_name needs to be cloned. It returns true if so and false otherwise; nid records the id in the list.
- is_it_clonable (struct cgraph_node *cg_func)
- Checks whether the cgraph node cg_func is clonable.
- is_it_main (struct cgraph_node *cg_func)
- Checks whether cg_func is the cgraph node of main/MAIN__.
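The option-string splitting described for parse_arguments can be sketched as below. This is not the pass's implementation: split_options is a hypothetical stand-in that only assumes options are separated by spaces or tabs, and it uses strtok, so it is not reentrant.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for parse_arguments: splits text on spaces and tabs.
   Returns a NULL-terminated vector and stores the token count in *argc.
   The vector and the string copy share one allocation, so a single
   free(argv) releases everything.  Returns NULL if allocation fails. */
char **split_options(const char *text, unsigned int *argc)
{
  size_t len = strlen(text);
  size_t slots = len / 2 + 2;   /* upper bound on tokens, plus the NULL */
  char **argv = malloc(slots * sizeof *argv + len + 1);
  unsigned int n = 0;

  if (argv == NULL) {
    *argc = 0;
    return NULL;
  }

  /* The writable copy of the text lives right after the pointer slots. */
  char *copy = (char *)(argv + slots);
  memcpy(copy, text, len + 1);

  for (char *tok = strtok(copy, " \t"); tok != NULL; tok = strtok(NULL, " \t"))
    argv[n++] = tok;
  argv[n] = NULL;
  *argc = n;
  return argv;
}
```

Called on "-O3 -fici -fapi-clone-functions", it yields argc equal to 3 and the three options in order, matching the example given for parse_arguments.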
TODOS
- check for and handle functions whose names end with _clone_%d.
- add support for pragmas.
- apply different target optimizations to clones.
Instrumentation
Description
SIMPLE_IPA_PASS "instrumentation"
This pass performs function instrumentation. Currently, it can only add function calls. We can also link external libraries transparently, without Makefile modifications.
This pass can also be triggered by the GCC option -fapi-instrument-functions and by the Adapt plugin. The information comes from XML files through ICI.
Key data structures
Information read from XML files for function cloning and instrumentation through the Interactive Compilation Interface (ICI).
Defined in both gcc/highlev-plugin-internal.h and gcc/highlev-plugin.h:
typedef struct {
  /* number of function to instrument. */
  int numofinstrfun;
  /* function name list. */
  char **instrument_function_list;
  /* corresponding filename of function. */
  char **function_filename_list;
  /* name of function instrument at the begin of function. */
  char **timer1;
  /* name of function instrument at the end of function. */
  char **timer2;
  /* flag list whether function is cloned. */
  char *cloned;
} instrinfo;
New ICI event
ICI event "load_instr_config"
This event is called at the very beginning of the function instrumentation pass.
- Callback function: load_instr_config (), defined in the ICI plugin adapt.c
- It first gets the current main input filename using get_feature ("main_input_filename"), then reads the information from the corresponding XML file into the instrinfo data structure using the Mini-XML library.
- ICI event parameter: clone_info, a pointer to the instrinfo data structure
Work flow
Primary functions
- exec_instrument_functions (void)
- Execute function for the instrumentation pass. It calls the event "load_instr_config" to load the information needed for instrumentation, and instruments an external call at the beginning and/or end of a function if that function is in instrument_function_list.
- add_timer_begin (struct cgraph_node *cg_func, char *funname, int cloned)
- Instruments an external call named funname at the beginning of the function.
- If cloned == '1', the external call is inserted after the first gimple statement: the function has already been cloned, so its first statement should be the call to the select function.
- add_timer_end (struct cgraph_node *cg_func, char *funname)
- Instruments an external call named funname at the end of the function.
- is_in_instrument_list (const char *func_name, const char *file_name, int *nid)
- This function checks whether function func_name in file file_name needs to be instrumented. It returns true if so and false otherwise; nid records the id in the list.
- is_it_instrumentable (struct cgraph_node *cg_func)
- Checks whether cg_func is instrumentable; returns true if so, false otherwise.
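At source level, the result of add_timer_begin and add_timer_end can be pictured roughly as below. This is a sketch only: compute, timer_begin and timer_end are hypothetical names, and the monitoring routines are assumed to take no arguments. In a real setup they would live in the user-provided external library; here they only count calls so the behavior can be checked.

```c
/* Hypothetical monitoring routines; they only count calls in this sketch. */
static int begin_calls = 0;
static int end_calls = 0;
static int timer_depth = 0;

void timer_begin(void) { begin_calls++; timer_depth++; }
void timer_end(void) { end_calls++; timer_depth--; }

/* Shape of an instrumented (non-cloned) function. */
int compute(int x)
{
  timer_begin();            /* call inserted at the beginning of the function */
  int result = x * 2;       /* original body */
  timer_end();              /* call inserted at the end of the function */
  return result;
}
```

For a function that was cloned earlier, the first inserted call would instead follow the statement that calls the select function, as described for add_timer_begin.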
Run-time monitoring routines based on PAPI
Linking external library
We can now link external libraries transparently, without Makefile modifications. This patch was provided by Yuri Kashnikoff. Currently, the environment variable ICI_LIBS is taken as input, for example: ICI_LIBS="-Lpath/to/library -lselect".
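One way to picture what transparent linking amounts to is appending the contents of ICI_LIBS to the link command line. The helper below is a hypothetical sketch of that idea and makes no claim about how the actual patch is implemented; append_libs and its buffer-based interface are assumptions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: appends " <libs>" to the command line held in cmd.
   libs is typically the result of getenv("ICI_LIBS") and may be NULL.
   Returns 0 on success, -1 if the buffer is too small (cmd is unchanged). */
int append_libs(char *cmd, size_t size, const char *libs)
{
  size_t used = strlen(cmd);

  if (libs == NULL || *libs == '\0')
    return 0;                               /* nothing to add */
  if (used + 1 + strlen(libs) + 1 > size)
    return -1;                              /* space + libs + NUL do not fit */
  snprintf(cmd + used, size - used, " %s", libs);
  return 0;
}
```

A caller would pass getenv("ICI_LIBS") as the last argument, so an unset variable leaves the command line untouched.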
TODOs
- provide ability to add function calls before or after specific instructions with some program variables as arguments
Work with adapt plugin
Adapt plugin: Fine-grain optimization tuning - another GSOC project by Yuanjie Huang from ICT, China.
- Provides support for recording and substituting GCC pass sequences, function-specific optimization tuning, function cloning, and instrumentation. It is controlled via the environment variable ICI_ADAPT_CONTROL. When this variable is set to 1, information about the compilation is recorded into XML files; when it is set to 2, the Adapt plugin reuses the information from the XML files and tunes the GCC compilation workflow via ICI.
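The two documented values of ICI_ADAPT_CONTROL can be summarized as a small mode switch. The sketch below is illustrative only: the enum names and the function adapt_mode_from_env are hypothetical, and treating an unset or unrecognized value as "off" is an assumption, since the text above only specifies the values 1 and 2.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mode names for the two documented settings. */
typedef enum { ADAPT_OFF, ADAPT_RECORD, ADAPT_REUSE } adapt_mode;

/* Maps the value of ICI_ADAPT_CONTROL (NULL when unset) to a mode. */
adapt_mode adapt_mode_from_env(const char *value)
{
  if (value == NULL)
    return ADAPT_OFF;               /* assumed behavior when unset */
  if (strcmp(value, "1") == 0)
    return ADAPT_RECORD;            /* record compilation info into XML */
  if (strcmp(value, "2") == 0)
    return ADAPT_REUSE;             /* reuse XML info to tune compilation */
  return ADAPT_OFF;                 /* assumed behavior for other values */
}
```

A plugin would typically call it as adapt_mode_from_env(getenv("ICI_ADAPT_CONTROL")).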
Script
adaptutil.py: this script runs after the compilation information has been recorded but before it is reused. Its input is an INI-format file that describes how to perform function cloning and instrumentation. The script can also generate an external library template.
- option -n, --noclone : turn off clone (default on)
- option -i, --instrumentation : turn on instrumentation (default off)
- option -o, --optimization : turn on function specific optimization (default off)
- option -t <filename> : generate external library template
- option --tflavor=FLAVOR : FLAVOR is r for a random select function, b for a round-robin one
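The two template flavors correspond to two simple selection policies. The functions below sketch what such select functions might look like; they are not the template that adaptutil.py generates, and the names and the num_clones parameter are assumptions made for illustration.

```c
#include <stdlib.h>

/* Round-robin flavor: cycles through clone indices 0 .. num_clones - 1. */
int select_round_robin(int num_clones)
{
  static int next = 0;
  int chosen = next % num_clones;   /* stay in range if num_clones changes */
  next = (chosen + 1) % num_clones;
  return chosen;
}

/* Random flavor: picks a clone index in 0 .. num_clones - 1.
   The caller may seed the generator with srand() beforehand. */
int select_random(int num_clones)
{
  return rand() % num_clones;
}
```

Round-robin gives every clone the same number of runs, which suits side-by-side timing, while random selection avoids any correlation between call order and the chosen clone.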
Work flow
Three-step compilation:
1: create the current XML files with the compilation flow and info
- > adapt_compile_step.sh 1
- all XML files are put in the directory $ICI_ADAPT_XMLDIR
2: modify the XML files based on an INI-format file to turn on the function cloning pass and/or the instrumentation pass
- > adaptutil.py a.ini $ICI_ADAPT_XMLDIR
- the XML files in $ICI_ADAPT_XMLDIR will be modified
- (compile the user-provided external library)
3: compile the program, clone functions, and apply optimization flags to the clones
- > adapt_compile_step.sh 2
TODOS
- automatically select clones using machine learning techniques based on PAPI and dataset features.
- evaluate different optimizations for different datasets and architectures using the Continuous Collective Compilation Framework (CCC)