Publication from ADAPT'14 workshop.

Notes on reproducibility:


Roofline-aware DVFS for GPUs
=============

Date: 16-Oct-2013

Author: Cedric Nugteren (http://www.cedricnugteren.nl)

Description: This repository is an online appendix to the
scientific article "Roofline-aware DVFS for GPUs".


Benchmarks
=============

Three types of CUDA benchmarks are tested:

*    Benchmarks from PolyBench/GPU
*    Benchmarks from Parboil (requires Parboil datasets
    to be installed in ~/software/parboil-2.5/datasets/)
*    Two artificial micro-benchmarks


Experimental setup
=============

GPGPU-Sim version 3.2.1 + GPUWattch

(commit 72aaaf6b11b38121d946469f26d85315ff794f29)

Configuration for GPGPU-Sim
-------------

*    Clock frequencies:

        -gpgpu_clock_domains XXX:YYY:XXX:ZZZ

    XXX is the halved core frequency in MHz (600-500-400-300).
    YYY is the full core frequency in MHz (1200-1000-800-600).
    ZZZ is the memory frequency in MHz (900-750-600-450).

*    DRAM latencies:

        -dram_latency XXX

    XXX is the DRAM latency in core clock cycles, reduced
    when scaling down the core frequency to keep the latency
    (in seconds) constant (100-83-76-50).
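
As an illustration of how the two options above combine, the following
Python sketch builds the option lines for a chosen operating point. It is
not part of the repository: the function names are hypothetical, the values
are copied from the lists above, and it assumes that every core setting may
be paired with every memory setting, with the DRAM latency following the
core setting.

```python
from itertools import product

# Values copied from the lists above (clocks in MHz, latency in core cycles).
CORE_HALF_MHZ = [600, 500, 400, 300]     # XXX in -gpgpu_clock_domains
CORE_FULL_MHZ = [1200, 1000, 800, 600]   # YYY in -gpgpu_clock_domains
MEM_MHZ = [900, 750, 600, 450]           # ZZZ in -gpgpu_clock_domains
DRAM_LATENCY = [100, 83, 76, 50]         # XXX in -dram_latency, per core setting


def gpgpusim_options(core_idx, mem_idx):
    """Return the two GPGPU-Sim option lines for one operating point.

    Hypothetical helper: core_idx and mem_idx select an entry (0..3)
    from the core and memory frequency lists, respectively.
    """
    half = CORE_HALF_MHZ[core_idx]
    full = CORE_FULL_MHZ[core_idx]
    mem = MEM_MHZ[mem_idx]
    return [
        f"-gpgpu_clock_domains {half}:{full}:{half}:{mem}",
        f"-dram_latency {DRAM_LATENCY[core_idx]}",
    ]


def all_operating_points():
    """Assumption: any core setting can be combined with any memory setting."""
    indices = range(len(CORE_HALF_MHZ))
    return {(c, m): gpgpusim_options(c, m) for c, m in product(indices, indices)}


if __name__ == "__main__":
    # Highest core and memory frequency.
    for line in gpgpusim_options(0, 0):
        print(line)
```

For example, `gpgpusim_options(3, 1)` pairs the lowest core setting with the
second memory setting.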

Configuration for GPUWattch
-------------

*    Memory configuration:

        <param name="mc_clock" value="XXX"/>
        <param name="peak_transfer_rate" value="YYY"/>

    XXX is twice the memory clock, which equals half the
    effective clock (1800-1500-1200-900). YYY is the bandwidth
    per memory controller (28800-24000-19200-14400).

*    Clock frequencies:

        <param name="target_core_clockrate" value="XXX"/>
        <param name="clockrate" value="XXX"/>
        <param name="NOC_A" value="XXX" />

    XXX is either the halved or the full core clock frequency,
    depending on where the parameter appears in the
    configuration settings.

*    Memory power parameters:

        <param name="MEM_RD" value="XXX" />
        <param name="MEM_WR" value="YYY" />
        <param name="MEM_PRE" value="ZZZ" />

    XXX, YYY, and ZZZ are scaled with the core clock rate
    to obtain correct memory power characteristics. The need
    for this scaling has been acknowledged as a bug in the
    simulator and will be fixed in the next version.
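
The relation between the memory frequency and the two memory configuration
parameters above can be sketched in Python. This is an illustration only,
not code from the repository: the function name is hypothetical, and the
factor of 16 between `mc_clock` and `peak_transfer_rate` is simply the ratio
observed in the values listed above (28800 / 1800), not a constant taken
from the GPUWattch documentation.

```python
# Memory frequencies (ZZZ) used in the GPGPU-Sim configuration.
MEM_MHZ = [900, 750, 600, 450]

# Ratio inferred from the listed values (28800 / 1800 = 16); an assumption.
TRANSFER_RATE_PER_MC_CLOCK = 16


def gpuwattch_memory_params(mem_mhz):
    """Return the two GPUWattch <param> lines for one memory frequency.

    Hypothetical helper: mc_clock is twice the memory clock, and
    peak_transfer_rate follows from the inferred ratio above.
    """
    mc_clock = 2 * mem_mhz
    peak_transfer_rate = TRANSFER_RATE_PER_MC_CLOCK * mc_clock
    return [
        f'<param name="mc_clock" value="{mc_clock}"/>',
        f'<param name="peak_transfer_rate" value="{peak_transfer_rate}"/>',
    ]


if __name__ == "__main__":
    for mem in MEM_MHZ:
        print("\n".join(gpuwattch_memory_params(mem)))
```

Calling `gpuwattch_memory_params(900)` reproduces the 1800 and 28800 values
of the first setting listed above.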


Contents of the repository
=============

*    *benchmark_code*

    Folder containing CUDA source code made suitable for
    the GPGPU-Sim simulator.

*    *configurations*

    All the GPGPU-Sim and GPUWattch configuration files.

*    *results*

    Folder containing the graphs as they appear in the
    article plus more detailed graphs. It also contains
    a processed database extracted from simulation data.

*    *simulation_data*

    The raw simulation output from GPGPU-Sim and GPUWattch.

*    *process.r*

    An R script that processes the raw simulation data and
    outputs a database in CSV format (in the results folder).

*    *graph.r*

    An R script that generates plots from the database
    produced by the process.r script.

*    *README*

    This file.

###################################################


(C) 2011-2014 cTuning foundation