Revision as of 15:25, 22 August 2014

Computational species "bw filter simplified less" (CID=45741e3fbcf4024b:1db78910464c9d05)

Notes

Cost-aware experiments (execution time, size, energy, compilation time) performed by Grigori Fursin. They support our research on continuous performance tracking, code optimization, and compiler benchmarking (regression detection).

This computational species (kernel) is a threshold filter. It is used in image processing and as a neuron activation function in artificial neural networks.

Used artifacts

  • Datasets:
    • D1 = grayscale image 1 (CID=8a7141c59cd335f5:c8848a1b1fb1775e), size=1536x1536~2.4E6 (pixels or neurons)
    • D2 = grayscale image 2 (CID=8a7141c59cd335f5:0045c9b59e84318b), size=1536x1536~2.4E6 (pixels or neurons)
  • Systems:
    • S1 = Dell Laptop Latitude E6320, Processor=P1, Memory=8GB, Storage=256GB (SSD), Max power consumption=52W, Cost (time of purchase)~1200 euros (CID=cb7e6b406491a11c:0d84339816de0271)
    • S2 = Samsung Mobile Galaxy Duos GT-S6312, Processor=P2, Memory=0.8GB, Storage=4GB, Battery=1300 mAh / 3.9V / up to 250 hours, Max power consumption~5W, Cost (time of purchase)~200 euros (CID=cb7e6b406491a11c:a9740acbe06bcd1e)
    • S3 = Polaroid Tablet Executive 9", Processor=P3, Memory=1GB, Storage=16GB, Battery=3500 mAh / 3.9V / up to 80 hours, Max power consumption~13W, Cost (time of purchase)~100 euros ()
  • Processors:
    • P1 = Intel Core i5-2540M, 2.60GHz, 2 cores (CID=54cd38490124ef51:425ae4e3483c82e8)
    • P2 = Qualcomm MSM7625A FFA, ARM Cortex A5, ARMv7, 1 GHz, 1 core (CID=54cd38490124ef51:ae17889f40209ae7)
    • P3 = Allwinner A20 (sun7i), Dual-Core ARM Cortex A7, ARMv7, 1.6GHz, Mali400 GPU, 2 cores ()
    • P4 = NVidia Quadro NVS 135M, 16 cores, 400MHz ()
  • Processor mode:
    • B1 = 32 bit
    • B2 = 64 bit
  • OSs:
    • O1 = Windows 7 Pro SP1, cost~170 euros (CID=c4d3ce728f46eea2:10c4f7484446b689)
    • O2 = O1, MinGW32
    • O2 = OpenSuse 12.1, Kernel 3.1.10, cost=free (CID=c4d3ce728f46eea2:29ce89f1a1446e89)
    • O3 = Android 4.1.2, Kernel 3.4.0, cost=free (CID=c4d3ce728f46eea2:e734c48d5a5824c1)
    • O4 = Android 4.2.2, Kernel 3.3.0, cost=free ()
  • Compilers:
    • C1 = GCC 4.1.1
    • C2 = GCC 4.4.1
    • C3 = GCC 4.4.4
    • C4 = GCC 4.6.3
    • C5 = GCC 4.7.2
    • C6 = GCC 4.8.3
    • C7 = GCC 4.9.1
    • C8 = LLVM 3.1
    • C9 = LLVM 3.3.2
    • C10 = Open64 5.0
    • C11 = PathScale 2.3.1
    • C12 = NVidia CUDA Toolkit 5.0
    • C13 = Microsoft Visual Studio 2013
    • C14 = Intel Composer XE 2011
  • Number of run-time code repetitions (for example, processing steps in neural networks):
    • R1 = 4000
    • R2 = 1000
    • R3 = 400
  • Total number of computations (processed neurons or pixels)
    • T1 ~ 9.6E9
    • T2 ~ 2.4E9
    • T3 ~ 1.0E9
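
The totals T1 to T3 appear to be the image size multiplied by the repetition counts R1 to R3. The following is a minimal Python sketch of that arithmetic together with a stand-in threshold kernel. The function names, the threshold value (128) and the 0/255 output levels are assumptions for illustration only; they are not taken from the original benchmark code.

```python
# Illustrative sketch only: names, threshold value and output levels are
# assumptions, not the original "bw filter simplified less" kernel.

def threshold_filter(pixels, threshold=128):
    """Map each gray value to 255 if it is above `threshold`, otherwise to 0."""
    return [255 if p > threshold else 0 for p in pixels]


def total_computations(pixel_count, repetitions):
    """Processed pixels (or neurons): image size times kernel repetitions."""
    return pixel_count * repetitions


# Tiny stand-in image; the listed datasets are 1536x1536 pixels.
sample = [0, 100, 128, 129, 255]
filtered = threshold_filter(sample)

exact_size = 1536 * 1536   # 2359296 pixels
rounded_size = 2_400_000   # the ~2.4E6 figure used in the list above

repetitions = {"R1": 4000, "R2": 1000, "R3": 400}
exact_totals = {r: total_computations(exact_size, n) for r, n in repetitions.items()}
rounded_totals = {r: total_computations(rounded_size, n) for r, n in repetitions.items()}

if __name__ == "__main__":
    print("filtered sample:", filtered)
    for r in repetitions:
        print(f"{r}: exact={exact_totals[r]:.2e} rounded={rounded_totals[r]:.2e}")
```

With the rounded size of 2.4E6 the products are 9.6E9, 2.4E9 and 9.6E8 (listed as ~1.0E9); with the exact size of 2359296 pixels they come out slightly lower (about 9.44E9, 2.36E9 and 9.44E8).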

Notes

Energy: 1Wh = 3600 joules

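Battery energy: Wh = mAh * V / 1000 = 1300 * 3.9 / 1000 ~ 5Wh (S2). The max power consumption of ~5W listed for S2 corresponds to draining this energy in one hour.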
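
The conversions above can be sketched in a few lines of Python. Note that the product mAh * V / 1000 is an energy in watt-hours; it equals a power in watts only under the assumption that the battery is drained in one hour. The function names are hypothetical, and the battery figures are those listed for S2 and S3.

```python
# Illustrative sketch of the unit conversions; helper names are assumptions.

JOULES_PER_WH = 3600  # 1 Wh = 3600 J


def battery_energy_wh(capacity_mah, voltage_v):
    """Battery energy in watt-hours from capacity (mAh) and nominal voltage (V)."""
    return capacity_mah * voltage_v / 1000.0


def wh_to_joules(energy_wh):
    """Convert watt-hours to joules."""
    return energy_wh * JOULES_PER_WH


s2_wh = battery_energy_wh(1300, 3.9)  # about 5.07 Wh
s3_wh = battery_energy_wh(3500, 3.9)  # about 13.65 Wh

if __name__ == "__main__":
    for name, wh in (("S2", s2_wh), ("S3", s3_wh)):
        print(f"{name}: {wh:.2f} Wh = {wh_to_joules(wh):.0f} J")
```

These values line up with the ~5W and ~13W figures listed for S2 and S3.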


(C) 2011-2014 cTuning foundation