FLiT
Locating Floating-Point Variability Induced By Compiler Optimizations
Ganesh Gopalakrishnan
University of Utah
Acknowledgements to
Michael Bentley (author of FLiT)
John Jacobson (bug-fixes, development)
Cayden Lund (beginning GPU-FLiT)
Ignacio Laguna
Lawrence Livermore National Laboratory
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 (LLNL-PRES-780623).
http://fpanalysistools.org/
Numerical Reproducibility Across Compilers
… desired, but not guaranteed under optimizations
Compiling using aggressive optimizations (e.g., -O3 and Fast-Math) can give vastly different program results
This can seriously undermine one's quest for higher speed … by giving the wrong answer!
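One reason optimizations can change results is that floating-point addition is not associative, so a compiler that is allowed to regroup a sum can change which intermediate values get rounded. A minimal Python sketch of the effect, independent of any particular compiler:

```python
# Floating-point addition is not associative: regrouping a sum changes
# which intermediate values are rounded.
left_to_right = (0.1 + 0.2) + 0.3
regrouped = 0.1 + (0.2 + 0.3)

print(left_to_right)               # 0.6000000000000001
print(regrouped)                   # 0.6
print(left_to_right == regrouped)  # False
```

The difference here is in the last digit; in a long-running simulation such differences can be amplified.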
Example of Compiler-Induced Variability
Laghos: A high-order Lagrangian hydrodynamics mini-application
xlc -O2 vs. xlc -O3
In one iteration: 11.2% relative error and negative gas density!
And a speedup by a factor of 2.42
What happened? How can I investigate it?
FLiT Workflow
Multiple Levels:
Basics: Delta Debugging (added for class; from Prof. Mayur Naik's YouTube lectures)
Consider an example where the presence of A and B together is necessary to trigger a bug
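The A-and-B example can be sketched with a simplified delta-debugging minimizer. This is an illustrative version of the general idea, not code from the lectures or from FLiT; the change names and the `fails` predicate are hypothetical, and the items are assumed to be unique.

```python
def ddmin(items, fails):
    """Shrink `items` to a smaller list for which `fails` still returns True."""
    n = 2  # current number of chunks
    while len(items) >= 2:
        chunk = max(1, len(items) // n)
        subsets = [items[i:i + chunk] for i in range(0, len(items), chunk)]
        reduced = False
        for subset in subsets:
            complement = [x for x in items if x not in subset]
            if fails(subset):        # one chunk alone still triggers the bug
                items, n, reduced = subset, 2, True
                break
            if fails(complement):    # the bug survives removing this chunk
                items, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(items):      # already testing single elements
                break
            n = min(len(items), n * 2)  # retry with finer-grained chunks
    return items


def fails(subset):
    # Hypothetical bug: triggered only when A and B are both present.
    return "A" in subset and "B" in subset


changes = ["X", "A", "Y", "Z", "B", "W", "V", "U"]
print(ddmin(changes, fails))  # ['A', 'B']
```

Neither half of the input fails on its own, so the search has to fall back on removing chunks (testing complements) to make progress.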
FLiT Installation
FLiT is easy to install
git $ git clone https://github.com/PRUNERS/FLiT.git
Cloning into 'FLiT'...
[...]
git $ cd FLiT
FLiT $ make
src/timeFunction.cpp -> src/timeFunction.o
src/flitHelpers.cpp -> src/flitHelpers.o
src/TestBase.cpp -> src/TestBase.o
src/flit.cpp -> src/flit.o
src/FlitCsv.cpp -> src/FlitCsv.o
src/InfoStream.cpp -> src/InfoStream.o
src/subprocess.cpp -> src/subprocess.o
src/Variant.cpp -> src/Variant.o
src/fsutil.cpp -> src/fsutil.o
mkdir lib
Building lib/libflit.so
FLiT $ sudo make install
Installing...
Generating /usr/share/flit/scripts/flitconfig.py
FLiT $ sudo apt install python3-toml python3-pyelftools
[...]
Multi-Compilation Search
FLiT is a reproducibility test framework in the PRUNERS toolset (pruners.github.io).
Hundreds of compilations are compared against a baseline compilation.
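Comparing against a baseline can be sketched as follows. The metric (relative L2 difference in percent), the compilation names, and the result vectors are assumptions made for illustration; in FLiT the comparison is defined by each test.

```python
import math


def relative_error_percent(baseline, result):
    """Relative L2 difference between two result vectors, in percent."""
    diff = math.sqrt(sum((b - r) ** 2 for b, r in zip(baseline, result)))
    norm = math.sqrt(sum(b ** 2 for b in baseline))
    return 100.0 * diff / norm


# Hypothetical outputs: one baseline run and two other compilations.
baseline = [3.0, 4.0]
results = {
    "compilation-A": [3.0, 4.0],
    "compilation-B": [3.0, 4.5],
}
for name, values in results.items():
    print(name, relative_error_percent(baseline, values))
```

A score of 0.0 means the compilation reproduced the baseline exactly; anything larger flags that compilation for further investigation.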
Exercises
Exercises with FLiT
Directory Structure
Module-FLiT/
├── exercise-1/   (this tutorial)
├── exercise-2/   (partly this tutorial)
├── exercise-3/   (try on your own later)
├── packages/
├── README.md
└── setup.sh
Exercise 1
Exercise 1 - Goal
Application: MFEM
Exercise 1 - Create MFEM Test
What does it take to create a FLiT test from an MFEM example?
Let’s find out!
Exercise 1 - Create MFEM Test
Let’s look at the test for MFEM example #13: tests/Mfem13.cpp
Module-FLiT $ cd exercise-1
exercise-1 $ vim tests/Mfem13.cpp
or
exercise-1 $ pygmentize tests/Mfem13.cpp | cat -n
Further details skipped; provided at the end of this slide deck
Exercise 1 - Run the MFEM Test
Each command has a script.
Run the script or the command from the slide - your choice
Exercise 1 - ./step-01.sh
exercise-1 $ flit update
Creating ./Makefile
Exercise 1 - ./step-02.sh
(takes about 1 minute)
exercise-1 $ make runbuild -j1
mkdir obj/gt
/home/user1/Module-FLiT/packages/mfem/linalg/densemat.cpp -> obj/gt/densemat.cpp.o
main.cpp -> obj/gt/main.cpp.o
tests/Mfem13.cpp -> obj/gt/Mfem13.cpp.o
Building gtrun
mkdir bin
mkdir obj/GCC_ip-172-31-8-101_FFAST_MATH_O3
/home/user1/Module-FLiT/packages/mfem/linalg/densemat.cpp -> obj/GCC_ip-172-31-8[...]
[...]
Exercise 1 - ./step-02.sh
A reminder about what is going on here...
Exercise 1 - ./step-03.sh
(takes about 1 minute)
exercise-1 $ make run -j1
mkdir results
gtrun -> ground-truth.csv
results/GCC_ip-172-31-8-101_FFAST_MATH_O3-out -> results/GCC_ip-172-31-8-101_FFA[...]
results/GCC_ip-172-31-8-101_FUNSAFE_MATH_OPTIMIZATIONS_O3-out -> results/GCC_ip-[...]
results/GCC_ip-172-31-8-101_MFMA_O3-out -> results/GCC_ip-172-31-8-101_MFMA_O3-o[...]
results/CLANG_ip-172-31-8-101_FFAST_MATH_O3-out -> results/CLANG_ip-172-31-8-101[...]
results/CLANG_ip-172-31-8-101_FUNSAFE_MATH_OPTIMIZATIONS_O3-out -> results/CLANG[...]
results/CLANG_ip-172-31-8-101_MFMA_O3-out -> results/CLANG_ip-172-31-8-101_MFMA_[...]
[...]
Exercise 1 - Analyze Results
Let us look at the generated results
They are in the results/ directory
Exercise 1 - ./step-04.sh
Creates results.sqlite
exercise-1 $ flit import results/*.csv
Creating results.sqlite
Importing results/CLANG_yoga-manjaro_FFAST_MATH_O3-out-comparison.csv
Importing results/CLANG_yoga-manjaro_FUNSAFE_MATH_OPTIMIZATIONS_O3-out-comparison.csv
Importing results/CLANG_yoga-manjaro_MFMA_O3-out-comparison.csv
Importing results/GCC_yoga-manjaro_FFAST_MATH_O3-out-comparison.csv
Importing results/GCC_yoga-manjaro_FUNSAFE_MATH_OPTIMIZATIONS_O3-out-comparison.csv
Importing results/GCC_yoga-manjaro_MFMA_O3-out-comparison.csv
Exercise 1 - ./step-05.sh
Two tables in the database:
exercise-1 $ sqlite3 results.sqlite
SQLite version 3.28.0 2019-04-16 19:49:53
Enter ".help" for usage hints.
sqlite> .tables
runs tests
sqlite> .headers on
sqlite> .mode column
sqlite> select * from runs;
id rdate label
---------- -------------------------- ------------------
1 2019-07-08 23:05:19.358055 First FLiT Results
Exercise 1 - ./step-06.sh
One compilation had 193% relative error!
The others had no error.
Now to find the sites in the source code
sqlite> select compiler, optl, switches, comparison, nanosec from tests;
compiler optl switches comparison nanosec
----------- ---------- ----------- ---------- ----------
clang++-6.0 -O3 -ffast-math 0.0 2857386994
clang++-6.0 -O3 -funsafe-ma 0.0 2853588952
clang++-6.0 -O3 -mfma 0.0 2858789982
g++-7 -O3 -ffast-math 0.0 2841191528
g++-7 -O3 -funsafe-ma 0.0 2868636192
g++-7 -O3 -mfma 193.007351 2797305220
sqlite> .q
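The same check can be scripted. The sketch below uses Python's standard `sqlite3` module on an in-memory stand-in for results.sqlite, populated with the six rows shown above. The stand-in table holds only the five queried columns (the real `tests` table may have more), and the switch names are written out in full as they appear in flit-config.toml.

```python
import sqlite3

rows = [
    ("clang++-6.0", "-O3", "-ffast-math", 0.0, 2857386994),
    ("clang++-6.0", "-O3", "-funsafe-math-optimizations", 0.0, 2853588952),
    ("clang++-6.0", "-O3", "-mfma", 0.0, 2858789982),
    ("g++-7", "-O3", "-ffast-math", 0.0, 2841191528),
    ("g++-7", "-O3", "-funsafe-math-optimizations", 0.0, 2868636192),
    ("g++-7", "-O3", "-mfma", 193.007351, 2797305220),
]

# In-memory stand-in for results.sqlite (assumed, simplified schema).
db = sqlite3.connect(":memory:")
db.execute(
    "create table tests "
    "(compiler text, optl text, switches text, comparison real, nanosec integer)"
)
db.executemany("insert into tests values (?, ?, ?, ?, ?)", rows)

# Keep only the compilations whose result differs from the ground truth.
variable = db.execute(
    "select compiler, optl, switches, comparison from tests "
    "where comparison > 0.0 order by comparison desc"
).fetchall()
for compiler, optl, switches, comparison in variable:
    print(f"{compiler} {optl} {switches}: {comparison}")
```

To run it against the real database, replace `":memory:"` with the path to results.sqlite and drop the table-creation and insert statements.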
Exercise 2
exercise-1 $ cd ../exercise-2
Exercise 2 - FLiT Bisect
We want to find the file(s)/function(s) where FMA caused 193% relative error
Compilation: g++-7 -O3 -mfma
Exercise 2 - ./step-07.sh
What’s Different?
exercise-2 $ diff -u ../exercise-1/custom.mk ./custom.mk
--- ../exercise-1/custom.mk 2019-07-01 16:09:39.239923037 -0600
+++ custom.mk 2019-07-01 16:07:41.090571010 -0600
@@ -17,9 +17,15 @@
#SOURCE += $(wildcard ${MFEM_SRC}/linalg/*.cpp)
#SOURCE += $(wildcard ${MFEM_SRC}/mesh/*.cpp)
-# just the one source file to see there is a difference
SOURCE += ${MFEM_SRC}/linalg/densemat.cpp # where the bug is
+# a few more files to make the search space a bit more interesting
+SOURCE += ${MFEM_SRC}/linalg/matrix.cpp
+SOURCE += ${MFEM_SRC}/fem/gridfunc.cpp
+SOURCE += ${MFEM_SRC}/fem/linearform.cpp
+SOURCE += ${MFEM_SRC}/mesh/point.cpp
+SOURCE += ${MFEM_SRC}/mesh/quadrilateral.cpp
+
CC_REQUIRED += -I${MFEM_SRC}
CC_REQUIRED += -I${MFEM_SRC}/examples
CC_REQUIRED += -isystem ${HYPRE_SRC}/src/hypre/include
Exercise 2 - ./step-08.sh
Again, we need to regenerate the Makefile
Before we bisect, remember which compilation caused a problem:
g++-7 -O3 -mfma
exercise-2 $ flit update
Creating ./Makefile
Exercise 2 - ./step-09.sh
(takes approximately 1 minute 30 seconds)
exercise-2 $ flit bisect --precision=double "g++-7 -O3 -mfma" Mfem13
Updating ground-truth results - ground-truth.csv - done
Searching for differing source files:
Created ./bisect-04/bisect-make-01.mk - compiling and running - score 193.00735125466363
Created ./bisect-04/bisect-make-02.mk - compiling and running - score 193.00735125466363
Created ./bisect-04/bisect-make-03.mk - compiling and running - score 0.0
Created ./bisect-04/bisect-make-04.mk - compiling and running - score 193.00735125466363
Found differing source file /home/user1/Module-FLiT/packages/mfem/linalg/densemat.cpp: score 193.00735125466363
[...]
All variability inducing symbols:
/home/user1/Module-FLiT/packages/mfem/linalg/densemat.cpp:3692 _ZN4mfem13AddMult_a_AAtEdRKNS_11DenseMatrixERS0_ -- mfem::AddMult_a_AAt(double, mfem::DenseMatrix const&, mfem::DenseMatrix&) (score 193.00735125466363)
Exercise 2 - Bisect Details
First locate variability files
Approach: combine object files from the two compilations
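The idea of combining object files can be sketched as choosing, per source file, which build directory its object file comes from. The helper name, the `obj/opt` directory, and the file list are hypothetical; `obj/gt` follows the ground-truth directory seen in the build output. This only assembles the list of objects to link and is not FLiT's implementation.

```python
def mixed_objects(sources, from_optimized,
                  baseline_dir="obj/gt", optimized_dir="obj/opt"):
    """Pick each object file from the optimized or the baseline build."""
    objects = []
    for src in sources:
        obj_dir = optimized_dir if src in from_optimized else baseline_dir
        name = src.rsplit("/", 1)[-1]  # file name without its directory
        objects.append(f"{obj_dir}/{name}.o")
    return objects


sources = ["linalg/densemat.cpp", "linalg/matrix.cpp", "fem/gridfunc.cpp"]
objects = mixed_objects(sources, from_optimized={"linalg/densemat.cpp"})
print(objects)
# ['obj/opt/densemat.cpp.o', 'obj/gt/matrix.cpp.o', 'obj/gt/gridfunc.cpp.o']
```

Linking and running such a mixed executable shows whether the files taken from the optimized build are enough to reproduce the variability.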
Exercise 2 - Bisect Details
Approach: combine symbols after compilation
Convert function symbols into weak symbols
Downside: Requires recompiling with -fPIC
Exercise 2 - ./step-10.sh
Computes AAt += a * A * A^T
exercise-2 $ cat -n ../packages/mfem/linalg/densemat.cpp | tail -n +3688 | head -n 24
3688 void AddMult_a_AAt(double a, const DenseMatrix &A, DenseMatrix &AAt)
3689 {
3690 double d;
3691
3692 for (int i = 0; i < A.Height(); i++)
3693 {
3694 for (int j = 0; j < i; j++)
3695 {
3696 d = 0.;
3697 for (int k = 0; k < A.Width(); k++)
3698 {
3699 d += A(i,k) * A(j,k);
3700 }
3701 AAt(i, j) += (d *= a);
3702 AAt(j, i) += d;
3703 }
3704 d = 0.;
3705 for (int k = 0; k < A.Width(); k++)
3706 {
3707 d += A(i,k) * A(i,k);
3708 }
3709 AAt(i, i) += a * d;
3710 }
3711 }
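The inner loops have a multiply-then-add shape (`d += A(i,k) * A(j,k)`), which a compiler that is permitted to use FMA instructions may fuse. A fused multiply-add rounds once instead of twice, so the result can differ. The sketch below emulates a fused operation with exact rational arithmetic; the helper name and the input values are chosen for illustration and are not taken from MFEM.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Emulate FMA: compute a*b + c exactly, then round once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


a = 1.0 + 2.0 ** -27
c = -(1.0 + 2.0 ** -26)

separate = a * a + c                 # a*a is rounded before the addition
fused = fused_multiply_add(a, a, c)  # only the final result is rounded

print(separate)  # 0.0
print(fused)     # 5.551115123125783e-17
```

The exact product a*a is 1 + 2^-26 + 2^-54. Rounding it to a double drops the 2^-54 term, so the separate version cancels to zero while the fused version keeps the small remainder.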
CONCLUDING REMARKS
Ongoing Work
How much faster? Good-enough accuracy to back off a bad optimization in a timely way?
Advanced exercises now follow (including LULESH)
Exercise 3
exercise-2 $ cd ../exercise-3
Exercise 3 Application: LULESH
Goal: explore more FLiT Bisect functionality
Exercise 3 - ./step-11.sh
Five compilations show variability.
Let’s investigate!
exercise-3 $ sqlite3 results.sqlite
SQLite version 3.22.0 2018-01-22 18:45:57
Enter ".help" for usage hints.
sqlite> .headers on
sqlite> .mode column
sqlite> select compiler, optl, switches, comparison, nanosec from tests;
compiler optl switches comparison nanosec
----------- ---------- ----------------- -------------------- ----------
clang++-6.0 -O3 -freciprocal-math 5.52511478433538e-05 432218541
clang++-6.0 -O3 -funsafe-math-opt 5.52511478433538e-05 432185456
clang++-6.0 -O3 0.0 433397072
g++-7 -O3 -freciprocal-math 5.52511478433538e-05 441362811
g++-7 -O3 -funsafe-math-opt 7.02432004920159 436202864
g++-7 -O3 -mavx2 -mfma 1.02330009691563 416599918
g++-7 -O3 0.0 432654778
sqlite> .q
Exercise 3 - ./step-12.sh
Nothing surprising here...
exercise-3 $ flit update
Creating ./Makefile
Exercise 3 - ./step-13.sh
(takes approximately 3 min 10 sec)
Automatically runs bisect on every row with comparison > 0.0
Let’s look at the Bisect algorithm
exercise-3 $ flit bisect --auto-sqlite-run results.sqlite --parallel=1 --jobs=1
Before parallel bisect run, compile all object files
(1 of 5) clang++ -O3 -freciprocal-math: done
(2 of 5) clang++ -O3 -funsafe-math-optimizations: done
(3 of 5) g++ -O3 -freciprocal-math: done
(4 of 5) g++ -O3 -funsafe-math-optimizations: done
(5 of 5) g++ -O3 -mavx2 -mfma: done
Updating ground-truth results - ground-truth.csv - done
Run 1 of 5
flit bisect --precision double "clang++ -O3 -freciprocal-math" LuleshTest
Updating ground-truth results - ground-truth.csv - done
Searching for differing source files:
[...]
How to Perform the Search
Assumption 1: errors do not exactly cancel
Assumption 2: variability sites act alone
Bisect Algorithm
Exercise 3 - ./step-14.sh
Results are placed in a CSV file for easy access
exercise-3 $ head -n 3 auto-bisect.csv
testid,bisectnum,compiler,optl,switches,precision,testcase,type,name,return
1,1,clang++,-O3,-freciprocal-math,double,LuleshTest,completed,"lib,src,sym",0
1,1,clang++,-O3,-freciprocal-math,double,LuleshTest,src,"('tests/LuleshTest.cpp', 0.33294020544031533)",0
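A short sketch of reading these rows with Python's standard `csv` module. The embedded text is the three lines shown above, standing in for auto-bisect.csv. Treating the `name` column of `src` rows as a Python-style (path, score) tuple is an assumption based on how those lines look.

```python
import ast
import csv
import io

# The three lines shown above, standing in for auto-bisect.csv.
text = """testid,bisectnum,compiler,optl,switches,precision,testcase,type,name,return
1,1,clang++,-O3,-freciprocal-math,double,LuleshTest,completed,"lib,src,sym",0
1,1,clang++,-O3,-freciprocal-math,double,LuleshTest,src,"('tests/LuleshTest.cpp', 0.33294020544031533)",0
"""

findings = []
for row in csv.DictReader(io.StringIO(text)):
    if row["type"] == "src":
        # Assumed format: the name column holds "(path, score)".
        path, score = ast.literal_eval(row["name"])
        findings.append((row["compiler"], row["switches"], path, score))

print(findings)
```

To process the real file, open auto-bisect.csv and pass the file object to `csv.DictReader` instead of the `io.StringIO` wrapper.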
Exercise 3 - Bonus
Exercise 3 - efficiency
The 4th run (from auto-run) took 34 compilation / run steps.
Can we do better?
What if we only want the top contributing function?
Run 4 of 5
flit bisect --precision double "g++ -O3 -funsafe-math-optimizations" LuleshTest
[...]
All variability inducing symbols:
../packages/LULESH/lulesh-init.cc:16 _ZN6DomainC1Eiiiiiiiii -- Domain::Domain(int, int, int, int, int, int, int, int, int) (score 2.3302358973548727)
../packages/LULESH/lulesh-init.cc:219 _ZN6Domain9BuildMeshEiii -- Domain::BuildMesh(int, int, int) (score 1.4315005606175104)
../packages/LULESH/lulesh.cc:1362 _Z14CalcElemVolumePKdS0_S0_ -- CalcElemVolume(double const*, double const*, double const*) (score 0.9536115035892543)
../packages/LULESH/lulesh.cc:1507 _Z22CalcKinematicsForElemsR6Domaindi -- CalcKinematicsForElems(Domain&, double, int) (score 0.665781828022106)
../packages/LULESH/lulesh.cc:2651 _Z11lulesh_mainiPPc -- lulesh_main(int, char**) (score 0.3328909140110529)
Exercise 3 - ./step-15.sh
exercise-3 $ flit bisect --biggest=1 --precision=double "g++-7 -O3 -funsafe-math-optimizations" LuleshTest
Updating ground-truth results - ground-truth.csv - done
Looking for the top 1 different symbol(s) by starting with files
[...]
Found differing source file ../packages/LULESH/lulesh-init.cc: score 3.7609285311270604
Searching for differing symbols in: ../packages/LULESH/lulesh-init.cc
[...]
Found differing symbol on line 16 -- Domain::Domain(int, int, int, int, int, int, int, int, int) (score 2.3302358973548727)
[...]
Created ./bisect-06/bisect-make-20.mk - compiling and running - score 0.022750390077923448
Found differing source file tests/LuleshTest.cpp: score 0.022750390077923448
[...]
The 1 highest variability symbol:
../packages/LULESH/lulesh-init.cc:16 _ZN6DomainC1Eiiiiiiiii -- Domain::Domain(int, int, int, int, int, int, int, int, int) (score 2.3302358973548727)
Thank You!
Questions?
pruners.github.io/flit
Details of test creation now follow
Exercise 1 - Create MFEM Test
Things to notice:
// Redefine main() to avoid name clash. This is the function we will test
#define main mfem_13p_main
#include "ex13p.cpp"
#undef main
// Register it so we can use it in call_main() or call_mpi_main()
FLIT_REGISTER_MAIN(mfem_13p_main);
tests/Mfem13.cpp
Exercise 1 - Create MFEM Test
template <typename T>
class Mfem13 : public flit::TestBase<T> {
public:
  Mfem13(std::string id) : flit::TestBase<T>(std::move(id)) {}
  virtual size_t getInputsPerRun() override { return 0; }
  virtual std::vector<T> getDefaultInput() override { return { }; }

  virtual long double compare(const std::vector<std::string> &ground_truth,
                              const std::vector<std::string> &test_results) const override {
    [...]
  }
tests/Mfem13.cpp
Exercise 1 - Create MFEM Test
// Only implement the test for double precision
template<>
flit::Variant Mfem13<double>::run_impl(const std::vector<double> &ti) {
  FLIT_UNUSED(ti);

  // Run in a temporary directory so output files don't clash
  std::string start_dir = flit::curdir();
  flit::TempDir exec_dir;
  flit::PushDir pusher(exec_dir.name());
tests/Mfem13.cpp
Exercise 1 - Create MFEM Test
  // Run the example's main under MPI
  auto meshfile = flit::join(start_dir, "data", "beam-tet.mesh");
  auto result = flit::call_mpi_main(
      mfem_13p_main,
      "mpirun -n 1 --bind-to none",
      "Mfem13",
      "--no-visualization --mesh " + meshfile);
tests/Mfem13.cpp
Exercise 1 - Create MFEM Test
  // Output debugging information
  std::ostream &out = flit::info_stream;
  out << id << " stdout: " << result.out << "\n";
  out << id << " stderr: " << result.err << "\n";
  out << id << " return: " << result.ret << "\n";
  out.flush();

  if (result.ret != 0) {
    throw std::logic_error("Failed to run my main correctly");
  }
tests/Mfem13.cpp
Exercise 1 - Create MFEM Test
  // We will be returning a vector of strings that hold the mesh data
  std::vector<std::string> retval;
  [...]
  // Return the mesh and mode files as strings
  return flit::Variant(retval);
tests/Mfem13.cpp
Exercise 1 - Create MFEM Test
Finally, we register the test class with FLiT
REGISTER_TYPE(Mfem13)
tests/Mfem13.cpp
Now, let’s look at the FLiT configuration
It describes the compilers and the search space
exercise-1 $ vim flit-config.toml
Exercise 1 - FLiT Configuration
[run]
enable_mpi = true
flit-config.toml
Exercise 1 - FLiT Configuration
Defines the compilations for make dev and make gt
[dev_build]
compiler_name = 'g++'
optimization_level = '-O3'
switches = '-mavx2 -mfma'

[ground_truth]
compiler_name = 'g++'
optimization_level = '-O2'
switches = ''
flit-config.toml
Exercise 1 - FLiT Configuration
[[compiler]]
binary = 'g++-7'
name = 'g++'
type = 'gcc'
optimization_levels = [
  '-O3',
]
switches_list = [
  '-ffast-math',
  '-funsafe-math-optimizations',
  '-mfma',
]
flit-config.toml
Exercise 1 - FLiT Configuration
[[compiler]]
binary = 'clang++-6.0'
name = 'clang++'
type = 'clang'
optimization_levels = [
  '-O3',
]
switches_list = [
  '-ffast-math',
  '-funsafe-math-optimizations',
  '-mfma',
]
flit-config.toml
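The two [[compiler]] sections define the search space. The sketch below enumerates it under the assumption that every optimization level is paired with every switch for each compiler, which matches the six rows seen in the results table. The dictionaries copy the values from the configuration rather than parsing the TOML file.

```python
from itertools import product

# Values copied from the two [[compiler]] sections of flit-config.toml.
compilers = [
    {"binary": "g++-7",
     "optimization_levels": ["-O3"],
     "switches_list": ["-ffast-math", "-funsafe-math-optimizations", "-mfma"]},
    {"binary": "clang++-6.0",
     "optimization_levels": ["-O3"],
     "switches_list": ["-ffast-math", "-funsafe-math-optimizations", "-mfma"]},
]

# One compilation per (compiler, optimization level, switch) combination.
compilations = [
    f"{c['binary']} {optl} {switch}"
    for c in compilers
    for optl, switch in product(c["optimization_levels"], c["switches_list"])
]
for line in compilations:
    print(line)
print(len(compilations))  # 6
```

Adding one more optimization level to each compiler would double the number of compilations, which is how the search space grows to hundreds of combinations.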
Exercise 1 - Makefile Configuration
A second configuration file: custom.mk
exercise-1 $ vim custom.mk
Exercise 1 - Makefile Configuration
PACKAGES_DIR := $(abspath ../packages)
MFEM_SRC     := $(PACKAGES_DIR)/mfem
HYPRE_SRC    := $(PACKAGES_DIR)/hypre
METIS_SRC    := $(PACKAGES_DIR)/metis-4.0

SOURCE       :=
SOURCE       += $(wildcard *.cpp)
SOURCE       += $(wildcard tests/*.cpp)

# Compiling all sources of MFEM into the tests takes too long for a tutorial
# skip it. Instead, we link in the MFEM static library
#SOURCE       += $(wildcard ${MFEM_SRC}/fem/*.cpp)
#SOURCE       += $(wildcard ${MFEM_SRC}/general/*.cpp)
#SOURCE       += $(wildcard ${MFEM_SRC}/linalg/*.cpp)
#SOURCE       += $(wildcard ${MFEM_SRC}/mesh/*.cpp)

# just the one source file to see there is a difference
SOURCE       += ${MFEM_SRC}/linalg/densemat.cpp # where the bug is
custom.mk
Exercise 1 - Makefile Configuration
CC_REQUIRED  += -I${MFEM_SRC}
CC_REQUIRED  += -I${MFEM_SRC}/examples
CC_REQUIRED  += -isystem ${HYPRE_SRC}/src/hypre/include

LD_REQUIRED  += -L${MFEM_SRC} -lmfem
LD_REQUIRED  += -L${HYPRE_SRC}/src/hypre/lib -lHYPRE
LD_REQUIRED  += -L${METIS_SRC} -lmetis
custom.mk
That’s all there is to it
Let’s run it!