Implement a basic block profiler and an edge profiler, and profile information loader for each of them.
Profiling is a form of dynamic program analysis that measures dynamic behaviors of a program. Dynamic analysis is the testing and evaluation of an application during runtime, while static analysis is the testing and evaluation of an application by examining the code without executing the application. Counting the number of instructions existing in a compiled binary file is one example of static analysis, while counting the number of executed instructions when you actually run the program belongs to the category of dynamic analysis.
Code instrumentation is one way of doing profiling. Programmers can insert additional auxiliary instructions that do certain tasks when they are being executed. For example, if you want to count the number of ‘add’ instructions executed at runtime, you can insert an auxiliary instruction that increments the ‘add’ counter after each ‘add’ instruction existing in the program. Then the counter will be incremented whenever an ‘add’ instruction is executed.
In all the previous assignments, source files in one of subdirectories in tools/ are linked together with other library source files in various subdirectories of lib/ to produce an executable file (lexer, parser, tchecker, and codegen). From this assignment, you will implement an LLVM Pass, which is basically a shared library file that can be loaded at runtime from opt tool. opt takes either the LLVM assembly language format (.ll) or the LLVM bitcode format (.bc) as input and the LLVM bitcode format (.bc) as output. (.ll file can be the output format too if you use -S option) opt transforms the LLVM bitcode by applying several LLVM passes to a given input.
Some passes perform the transformations and optimizations that make up the compiler, and others do not modify the program itself but build the analysis results that are used by other transformations. An LLVM pass will be compiled as a shared library (.so), which can be loaded from opt if you specify the corresponding command-line argument.
Let’s use opt using a simple program. This is the example program we saw in the Assignment 4:
#include <stdio.h> int main() { int i = 0, sum = 0; for (i = 0; i <= 10; i++) sum += i; printf("%d\n", sum); return 0; } |
Compile this program to LLVM IR using this command:
$ clang -O0 -emit-llvm -S sum.c |
This will produce unoptimized sum.ll file. All local variables are stored on the stack even if their addresses are not taken. Let’s apply -mem2reg pass, which promotes memory references to be register references.
$ opt -mem2reg -S -o sum_mem2reg.ll sum.ll |
opt can take either .ll or .bc file as an input. Here the output file is also in .ll format, but if we don’t use -S option, the output will be the bitcode format (.bc). These two formats are semantically equivalent, so you can think of .ll file as a readable representation of .bc file. llvm-dis and llvm-as can convert .bc to .ll and vice versa respectively.
Anyway, let’s compare the original unoptimized sum.ll and the sum_mem2reg.ll file we just created. You will notice all those local variables located on the stack using alloca instructions are now in registers, which can be much faster.
The execution order of passes basically follows their specified order in the command line arguments. So, if we give options as follows:
$ opt -somepass -someotherpass ... |
-somepass will be applied before -someotherpass. But the whole order of execution of passes can be changed by other factors too and is ultimately controlled by the LLVM pass manager. On the other hand, you can use some predefined options such as -O0 ~ -O3; they are not pass names themselves. If you use these, up to hundreds of predefined set of passes will be applied based on the specified options. The list of some of passes included in LLVM is here (Note that it is not a complete list of passes).
-mem2reg and many other passes are already included in LLVM’s standard pass set, so you can directly use the their command-line options to opt. But you need to use -load options before the option name to load a specific .so file to use non-standard pass you create. For example, if you create a pass named -mypass which is in MyDir/MyPass.so opt, the options will be:
$ opt -load MyDir/MyPass.so -mypass ... |
Now it is time to learn how to write your own pass. Read this Writing an LLVM pass document; understanding this is crucial for this assignment. You don’t need to know about analysis groups in detail for now though. ‘hello’ example pass in the tutorial is in $LLVM_SRC_ROOT/lib/Transforms/Hello for you to see what it’s like, although it is not included in the standard set of LLVM passes.
Implement a Basic Block / CFG Edge Profiler and Profiler Information Loader
From this assignment, you are not going to make a new tool (=executable), so you don’t need to copy or symlink fun.l and fun.y anymore.
Adding counter-incrementing instructions in EdgeProfiler is trickier than the case of BBProfiler, because CFG edges are not something that actually exist in LLVM bitcode. In a function, if you can go from basic block BB1 to basic block BB2 in a program execution (either by a branch instruction or a fall-through), it is considered that there is an edge from BB1 to BB2. But a counter-incrementing instruction should belong to some basic block, because it is an instruction itself.
We will present you an algorithm to do this here. But to make this algorithm work, there should be no critical edges in the program. A critical edge is an edge which is neither the only edge leaving its source block, nor the only edge entering its destination block.
These edges must be split: a new block must be created in the middle of the edge, in order to insert computations on the edge without affecting any other edges. Even though this is a simple process, fortunately, this is not what you have to do in this assignment; -break-crit-edges pass in the LLVM will do that for you. As long as we use our passes after -break-crit-edges, we can assume there are no critical edges in an input program.
Assuming that there are no critical edges left, the algorithm can be as follows:
for each basic block BB for each unique successor SUCC of BB EDGE = CFG edge from BB to SUCC if (SUCC is the only successor of BB) insert a counter-incrementing inst for EDGE at the end of BB (but before a terminator inst) else insert a counter-incrementing inst for EDGE at the start of SUCC (but after all PHI insts) |
Think about why this works for the edge profiler and why you need critical edges to be split to make this work, and include your reasoning in your README.
What you basically have to do is to implement BBProfiler::runOnModule and EdgeProfiler::runOnModule methods. Of course you can add any additional member/non-member functions or variables as long as you implement the given functionalities correctly. You can modify BBProfiler.cpp/BBProfiler.h and EdgeProfiler.cpp/EdgeProfiler.h as much as you wish to implement them, but don’t delete predefined command-line options or methods; deleting them can result in compilation or autograder failures.
If you want to separate files to add some utility/helper functions required from both classes, lib/Profile/ProfileUtils.cpp and lib/Profile/ProfileUtils.h are places for that. In case you add functions to these files, you should submit these to the dropbox too. Please don’t make any more additional files; you can’t submit those other files.
One more thing you need to do is you need to set ‘numBBs’ in BBProfiler and ‘numEdges’ in EdgeProfiler to correct values. Refer to their STATISTIC(...) definitions in cpp files. They are macros to collect statistics for a pass; this section in the Programmer’s Manual explains what they are and why we need them. You can treat both of them as integers and assign values to them or increment their values freely.
You should implement BBProfileLoader::getCount, EdgeProfileLoader::getCount, EdgeProfileLoader::getWeight functions using your loaded information. getCount function returns the execution count for the given basic block or CFG edge, and getWeight function returns the weight for the given edge. If an edge goes from BB1 to BB2, the weight of a single edge is computed as:
Edge BB1->BB2’s weight = execution count for the edge BB1->BB2 / sum of execution counts for all edges from BB1 |
For example, if BB1 has three successors BB2, BB3, and BB4, and BB1->BB2 is executed 10 times, BB1->BB3 5 times, BB1->BB4 5 times, the weight of BB1->BB2 is 10 / (10+5+5) = 0.5. The possible range of a weight is [0, 1] (includes both 0 and 1).
These functions will be the interface between the loaders and the profile information printer, the program that gets information from BB/edge profile loaders you implement and prints the information to a text file. The profile information printer is implemented in lib/Profile/ProfilePrinter.cpp/h, and these are not the files you need to modify. One thing to note is you should not modify the given bitcode files in your loaders.
As in BB/Edge profilers, you need to set ‘numBBs’ and ‘numEdges’ variables in BBProfileLoader and EdgeProfileLoaders classes too.
If your program compiles successfully, there will be Debug+Asserts/lib/Profile.so. Now you can use your library to instrument any LLVM bitcode file (.bc) or LLVM assembly file (.ll), as long as they contain all the functions of the program. Let’s first test with our proverbial all.fun.
Before applying our instrumentation passes, we need to apply -break-crit-edges first to remove all critical edges. Here we name critical-edge-removed file ‘bench.ll’.
This creates an instrumented LLVM assembly code bench.prof.ll. -stats does not affect the execution of the pass, but shows you execution stats and you can see if your stats counting is working correctly. -debug-only=cfg-profiler will print debug messages enclosed in DEBUG macro only in your profiler/loader passes. Here we used -bb-profiler and -edge-profiler together so your instrumented code will contain instrumentations for both of them. To enable this, please make sure the names of global counter arrays in two profilers are different. You can test each of them separately when you are debugging, but your passes should work together in this way too.
You can compile and run this program:
If your program runs correctly, you will notice there is bb_info.prof and edge_info.prof generated in your current directory. If you want to use file names other than bb_info.prof/edge_info.prof, you can give that as command-line options (refer to options in .cpp files). You can also specify other output .ll filename using -o option.
You can even prepend -break-crit-edges to this option, but we don’t recommend it because explicitly generating critical-edge-removed .ll files will be much convenient for your debugging.
This will first run your BBProfileLoader and EdgeProfileLoader to read the information in bb_info.prof and edge_info.prof, and run ProfilePrinter to print that information to a file named prof_dump.txt. -bb-profile-loader and -edge-profile-loader options can actually be omitted here because ProfilePrinter pass ‘requires’ both of them to be run before it; see ProfilePrinter::getAnalysis method for why it works. This section provides a background explanation. And we have disabled the output because we are not modifying a program at all in all these loaders and printers.
The filename ‘prof_dump.txt’ can also be changed by a command-line option defined in ProfilePrinter.cpp.
‘prof_dump.txt’ contains a human-readable representation of the information loaded by your basic block/edge profile information loaders. Check if the information written in prof_dump.txt is correct.
But as you have noticed, all these commands are very tedious to enter manually every time. We provide you with three benchmarks: sum (simple program that sums the numbers from 1 to N), and wc, and yacc.
Here replace BENCH_NAME with a real directory name. This will do all the steps listed above for you - This will merge source files into one, break critical edges, instrument the program using your profilers, execute the instrumented program and check if the result is correct, and then run the profile information printer to generate ‘prof_dump.txt’. You can do ‘make clean’ to clean all the generated files.
Now you can even add several more programs to benches/ directory to test. What you need to do is make a similar directory structure as sum/ or wc/, and make benches/BENCH_NAME/exec_info file containing the command-line arguments and reference output.
It is totally OK if you have not modified some header files among these. But please submit all these files anyway.
And in case you modified ProfileUtils.cpp and ProfileUtils.h, you should submit these too.
Below is the list of miscellaneous tips and tricks. The listed order is not the order of importance.
Note that this is only applicable if you use the LLVM version 3.5 and compile the code with the same options as in the provided Makefile.
You can find all the related documentations in the LLVM 3.5 documentation page.
Available from: Tuesday, 2 April 2015
Due date: Tuesday, 21 April 2015, midnight