1 of 33

Working with LLVM passes

Dan Liew

Software Reliability Group

11th June 2014

2 of 33

Getting the sources

I spent quite a bit of time preparing source code that you guys can use to easily play with this stuff. You can find it at

https://github.com/delcypher/srg-llvm-pass-tutorial

You need to LLVM/Clang 3.5 (not released yet so just build from trunk). For your reference I built against r210103

3 of 33

Disclaimer!

  • I am far from being an expert on passes in LLVM!
  • The stuff I’m showing today is stuff I’ve learnt whilst working on LLVM based projects

4 of 33

LLVM passes

Pass

BasicBlockPass

CallGraphSCCPass

FunctionPass

LoopPass

RegionPass

ModulePass

  • Each pass type will iterate over its particular area (e.g. Function pass iterates over Functions that have a body within a module)
  • Passes generally aren’t “supposed” to modify stuff outside the thing they are currently visiting.

5 of 33

Analysis passes

  • These passes (AFAIK) compute various properties on the module without modifying it
  • Other passes can be dependent on these analyses
  • Look in include/llvm/Analysis to find various passes
  • Examples include FindUsedTypes and PostDominatorTree

6 of 33

Transformation passes

  • As the name suggests these passes actually modify the module
  • See the header files in include/llvm/Transforms
  • Examples include DeadInstrElimination, LoopUnrollPass and GlobalDCE

7 of 33

Let’s have a quick play with opt

  • opt is a convenient tool for running passes on LLVM modules.
  • I’ll demonstrate the Internalize and GlobalDCE passes

8 of 33

Stripping out dead functions

#include <stdio.h>

void foo()

{

printf("I'm foo\n");

}

int main()

{

printf("I'm main\n");

return 0;

}

  • Consider the simple program on the left. The function foo() is clearly not reachable. How can we can use the GlobalDCE and Internalize passes to remove it?
  • We’ll run the internalize pass first to internalize every function except main()
  • We’ll then run Global dead code elimination (GlobalDCE)

9 of 33

Stripping out dead functions

$ clang -emit-llvm -S -O0 simple.c

$ cat simple.ll

...

$ opt simple.ll -S -debug-pass=Structure -internalize -internalize-public-api-list=main -globaldce

10 of 33

Other fun things we can do with opt

#include <stdio.h>

int main()

{

for (int index=0; index < 5; ++index)

printf("I'm main\n");

return 0;

}

Let’s write a slightly more complex program so our graphs aren’t boring!

$ clang -S -emit-llvm -O0 loop.c

11 of 33

Control flow graph

$ opt -view-cfg loop.ll

This looks like the head of a loop. Let’s check the dominator tree to check if we have a back-edge (head dominates the tail)

12 of 33

Dominator Tree

$ opt -view-dom loop.ll

Yep we have a

back-edge :)

13 of 33

Region graph

$ opt -view-regions loop.ll

Regions are a set of basic blocks in the CFG that from outside the region look like they have single entry and single exit edges.

14 of 33

Other graphs that opt can produce

  • Post dominance tree
  • Call graph

15 of 33

Quick demo of this

See visualisationDemos/*.sh

16 of 33

Writing a simple function pass

  • I’ve taken the source code for an example Function pass from the LLVM source tree (lib/Transforms/Hello/)
  • We can build it and use it as a plug-in to opt

17 of 33

The code...

STATISTIC(HelloCounter, "Counts number of functions greeted");

namespace {

// Hello - The first implementation, without getAnalysisUsage.

struct Hello : public FunctionPass {

static char ID; // Pass identification, replacement for typeid

Hello() : FunctionPass(ID) {}

bool runOnFunction(Function &F) override {

++HelloCounter;

errs() << "Hello: ";

errs().write_escaped(F.getName()) << '\n';

return false;

}

};

}

char Hello::ID = 0;

static RegisterPass<Hello> X("hello", "Hello World Pass");

This is a function pass

Inform LLVM’s PassManager that we didn’t change anything

Just a bit of boilerplate code

Declare an unsigned int whose value can be shown using -stats in opt

18 of 33

Demo

See helloPass/run.sh

19 of 33

Quick aside: LLVM’s PassManager

  • I won’t demo this but there are uses of this in many projects including KLEE (/lib/Module/Optimize.cpp)
  • It tries to efficiently organise the order that passes are run so Analysis passes are not executed more than necessary.
  • For example our HelloPass doesn’t modify anything so that means there is no need to re-compute the dominator information if a subsequent pass uses it
  • This is the intended way of running passes in your own LLVM based tools

20 of 33

Quick aside: LLVM’s PassManager

21 of 33

Using analyses in your own passes

  • Passes can (and should) explicitly say what analyses they depend on
  • I have an example taken from Bugle (LLVM -> Boogie translator used in GPUVerify) with a few minor modifications.
  • Let’s take a look at the code...

22 of 33

Call graph cycle detection

...

bool runOnModule(llvm::Module &M) override {

CallGraph &CG = getAnalysis<CallGraphWrapperPass>().getCallGraph();

scc_iterator<CallGraph *> i = scc_begin(&CG), e = scc_end(&CG);

bool hasCycle = false;

// Do cycle detect stuff (see the real source code)

return false;

}

virtual void getAnalysisUsage(AnalysisUsage &AU) const override {

AU.setPreservesAll();

// We are explicitly stating that we require this analysis

AU.addRequired<llvm::CallGraphWrapperPass>();

}

};

Depend on CallGraph analysis

Get the CallGraph analysis results

23 of 33

Demo

See usingAnalyses/run.sh

24 of 33

Using the IRBuilder

I don’t alway inline but when I do, I do it by hand...

25 of 33

IRBuilder

  • This is a convenience class in LLVM for adding instructions before a particular instruction or at the end of a basic block
  • If you need to add several instructions in a pass this is the way to do it!
  • This wasn’t documented in LLVM for some reason but I fixed this in r210354

26 of 33

Example: Replace get_global_id

I’m going to make the following transformation.

get_global_id(x) ==

get_local_id(x) + get_group_id(x)*get_local_size(x)

I implemented a pass that replaces all call to get_global_id() to the equivalent form above. Effectively inlining...

27 of 33

Transform

define void @foo(i32* nocapture %A, i32 %x) #0 {

entry:

%call = tail call i32 @get_global_id(i32 %x) #2

%add = add nsw i32 %call, 1

%call1 = tail call i32 @get_global_id(i32 0) #2

%idxprom = sext i32 %call1 to i64

%arrayidx = getelementptr inbounds i32* %A, i64 %idxprom

store i32 %add, i32* %arrayidx, align 4, !tbaa !2

ret void

}

define void @foo(i32* nocapture %A, i32 %x) #0 {

entry:

%rpl.ggi.0. = call i32 @get_group_id(i32 %x)

%rpl.ggi.1. = call i32 @get_local_size(i32 %x)

%rpl.ggi.2. = mul i32 %rpl.ggi.0., %rpl.ggi.1.

%rpl.ggi.3. = call i32 @get_local_id(i32 %x)

%rpl.ggi.result. = add i32 %rpl.ggi.2., %rpl.ggi.3.

%add = add nsw i32 %rpl.ggi.result., 1

%rpl.ggi.0.1 = call i32 @get_group_id(i32 0)

%rpl.ggi.1.2 = call i32 @get_local_size(i32 0)

%rpl.ggi.2.3 = mul i32 %rpl.ggi.0.1, %rpl.ggi.1.2

%rpl.ggi.3.4 = call i32 @get_local_id(i32 0)

%rpl.ggi.result.5 = add i32 %rpl.ggi.2.3, %rpl.ggi.3.4

%idxprom = sext i32 %rpl.ggi.result.5 to i64

%arrayidx = getelementptr inbounds i32* %A, i64 %idxprom

store i32 %add, i32* %arrayidx, align 4, !tbaa !2

ret void

}

OLD

NEW

28 of 33

The IRBuilder bit...

for(std::vector<CallInst*>::iterator CI = foundCalls.begin(), CIE = foundCalls.end(); CI != CIE; ++CI) {

Value* dimension = (*CI)->getArgOperand(0);

IRBuilder<> Builder(*CI);

CallInst* ggiCall = Builder.CreateCall(get_group_idF, dimension, "rpl.ggi.0.");

CallInst* glsCall = Builder.CreateCall(get_local_sizeF, dimension, "rpl.ggi.1.");

Value* mul = Builder.CreateMul(ggiCall, glsCall, "rpl.ggi.2.");

CallInst* gliCall = Builder.CreateCall(get_local_idF, dimension, "rpl.ggi.3.");

Value* result = Builder.CreateAdd(mul, gliCall,"rpl.ggi.result.");

// Replace all uses with

(*CI)->replaceAllUsesWith(result);

(*CI)->eraseFromParent(); // Finally we can remove the CallInst to get_global_id()

}

29 of 33

Demo

See usingIRBuilder/run.sh

30 of 33

Using your passes in clang itself

  • It is possible to load your pass as a plug-in and have it run inside clang.
  • A tiny bit of boilerplate needs to be added to your pass code so it is automatically registered as a pass to run by default
  • You can then invoke clang like this…

$ clang -Xclang -load -Xclang YOUR_PASS.so ...

31 of 33

Demo

See helloPass/run_pass_in_clang.sh

32 of 33

More, you say?

  • I wanted to write a pass and a small runtime library that would instrument code to check for division by zero at runtime
  • Unfortunately I ran out of time. Coding these examples took longer than I anticipated...
  • So this is left as an exercise for the reader ;)

33 of 33

Thanks for listening