Adding a basic LLVM pass
The SimGrid model checker uses memory introspection (of the heap, stack and global variables) in order to detect the equality of the state of a distributed application at the different nodes of its execution graph. One difficulty is dealing with uninitialised variables. Uninitialised global variables are usually not a big problem as their initial value is 0. Heap variables are dealt with by memsetting to 0 the content of the buffers returned by malloc and friends. The case of uninitialised stack variables is more problematic as their value is whatever was at this place on the stack before. In order to evaluate the impact of those uninitialised variables, we would like to clean each stack frame before using it. This could be done with an LLVM plugin. Here is my first attempt to write an LLVM pass to modify the code of a function.
A solution for this would be to include, at compilation time, instructions to clean the stack frame at the beginning of each function. This could be implemented as an LLVM pass.
This is mostly relevant when the generated code is not optimised. In optimised code, local variables often live in registers and do not need a slot on the stack.
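To see why leftover bytes matter for state comparison, here is a minimal sketch. It is not SimGrid code: `Frame`, `make_frame` and `same_state` are hypothetical names, and the "stack frame" is simulated with a byte buffer so that the behaviour is well defined. Two frames hold the same live variable on top of different leftover bytes and are compared bytewise, roughly as a memory-introspection tool would do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Simulated stack frame: only the first sizeof(int) bytes hold a live
// variable, the remaining bytes are never written by the "function".
using Frame = std::vector<unsigned char>;

// Builds a frame of `size` bytes. The unused bytes contain `leftover`
// (whatever was on the stack before), or 0 when `clean` is true.
Frame make_frame(std::size_t size, unsigned char leftover, bool clean, int value) {
    assert(size >= sizeof value);
    const unsigned char fill = clean ? 0 : leftover;
    Frame frame(size, fill);
    std::memcpy(frame.data(), &value, sizeof value);  // store the live variable
    return frame;
}

// Bytewise comparison of two frames.
bool same_state(const Frame& a, const Frame& b) {
    return a.size() == b.size() &&
           std::memcmp(a.data(), b.data(), a.size()) == 0;
}
```

With different leftover bytes, `same_state(make_frame(16, 0xAA, false, 42), make_frame(16, 0x55, false, 42))` is false even though the live variable is identical. Once both frames are cleaned first (`clean` set to true), the two states compare equal.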
LLVM overview
A good high level introduction to the LLVM architecture (LLVM IR and passes) can be found in The Architecture of Open Source Applications.
IR generation
LLVM uses an intermediate language, LLVM IR, to optimise and generate native code.
For example, a simple hello world like this,
#include <stdio.h>

int main(int argc, char** argv) {
  puts("Hello world!");
  return 0;
}
is turned into this LLVM IR:
; ModuleID = 'helloworld.c'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"
@.str = private unnamed_addr constant [13 x i8] c"Hello world!\00", align 1
; Function Attrs: nounwind uwtable
define i32 @main(i32 %argc, i8** %argv) #0 {
%1 = alloca i32, align 4
%2 = alloca i32, align 4
%3 = alloca i8**, align 8
store i32 0, i32* %1
store i32 %argc, i32* %2, align 4
store i8** %argv, i8*** %3, align 8
%4 = call i32 @puts(i8* getelementptr inbounds ([13 x i8]* @.str, i32 0, i32 0))
ret i32 0
}
declare i32 @puts(i8*) #1
attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.ident = !{!0}
!0 = metadata !{metadata !"Debian clang version 3.6.0-svn215195-1 (trunk) (based on LLVM 3.6.0)"}
by
clang -S -emit-llvm helloworld.c -o helloworld.ll
The generated LLVM IR can be target-dependent as the type of the variables may depend on the architecture/OS:
- a C int is mapped to an LLVM i32 on 32-bit, LLP64 and LP64 systems but to an i64 on ILP64;
- a C long is mapped to an i32 on 32-bit and LLP64 systems but to an i64 on LP64 and ILP64.
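The mapping above can be checked on a given machine with a small sketch like the following. The helpers `llvm_int_type`, `data_model` and `check_int_mapping` are hypothetical names introduced here for illustration. They only look at the widths reported by the C++ compiler and do not query LLVM itself.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

// Name of the LLVM integer type with the same width as the C type T,
// e.g. "i32" for a 32-bit type, on the platform compiling this code.
template <typename T>
std::string llvm_int_type() {
    return "i" + std::to_string(sizeof(T) * CHAR_BIT);
}

// Guesses the data model from the widths of int, long and pointers.
std::string data_model() {
    const std::size_t int_bits = sizeof(int) * CHAR_BIT;
    const std::size_t long_bits = sizeof(long) * CHAR_BIT;
    const std::size_t ptr_bits = sizeof(void*) * CHAR_BIT;
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 32) return "ILP32";
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 64) return "LLP64";
    if (int_bits == 32 && long_bits == 64 && ptr_bits == 64) return "LP64";
    if (int_bits == 64 && long_bits == 64 && ptr_bits == 64) return "ILP64";
    return "other";
}

// Usage check: a C int is 32 bits wide on every recognised model except ILP64.
void check_int_mapping() {
    const std::string model = data_model();
    if (model != "ILP64" && model != "other")
        assert(llvm_int_type<int>() == "i32");
}
```

On an LP64 system such as x86_64 Linux, `llvm_int_type<long>()` gives "i64", whereas on an LLP64 system it gives "i32".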
The initial generation of LLVM IR is not done in LLVM but by the frontend (clang, dragonegg, etc.).
LLVM IR passes
Many LLVM optimisations are implemented in an architecture-independent way by IR passes which transform/optimise IR:
opt -std-compile-opts -S helloworld.ll -o helloworld.opt.ll --time-passes 2> opt.log
Generated IR:
; ModuleID = 'helloworld.ll'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"
@.str = private unnamed_addr constant [13 x i8] c"Hello world!\00", align 1
; Function Attrs: nounwind uwtable
define i32 @main(i32 %argc, i8** nocapture readnone %argv) #0 {
%1 = tail call i32 @puts(i8* getelementptr inbounds ([13 x i8]* @.str, i64 0, i64 0)) #2
ret i32 0
}
; Function Attrs: nounwind
declare i32 @puts(i8* nocapture readonly) #1
attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #2 = { nounwind }
!llvm.ident = !{!0}
!0 = metadata !{metadata !"Debian clang version 3.6.0-svn215195-1 (trunk) (based on LLVM 3.6.0)"}
CodeGen passes
This optimized LLVM IR is then used to generate assembly/binary code for the target architecture:
llc helloworld.opt.ll -o helloworld.s --time-passes 2> llc.log
Generated assembly:
.text
.file "/home/foo/temp/helloworld.opt.ll"
.globl main
.align 16, 0x90
.type main,@function
main: # @main
.cfi_startproc
# BB#0:
pushq %rbp
.Ltmp0:
.cfi_def_cfa_offset 16
.Ltmp1:
.cfi_offset %rbp, -16
movq %rsp, %rbp
.Ltmp2:
.cfi_def_cfa_register %rbp
movl $.L.str, %edi
callq puts
xorl %eax, %eax
popq %rbp
retq
.Ltmp3:
.size main, .Ltmp3-main
.cfi_endproc
.type .L.str,@object # @.str
.section .rodata.str1.1,"aMS",@progbits,1
.L.str:
.asciz "Hello world!"
.size .L.str, 13
.ident "Debian clang version 3.6.0-svn215195-1 (trunk) (based on LLVM 3.6.0)"
.section ".note.GNU-stack","",@progbits
Summary
An LLVM-based compiler uses the following phases:
1. code analysis (preprocessing, lexing, parsing, semantic analysis, etc.);
2. LLVM IR generation (by the compiler);
3. LLVM IR transformation/optimisation (by applying IR passes);
4. native code generation from IR (by applying CodeGen passes).
Steps 1 and 2 are part of the compiler's own code (the frontend). Steps 3 and 4 are handled by the LLVM framework (configurable/pluggable by the compiler).
As we want to touch the content of the stack, we want to add a CodeGen pass.
Adding a CodeGen pass
Let's first try to add a pass to insert a NOP into every function.
Header
Let's create a new NoopInserter pass (NoopInserter.h). There are many kinds of passes. This pass is a MachineFunction pass: it is called (runOnMachineFunction) on each generated native function and can modify it before it is passed to the next pass.
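#include <llvm/PassRegistry.h>
#include <llvm/CodeGen/MachineFunctionPass.h>

namespace llvm {

// Machine-level pass which inserts a NOOP at the start of each function.
class NoopInserter : public llvm::MachineFunctionPass {
public:
  static char ID;  // Pass identifier: only its address matters.
  NoopInserter();
  // Called once per generated native function; returns true if it was modified.
  virtual bool runOnMachineFunction(llvm::MachineFunction &Fn);
};

}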
The ID is used as a reference to the pass in LLVM: the value of this variable is not important, only its address is used.
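The idea of identifying a pass by the address of its ID can be illustrated with a toy sketch. This is not LLVM code: `ToyPassA`, `ToyPassB` and `ToyRegistry` are hypothetical names, and the real LLVM registry is more involved. The sketch only shows that two IDs with the same value are still distinct keys because they live at different addresses.

```cpp
#include <cassert>
#include <map>
#include <string>

// Two toy passes. Both IDs hold the same value (0) but are distinct objects.
struct ToyPassA { static char ID; };
struct ToyPassB { static char ID; };
char ToyPassA::ID = 0;
char ToyPassB::ID = 0;

// Toy registry keyed by the address of the ID, never by its value.
class ToyRegistry {
public:
    void add(const void* id, const std::string& name) {
        assert(id != nullptr);
        names_[id] = name;
    }
    // Returns the registered name, or "<unknown>" for an unregistered ID.
    std::string lookup(const void* id) const {
        auto it = names_.find(id);
        return it == names_.end() ? "<unknown>" : it->second;
    }
private:
    std::map<const void*, std::string> names_;
};
```

After registering `&ToyPassA::ID` as "toy-a" and `&ToyPassB::ID` as "toy-b", each lookup returns the right name even though both variables contain 0.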
Implementation
#include "NoopInserter.h"
#include <llvm/CodeGen/MachineInstrBuilder.h>
#include <llvm/Target/TargetMachine.h>
#include <llvm/Target/TargetInstrInfo.h>
#include <llvm/PassManager.h>
#include <llvm/Transforms/IPO/PassManagerBuilder.h>
#include <llvm/CodeGen/Passes.h>
#include <llvm/Target/TargetSubtargetInfo.h>
#include "llvm/Pass.h"
#define GET_INSTRINFO_ENUM
#include "../Target/X86/X86GenInstrInfo.inc"
#define GET_REGINFO_ENUM
#include "../Target/X86/X86GenRegisterInfo.inc.tmp"
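namespace llvm {

char NoopInserter::ID = 0;

NoopInserter::NoopInserter() : llvm::MachineFunctionPass(ID) {
}

bool NoopInserter::runOnMachineFunction(llvm::MachineFunction &fn) {
  // Instruction descriptions for the current target.
  const llvm::TargetInstrInfo &TII = *fn.getSubtarget().getInstrInfo();
  // First basic block of the function (its entry point).
  MachineBasicBlock& bb = *fn.begin();
  // Insert an X86 NOOP before the first instruction of this block.
  llvm::BuildMI(bb, bb.begin(), llvm::DebugLoc(), TII.get(llvm::X86::NOOP));
  // Tell LLVM that the function has been modified.
  return true;
}

// Reference which other passes can use to refer to this pass.
char& NoopInserterID = NoopInserter::ID;

}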
using namespace llvm;
INITIALIZE_PASS_BEGIN(NoopInserter, "noop-inserter",
"Insert a NOOP", false, false)
INITIALIZE_PASS_DEPENDENCY(PEI)
INITIALIZE_PASS_END(NoopInserter, "noop-inserter",
"Insert a NOOP", false, false)
The runOnMachineFunction method finds the beginning of the function and inserts an X86 NOOP instruction. The method returns true in order to tell the LLVM framework that this function has been modified by this pass. This implementation will only work on X86/AMD64 targets. A real pass should be target independent or at least check the target.
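The contract described above (modify the function, report whether it changed, and stay away from unknown targets) can be sketched with a toy model. This is not the LLVM API: `ToyFunction`, `is_x86` and `insert_noop` are hypothetical names, and the target check is a simplified string comparison on the triple.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a native function: a target triple and its instructions.
struct ToyFunction {
    std::string triple;
    std::vector<std::string> instructions;
};

// True when the architecture part of the triple names a 32-bit or 64-bit x86
// target (simplified check on the text before the first '-').
bool is_x86(const std::string& triple) {
    const std::string arch = triple.substr(0, triple.find('-'));
    return arch == "x86_64" || arch == "i386" || arch == "i686";
}

// Toy version of the pass: returns true only if the function was modified.
bool insert_noop(ToyFunction& fn) {
    if (!is_x86(fn.triple))
        return false;  // unknown target: leave the code untouched
    fn.instructions.insert(fn.instructions.begin(), "nop");
    assert(fn.instructions.front() == "nop");
    return true;
}
```

For a function with the triple "x86_64-pc-linux-gnu", `insert_noop` prepends a "nop" and returns true. For "aarch64-unknown-linux-gnu", it returns false and leaves the instruction list unchanged.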
The INITIALIZE_PASS macros declare the pass and its dependencies. Here, we are declaring a dependency on PEI, a.k.a. PrologEpilogInserter, which adds the prolog and epilog to the code of each native function. Those macros define a function:
void initializeNoopInserterPass(PassRegistry &Registry);
The NoopInserterID may be used by other passes to refer to this pass.
Declarations
We have to add a few declarations of this pass.
In include/llvm/CodeGen/Passes.h:
// NoopInserter - This pass inserts a NOOP instruction
extern char &NoopInserterID;
In include/llvm/InitializePasses.h:
void initializeNoopInserterPass(PassRegistry &Registry);
Registration
The pass must be added in llvm::initializeCodeGen() (lib/CodeGen/CodeGen.cpp):
initializeNoopInserterPass(Registry);
Result
clang -O3 helloworld.c -S -o-
We have a nice NOOP:
.text
.file "/home/foo/temp/helloworld.c"
.globl main
.align 16, 0x90
.type main,@function
main: # @main
.cfi_startproc
# BB#0: # %entry
nop
pushq %rax
.Ltmp0:
.cfi_def_cfa_offset 16
movl $.L.str, %edi
callq puts
xorl %eax, %eax
popq %rdx
retq
.Ltmp1:
.size main, .Ltmp1-main
.cfi_endproc
.type .L.str,@object # @.str
.section .rodata.str1.1,"aMS",@progbits,1
.L.str:
.asciz "Hello world!"
.size .L.str, 13
.ident "clang version 3.6.0 "
.section ".note.GNU-stack","",@progbits
The program still works:
$ clang -O3 helloworld.c
$ ./a.out
Hello world!
Conclusion
I successfully managed to add a pass in order to (actively) do nothing in each generated native function. In the next episode, I will try to do something useful instead.