LLVM Series, Chapter 23: Writing a Simple Run-Time Function Call Counter (Pass)
2022-08-02 14:07:00 【飞翼剑仆】
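Series Table of Contents
LLVM Series, Chapter 1: Building LLVM from Source
LLVM Series, Chapter 2: Module
LLVM Series, Chapter 3: Function
LLVM Series, Chapter 4: Code Block (Block)
LLVM Series, Chapter 5: Global Variable
LLVM Series, Chapter 6: Function Return Value (Return)
LLVM Series, Chapter 7: Function Arguments
LLVM Series, Chapter 8: Arithmetic Statement
LLVM Series, Chapter 9: Control Flow Statement if-else
LLVM Series, Chapter 10: Control Flow Statement if-else-phi
LLVM Series, Chapter 11: Writing a Hello World
LLVM Series, Chapter 12: Writing a Simple Lexer
LLVM Series, Chapter 13: Writing a Simple Parser
LLVM Series, Chapter 14: Writing a Simple Semantic Analyzer
LLVM Series, Chapter 15: Writing a Simple IR Generator
LLVM Series, Chapter 16: Writing a Simple Compiler
LLVM Series, Chapter 17: The for Loop
LLVM Series, Chapter 18: Writing a Simple IR Processing Flow (Pass)
LLVM Series, Chapter 19: Writing a Simple Module Pass
LLVM Series, Chapter 20: Writing a Simple Function Pass
LLVM Series, Chapter 21: Writing a Simple Loop Pass
LLVM Series, Chapter 22: Writing a Simple Compile-Time Function Call Counter (Pass)
LLVM Series, Chapter 23: Writing a Simple Run-Time Function Call Counter (Pass)
LLVM Series, Chapter 24: Building and Debugging the LLVM Source with Xcode
LLVM Series, Chapter 25: A Simple Line Count of the LLVM Source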
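Preface
This post records the process of writing a simple function call counter (Pass) based on LLVM, for future reference.
For setting up the development environment, see Chapter 1, "LLVM Series, Chapter 1: Building LLVM from Source".
Suppose we want to count how many times each function in a program is called. We can write a simple Pass to do this. In this chapter we look at function calls at run time rather than at compile time. Note the difference: counting at compile time is a "static" problem, while counting at run time is a "dynamic" one.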
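To make the static/dynamic distinction concrete, here is a minimal C++ sketch. It is not part of the Pass, and the names fooCalls, CallFooRepeatedly and PrintFooCalls are made up for illustration. The source contains exactly one call site for Foo, which is what a compile-time count would see, while the number of run-time calls depends on the argument.

```cpp
#include <cstdio>

// Hypothetical counter, incremented by hand for this sketch only.
static int fooCalls = 0;

void Foo()
{
    ++fooCalls; // one more run-time call
}

// There is exactly ONE call site for Foo in this function (the static view),
// but it executes `times` times at run time (the dynamic view).
void CallFooRepeatedly(int times)
{
    for (int i = 0; i < times; i++)
    {
        Foo();
    }
}

// Reports both views side by side.
void PrintFooCalls()
{
    std::printf("Foo: 1 call site, %d run-time calls\n", fooCalls);
}
```

Calling CallFooRepeatedly(5) leaves fooCalls at 5, even though a scan of the source finds a single call to Foo.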
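We can iterate over every function in the code and insert special counting code at the start of each one. This means modifying the original program. For simplicity, we give each function its own global counter (an integer variable) that records how many times the function has executed. Whenever a function starts executing, we increment its counter by 1. When the program finishes, the counter holds the total number of times the function ran.
In this chapter we write a simple Pass that counts how many times each function is called at run time.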
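Before automating this with a Pass, it may help to see what the same idea looks like when written by hand. The following C++ sketch is only an illustration of the approach described above, not output of the Pass. The counter names and PrintCounters are assumptions chosen for this example.

```cpp
#include <cstdio>

// One global counter per function (names chosen for this sketch).
int counter_Foo = 0;
int counter_Bar = 0;
int counter_Fez = 0;

void Foo()
{
    counter_Foo = counter_Foo + 1; // inserted at the very start of the function
}

void Bar()
{
    counter_Bar = counter_Bar + 1;
    Foo();
}

void Fez()
{
    counter_Fez = counter_Fez + 1;
    Bar();
}

// Prints the totals. In a real program this would run once, when the
// program ends (for example by registering it with std::atexit).
void PrintCounters()
{
    std::printf("Function: %s, called %d times\n", "Foo", counter_Foo);
    std::printf("Function: %s, called %d times\n", "Bar", counter_Bar);
    std::printf("Function: %s, called %d times\n", "Fez", counter_Fez);
}
```

Calling Fez() once and then Foo() once leaves counter_Foo at 2 and the other two counters at 1, because Fez calls Bar, which in turn calls Foo. The Pass in this chapter injects the equivalent increments into the IR so that the source code does not have to be edited by hand.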
I. Project Structure
We name this simple project RunTimeFunctionCallCounter. We can organize our code following the layout of the other Passes in the LLVM source tree (example):
llvm-project/llvm
├── ...
├── lib
│   └── Transforms
│       ├── CMakeLists.txt
│       └── RunTimeFunctionCallCounter
│           ├── CMakeLists.txt
│           └── RunTimeFunctionCallCounter.cpp
└── ...
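II. Project Details
1. Program Modules
This simple project contains only one module:
- RunTimeFunctionCallCounter, a simple Pass module that counts how many times each function is called at run time.
As described above, RunTimeFunctionCallCounter iterates over every function, inserts special counting code at the start of each one, and prints how many times each function executed.
Note that we need to add the RunTimeFunctionCallCounter project to the parent LLVM Transforms project, i.e. tell CMake to build RunTimeFunctionCallCounter as well whenever it builds the LLVM source.
Below are the parts of the CMake scripts that relate to the project layout.
(1) The lib/Transforms/RunTimeFunctionCallCounter/CMakeLists.txt file (example):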
# CMakeLists.txt
add_llvm_library(RunTimeFunctionCallCounter MODULE BUILDTREE_ONLY
    RunTimeFunctionCallCounter.cpp
    PLUGIN_TOOL
    opt
)
(2) The lib/Transforms/CMakeLists.txt file (example):
...
add_subdirectory(RunTimeFunctionCallCounter)
...
2. Run-Time Function Call Counter
RunTimeFunctionCallCounter is implemented in the file lib/Transforms/RunTimeFunctionCallCounter/RunTimeFunctionCallCounter.cpp (example):
// RunTimeFunctionCallCounter.cpp
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/PassManager.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Passes/PassPlugin.h"
#include "llvm/Transforms/Utils/ModuleUtils.h"
#include "llvm/ADT/StringMap.h"
#include "llvm/Support/Debug.h"
#include <string>
using namespace llvm;
#define DEBUG_TYPE "runtime-function-call-counter"
namespace helper
{
Constant* CreateGlobalVariable(Module& module, StringRef globalVariableName);
bool CountFunctionCallsInModule(Module& module);
} // namespace helper
// A LLVM Pass to count the function calls at runtime
class RunTimeFunctionCallCounter : public PassInfoMixin<RunTimeFunctionCallCounter>
{
public:
    PreservedAnalyses run(Module& module, ModuleAnalysisManager&)
    {
        bool changed = helper::CountFunctionCallsInModule(module);
        return changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
    }
};
// Pass registration
extern "C" LLVM_ATTRIBUTE_WEAK ::PassPluginLibraryInfo llvmGetPassPluginInfo()
{
    return {LLVM_PLUGIN_API_VERSION, "Run-Time Function Call Counter", LLVM_VERSION_STRING,
            [](PassBuilder& passBuilder) {
                passBuilder.registerPipelineParsingCallback(
                    [](StringRef name, ModulePassManager& passManager, ArrayRef<PassBuilder::PipelineElement>) {
                        if (name == "runtime-function-call-counter")
                        {
                            passManager.addPass(RunTimeFunctionCallCounter());
                            return true;
                        }
                        return false;
                    });
            }};
}
Constant* helper::CreateGlobalVariable(Module& module, StringRef globalVariableName)
{
    auto& context = module.getContext();
    // This will insert a declaration into module
    Constant* newGlobalVariable = module.getOrInsertGlobal(globalVariableName, IntegerType::getInt32Ty(context));
    // This will change the declaration into definition (and initialize to 0)
    GlobalVariable* initializedGlobalVariable = module.getNamedGlobal(globalVariableName);
    initializedGlobalVariable->setLinkage(GlobalValue::CommonLinkage);
    initializedGlobalVariable->setAlignment(MaybeAlign(4));
    initializedGlobalVariable->setInitializer(ConstantInt::get(context, APInt(32, 0)));
    return newGlobalVariable;
}
bool helper::CountFunctionCallsInModule(Module& module)
{
    auto& context = module.getContext();
    // Function name to IR variable map that holds the call counters
    StringMap<Constant*> callCounterMap;
    // Function name to IR variable map that holds the function names
    StringMap<Constant*> functionNameMap;
    // Step 1. For each function in the module, inject the code for call-counting
    for (Function& function : module)
    {
        if (function.isDeclaration())
        {
            continue;
        }
        // Get an IR builder and set the insertion point to the top of the function
        IRBuilder<> counterBuilder(&*function.getEntryBlock().getFirstInsertionPt());
        // Create a global variable to count the calls to this function
        std::string counterName = "counter_" + function.getName().str();
        Constant* counterVariable = helper::CreateGlobalVariable(module, counterName);
        callCounterMap[function.getName()] = counterVariable;
        // Create a global variable to hold the name of this function
        Constant* functionName = counterBuilder.CreateGlobalStringPtr(function.getName(), "name_" + function.getName());
        functionNameMap[function.getName()] = functionName;
        // Inject instruction to increment the call count each time this function executes
        LoadInst* counterCurrentValue = counterBuilder.CreateLoad(counterVariable);
        Value* counterNextValue = counterBuilder.CreateAdd(counterBuilder.getInt32(1), counterCurrentValue);
        counterBuilder.CreateStore(counterNextValue, counterVariable);
        // Let the opt tool print out some debug information
        // (Visible only if we pass "-debug" to the command and have an assert build.)
        LLVM_DEBUG(dbgs() << "Instrumented: " << function.getName() << "\n");
    }
    // Stop here if there is no function definition in this module
    if (callCounterMap.size() == 0)
    {
        return false;
    }
    // Step 2. Inject the declaration of "printf()"
    //
    // Create (or get) the following declaration in the IR module:
    //    declare i32 @printf(i8*, ...)
    //
    // It corresponds to the following C declaration:
    //    int printf(char*, ...)
    PointerType* printfArgType = PointerType::getUnqual(Type::getInt8Ty(context));
    FunctionType* printfFunctionType = FunctionType::get(IntegerType::getInt32Ty(context),
                                                         printfArgType,
                                                         /*IsVarArgs=*/true);
    FunctionCallee printfCallee = module.getOrInsertFunction("printf", printfFunctionType);
    // Step 3. Inject a global variable that will hold the printf format string
    Constant* formatString = ConstantDataArray::getString(context, "Function: %s, called %d times\n");
    Constant* formatStringVariable = module.getOrInsertGlobal("", formatString->getType());
    // cast<> asserts on a type mismatch instead of silently yielding a null pointer
    cast<GlobalVariable>(formatStringVariable)->setInitializer(formatString);
    // Step 4. Define a printf wrapper that will print the results
    //
    // Define `PrintfWrapper` that will print the results stored in functionNameMap
    // and callCounterMap. It is equivalent to the following C++ function:
    // ```
    //    void PrintfWrapper()
    //    {
    //        for (auto &item : functions)
    //        {
    //            printf("Function: %s, called %d times\n", item.name, item.count);
    //        }
    //    }
    // ```
    // ("item.name" comes from functionNameMap, "item.count" comes from callCounterMap.)
    FunctionType* printfWrapperType = FunctionType::get(Type::getVoidTy(context), {}, /*IsVarArgs=*/false);
    // cast<> asserts on a type mismatch instead of silently yielding a null pointer
    Function* printfWrapperFunction =
        cast<Function>(module.getOrInsertFunction("PrintfWrapper", printfWrapperType).getCallee());
    BasicBlock* enterBlock = BasicBlock::Create(context, "enter", printfWrapperFunction);
    IRBuilder<> printfWrapperBuilder(enterBlock);
    Value* formatStringPtr = printfWrapperBuilder.CreatePointerCast(formatStringVariable, printfArgType);
    for (auto& item : callCounterMap)
    {
        Constant* functionName = functionNameMap[item.first()];
        LoadInst* counterValue = printfWrapperBuilder.CreateLoad(item.second);
        printfWrapperBuilder.CreateCall(printfCallee, {formatStringPtr, functionName, counterValue});
    }
    printfWrapperBuilder.CreateRetVoid();
    // Step 5. Call `PrintfWrapper` at the very end of this module
    appendToGlobalDtors(module, printfWrapperFunction, /*Priority=*/0);
    return true;
}
III. Build
1. Generating the Project Files
Generate the project files with CMake (example):
cd /path/to/llvm-project
mkdir build
cd build
cmake -G Ninja -DLLVM_ENABLE_PROJECTS=clang ../llvm
The output log looks like this (example):
-- clang project is enabled
-- clang-tools-extra project is disabled
-- ...
-- Ninja version: 1.10.2
-- Found ld64 - /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld
-- ...
-- LLVM host triple: x86_64-apple-darwin20.6.0
-- LLVM default target triple: x86_64-apple-darwin20.6.0
-- ...
-- Configuring done
-- Generating done
-- Build files have been written to: .../llvm-project/build
2. Building
Build with ninja (example):
ninja
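If we have already built the LLVM source by following the steps in Chapter 1 and then build this project, only the RunTimeFunctionCallCounter project needs to be compiled. ninja works this out automatically; this is what is known as an incremental build. The output log looks like this (example):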
[4/4] Linking CXX shared module lib/RunTimeFunctionCallCounter.dylib
3. Running
For simplicity, we test with the C code in the following Test.c file (example):
// Test.c
void Foo()
{
}
void Bar()
{
    Foo();
}
void Fez()
{
    Bar();
}
int main()
{
    Foo();
    Bar();
    Fez();
    for (int i = 0; i < 5; i++)
    {
        Foo();
    }
    return 0;
}
We can generate the IR code with clang, using the following commands (example):
mv ../llvm/lib/Transforms/RunTimeFunctionCallCounter/Tests/Test.c.txt ../llvm/lib/Transforms/RunTimeFunctionCallCounter/Tests/Test.c
clang -S -emit-llvm ../llvm/lib/Transforms/RunTimeFunctionCallCounter/Tests/Test.c -o ../llvm/lib/Transforms/RunTimeFunctionCallCounter/Tests/Test.ll
The generated IR code is as follows (example):
; ModuleID = '../llvm/lib/Transforms/CompileTimeFunctionCallCounter/Tests/Test.c'
source_filename = "../llvm/lib/Transforms/CompileTimeFunctionCallCounter/Tests/Test.c"
target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx11.0.0"
; Function Attrs: noinline nounwind optnone ssp uwtable
define dso_local void @Foo() #0 {
entry:
ret void
}
; Function Attrs: noinline nounwind optnone ssp uwtable
define dso_local void @Bar() #0 {
entry:
call void @Foo()
ret void
}
; Function Attrs: noinline nounwind optnone ssp uwtable
define dso_local void @Fez() #0 {
entry:
call void @Bar()
ret void
}
; Function Attrs: noinline nounwind optnone ssp uwtable
define dso_local i32 @main() #0 {
entry:
%retval = alloca i32, align 4
%i = alloca i32, align 4
store i32 0, i32* %retval, align 4
call void @Foo()
call void @Bar()
call void @Fez()
store i32 0, i32* %i, align 4
br label %for.cond
for.cond: ; preds = %for.inc, %entry
%0 = load i32, i32* %i, align 4
%cmp = icmp slt i32 %0, 5
br i1 %cmp, label %for.body, label %for.end
for.body: ; preds = %for.cond
call void @Foo()
br label %for.inc
for.inc: ; preds = %for.body
%1 = load i32, i32* %i, align 4
%inc = add nsw i32 %1, 1
store i32 %inc, i32* %i, align 4
br label %for.cond, !llvm.loop !3
for.end: ; preds = %for.cond
ret i32 0
}
attributes #0 = { noinline nounwind optnone ssp uwtable "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+cx8,+fxsr,+mmx,+sahf,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}
!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"PIC Level", i32 2}
!2 = !{!"clang version 12.0.1 (https://github.com/llvm/llvm-project fed41342a82f5a3a9201819a82bf7a48313e296b)"}
!3 = distinct !{!3, !4}
!4 = !{!"llvm.loop.mustprogress"}
Run RunTimeFunctionCallCounter (example):
./bin/opt -debug -load-pass-plugin=lib/RunTimeFunctionCallCounter.dylib -passes="runtime-function-call-counter" ../llvm/lib/Transforms/RunTimeFunctionCallCounter/Tests/Test.ll -o Test.bin
The output log looks like this (example):
Args: ./bin/opt -debug -load-pass-plugin=lib/RunTimeFunctionCallCounter.dylib -passes=runtime-function-call-counter ../llvm/lib/Transforms/RunTimeFunctionCallCounter/Tests/Test.ll -o Test.bin
Instrumented: Foo
Instrumented: Bar
Instrumented: Fez
Instrumented: main
Note that this step produces a test program, Test.bin. We need to run it to perform the "run-time" counting (example):
./bin/lli ./Test.bin
The output is as follows (example):
Function: Foo, called 8 times
Function: Bar, called 2 times
Function: Fez, called 1 times
Function: main, called 1 times
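4. Instrument (Intrusive Analysis)
We also call this chapter's Pass an Instrument because it has an "intrusive" effect on the original program: it changes the program in the course of analyzing it.
We can use the llvm-dis tool to disassemble our test program back into its IR code (example):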
./bin/llvm-dis Test.bin -o ../llvm/lib/Transforms/RunTimeFunctionCallCounter/Tests/TestWithCounter.ll
The generated IR code is as follows (example):
; ModuleID = 'Test.bin'
source_filename = "../llvm/lib/Transforms/CompileTimeFunctionCallCounter/Tests/Test.c"
target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx11.0.0"
@counter_Foo = common global i32 0, align 4
@name_Foo = private unnamed_addr constant [4 x i8] c"Foo\00", align 1
@counter_Bar = common global i32 0, align 4
@name_Bar = private unnamed_addr constant [4 x i8] c"Bar\00", align 1
@counter_Fez = common global i32 0, align 4
@name_Fez = private unnamed_addr constant [4 x i8] c"Fez\00", align 1
@counter_main = common global i32 0, align 4
@name_main = private unnamed_addr constant [5 x i8] c"main\00", align 1
@0 = global [31 x i8] c"Function: %s, called %d times\0A\00"
@llvm.global_dtors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 0, void ()* @PrintfWrapper, i8* null }]
; Function Attrs: noinline nounwind optnone ssp uwtable
define dso_local void @Foo() #0 {
entry:
%0 = load i32, i32* @counter_Foo, align 4
%1 = add i32 1, %0
store i32 %1, i32* @counter_Foo, align 4
ret void
}
; Function Attrs: noinline nounwind optnone ssp uwtable
define dso_local void @Bar() #0 {
entry:
%0 = load i32, i32* @counter_Bar, align 4
%1 = add i32 1, %0
store i32 %1, i32* @counter_Bar, align 4
call void @Foo()
ret void
}
; Function Attrs: noinline nounwind optnone ssp uwtable
define dso_local void @Fez() #0 {
entry:
%0 = load i32, i32* @counter_Fez, align 4
%1 = add i32 1, %0
store i32 %1, i32* @counter_Fez, align 4
call void @Bar()
ret void
}
; Function Attrs: noinline nounwind optnone ssp uwtable
define dso_local i32 @main() #0 {
entry:
%0 = load i32, i32* @counter_main, align 4
%1 = add i32 1, %0
store i32 %1, i32* @counter_main, align 4
%retval = alloca i32, align 4
%i = alloca i32, align 4
store i32 0, i32* %retval, align 4
call void @Foo()
call void @Bar()
call void @Fez()
store i32 0, i32* %i, align 4
br label %for.cond
for.cond: ; preds = %for.inc, %entry
%2 = load i32, i32* %i, align 4
%cmp = icmp slt i32 %2, 5
br i1 %cmp, label %for.body, label %for.end
for.body: ; preds = %for.cond
call void @Foo()
br label %for.inc
for.inc: ; preds = %for.body
%3 = load i32, i32* %i, align 4
%inc = add nsw i32 %3, 1
store i32 %inc, i32* %i, align 4
br label %for.cond, !llvm.loop !3
for.end: ; preds = %for.cond
ret i32 0
}
declare i32 @printf(i8*, ...)
define void @PrintfWrapper() {
enter:
%0 = load i32, i32* @counter_Foo, align 4
%1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([31 x i8], [31 x i8]* @0, i32 0, i32 0), i8* getelementptr inbounds ([4 x i8], [4 x i8]* @name_Foo, i32 0, i32 0), i32 %0)
%2 = load i32, i32* @counter_Bar, align 4
%3 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([31 x i8], [31 x i8]* @0, i32 0, i32 0), i8* getelementptr inbounds ([4 x i8], [4 x i8]* @name_Bar, i32 0, i32 0), i32 %2)
%4 = load i32, i32* @counter_Fez, align 4
%5 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([31 x i8], [31 x i8]* @0, i32 0, i32 0), i8* getelementptr inbounds ([4 x i8], [4 x i8]* @name_Fez, i32 0, i32 0), i32 %4)
%6 = load i32, i32* @counter_main, align 4
%7 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([31 x i8], [31 x i8]* @0, i32 0, i32 0), i8* getelementptr inbounds ([5 x i8], [5 x i8]* @name_main, i32 0, i32 0), i32 %6)
ret void
}
attributes #0 = { noinline nounwind optnone ssp uwtable "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+cx8,+fxsr,+mmx,+sahf,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}
!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"PIC Level", i32 2}
!2 = !{!"clang version 12.0.1 (https://github.com/llvm/llvm-project fed41342a82f5a3a9201819a82bf7a48313e296b)"}
!3 = distinct !{!3, !4}
!4 = !{!"llvm.loop.mustprogress"}
Notice that, compared with the original IR code, every function now contains the following extra code:
%0 = load i32, i32* @counter_xxxx, align 4
%1 = add i32 1, %0
store i32 %1, i32* @counter_xxxx, align 4
This IR code increments the counter (an integer variable) by 1. In our Pass, it is the following C++ code that generates this IR:
// Inject instruction to increment the call count each time this function executes
LoadInst* counterCurrentValue = counterBuilder.CreateLoad(counterVariable);
Value* counterNextValue = counterBuilder.CreateAdd(counterBuilder.getInt32(1), counterCurrentValue);
counterBuilder.CreateStore(counterNextValue, counterVariable);
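IV. Summary
Using the C++ API provided by LLVM, we wrote a simple Pass that counts how many times each function is called at run time, and tested it. For the complete source code, see: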
https://github.com/wuzhanglin/llvm-pass-examples