当前位置:网站首页>Go compilation principle series 9 (function inlining)
Go compilation principle series 9 (function inlining)
2022-08-05 11:46:00 【book travel】
前言
The variable capture part of compiler optimization was shared in the previous article,This article shares another aspect of compiler optimization—函数内联.Function inlining means that the contents of a smaller function will be inlined,directly into the caller function,Thereby reducing the overhead of function calls
Overview of function inlining
We know the function calls of every high-level programming language,The cost is related to the need to allocate stack memory for it to store parameters、返回值、局部变量等等,GoThe cost of the function call lies in Parameter and return value stack copy、Small stack register overhead以及Check stack expansion in function prologue(Go语言中的栈是可以动态扩容的,因为GoThe memory on the allocation stack is not incrementally increased,Rather, it is a one-time allocation,This is to avoid accessing out of bounds,It will be allocated all at once,When it is checked that the allocated stack memory is not enough,It will expand a sufficiently large stack space,and copy the contents of the original stack)
Write a piece of code below,通过GoThe benchmark test to measure the efficiency improvement brought by function inlining
import "testing"
//go:noinline //禁用内联.If you want to enable inline,Just uncomment this line
func max(a, b int) int {
if a > b {
return a
}
return b
}
var Result int
func BenchmarkMax(b *testing.B) {
var r int
for i:=0; i< b.N; i++ {
r = max(-1, i)
}
Result = r
}
在编译的过程中,GoThe compiler actually calculates the cost of function inlining,So there are only simple functions,function inlining will be triggered.In the source code implementation of the following function inlining,We can see these situations below不会被内联:
递归函数 The function is preceded by the following comment: go:noinline
、go:norace
、go:nocheckptr
、go:uintptrescapes
等没有函数体 The number of nodes in the abstract syntax tree of the function declaration is greater than5000(我的Go版本是1.16.6)(That is, there are too many statements inside the function,Nor will it be inlined) Functions contain closures(* OCLOSURE
)、range(ORANGE
)、select(OSELECT
)、go(OGO
)、defer(ODEFER
)、type(ODCLTYPE
)、返回值是函数(ORETJMP
*)的,will not inline
We can also build or compile time,Controls whether it can be inlined through parameters.If you don't want all the functions in the program to be inlined
go build -gcflags="-l" xxx.go
go tool compile -l xxx.go
Also we are at compile time,You can also see which functions are inlined,Which functions are not inlined,以及原因是什么
go tool compile -m=2 xxx.go
看一个例子
package main
func test1(a, b int) int {
return a+b
}
func step(n int) int {
if n < 2 {
return n
}
return step(n-1) + step(n-2)
}
func main() {
test1(1, 2)
step(5)
}
可以看到test1This function can be inlined,Because its function body is very simple.stepThis function is a recursive function,So it doesn't do inlining
The underlying implementation of function inlining
In fact, each function call chain is very deep here,I will not explain the meaning of the code line by line here,Only some core methods will be introduced,Interested friends can debug it themselves(There are related articles posted earlier)(Go源码调试方法)
It was mentioned many times beforeGo编译入口文件,You can find this code in the entry file
Go编译入口文件:src/cmd/compile/main.go -> gc.Main(archInit)
// Phase 5: Inlining
if Debug.l != 0 {
// Find functions that can be inlined
visitBottomUp(xtop, func(list []*Node, recursive bool) {
numfns := numNonClosures(list)
for _, n := range list {
if !recursive || numfns > 1 {
caninl(n)
} else {
......
}
inlcalls(n)
}
})
}
for _, n := range xtop {
if n.Op == ODCLFUNC {
devirtualize(n)
}
}
Let's take a look at what each method does
visitBottomUp
该方法有两个参数:
xtop
:I've seen it before,It stores the root node array of the abstract syntax tree of each declaration statement第二个参数是一个函数(The function also has two parameters,One is an array of abstract syntax tree root nodes that satisfy the declaration of a function type,一个是bool值,trueRepresents a recursive function,falseIndicates that it is not a recursive function)
进入到visitBottomUp方法中,You'll find it's mostly traversalxtop,And called on the root node of each abstract syntax treevisit
这个方法(Only for abstract syntax trees that are function type declarations)
func visitBottomUp(list []*Node, analyze func(list []*Node, recursive bool)) {
var v bottomUpVisitor
v.analyze = analyze
v.nodeID = make(map[*Node]uint32)
for _, n := range list {
if n.Op == ODCLFUNC && !n.Func.IsHiddenClosure() { //是函数,and is not a closure function
v.visit(n)
}
}
}
而visit
方法的核心是调用了inspectList
方法,通过inspectList
The abstract syntax tree is traversed in a depth-first search,and take each node as inspectList
方法的第二个参数(是一个函数)的参数,For example, to verify whether there are recursive calls in this function, etc(Specifically belowswitch case)
func (v *bottomUpVisitor) visit(n *Node) uint32 {
if id := v.nodeID[n]; id > 0 {
// already visited
return id
}
......
v.stack = append(v.stack, n)
inspectList(n.Nbody, func(n *Node) bool {
switch n.Op {
case ONAME:
if n.Class() == PFUNC {
......
}
case ODOTMETH:
fn := asNode(n.Type.Nname())
......
}
case OCALLPART:
fn := asNode(callpartMethod(n).Type.Nname())
......
case OCLOSURE:
if m := v.visit(n.Func.Closure); m < min {
min = m
}
}
return true
})
v.analyze(block, recursive)
}
return min
}
By calling latervisitBottomUp
The method passed as the second parameter,Perform inline judgment and inline operations on abstract syntax trees,具体就是caninl
和inlcalls
这两个方法
caninl
The role of this method is to verify whether the abstract syntax tree declared by the function type can be inlined
这个方法的实现很简单,The first is through a lotifThe statement verifies whether there is an image in front of the functiongo:noinline
wait for this mark
func caninl(fn *Node) {
if fn.Op != ODCLFUNC {
Fatalf("caninl %v", fn)
}
if fn.Func.Nname == nil {
Fatalf("caninl no nname %+v", fn)
}
var reason string // reason, if any, that the function was not inlined
......
// If marked "go:noinline", don't inline
if fn.Func.Pragma&Noinline != 0 {
reason = "marked go:noinline"
return
}
// If marked "go:norace" and -race compilation, don't inline.
if flag_race && fn.Func.Pragma&Norace != 0 {
reason = "marked go:norace with -race compilation"
return
}
......
// If fn has no body (is defined outside of Go), cannot inline it.
if fn.Nbody.Len() == 0 {
reason = "no function body"
return
}
visitor := hairyVisitor{
budget: inlineMaxBudget,
extraCallCost: cc,
usedLocals: make(map[*Node]bool),
}
if visitor.visitList(fn.Nbody) {
reason = visitor.reason
return
}
if visitor.budget < 0 {
reason = fmt.Sprintf("function too complex: cost %d exceeds budget %d", inlineMaxBudget-visitor.budget, inlineMaxBudget)
return
}
n.Func.Inl = &Inline{
Cost: inlineMaxBudget - visitor.budget,
Dcl: inlcopylist(pruneUnusedAutos(n.Name.Defn.Func.Dcl, &visitor)),
Body: inlcopylist(fn.Nbody.Slice()),
}
......
}
There is another main method herevisitList
,It is used to verify whether the function we mentioned above is in itgo、select、rangeWait for these sentences.For inline conditions,It will override the function declaration to free the inline field of the syntax tree(Inl
)
inlcalls
This method is the specific inline operation,For example, converting the parameters and return values of the function into declaration statements in the caller, etc.The calls and implementations inside are more complicated,No more sticking code here,You can see for yourself.The core methods of function inlining are in the following files
src/cmd/compile/internal/gc/inl.go
边栏推荐
- Gray value and thermal imaging understanding
- 普通二本毕业八年,京东就职两年、百度三年,分享大厂心得
- 【着色器实现Flicker“DJ”闪烁效果_Shader效果第十五篇】
- 消息中间件汇总
- 解决 cuDNN launch failure 错误
- 【HMS core】【FAQ】Health Kit、Ads kit、push Kit典型问题合集5
- 课表小程序使用攻略
- 2022 极术通讯-基于安谋科技 “星辰” STAR-MC1的灵动MM32F2570开发板深度评测
- Apache APISIX Ingress v1.5-rc1 released
- Linux: Remember to install MySQL8 on CentOS7 (blog collection)
猜你喜欢
五大理由告诉你为什么开发人员选择代码质量静态分析工具Klocwork来实现软件安全
学习用于视觉跟踪的深度紧凑图像表示
ECCV 2022 | 视听分割:全新任务,助力视听场景像素级精细化理解
623. Add a row to a binary tree: Simple binary tree traversal problems
分布式事务解决方案
The principle and application scenario of mysql master-slave synchronization
Official release 2022 Nanjing Zhibo Expo is scheduled to be held in Xinzhuang National Exhibition in October
Qt::qcustomplot 和 qchart数据填充相关
Visit GOPS Long Zhi booth, Forrester's latest report: "the Forrester Wave: the fourth quarter of 2021 enterprise service management report
对于聚合物聚乙二醇PEG大家了解多少了?以及在生活中的应用
随机推荐
女人是这个世界上最美丽的生命
Go编译原理系列9(函数内联)
623. Add a row to a binary tree: Simple binary tree traversal problems
hdu4545 魔法串
nyoj1185最大最小值(线段树)
Version Control | Longzhi invites you to go to the GOPS Global Operation and Maintenance Conference to explore the road of large-scale, agile, high-quality and open software development and operation
[7.29-8.5] Review of wonderful technical blog posts in the writing community
How to write a blog with Golang - Milu.blog development summary
2-2.基金的投资交易与结算
Go Quick Start Guide: Basic Types
163_技巧_Power BI 一键批量建立自定义字段参数
Mysql8基础知识
Android 开发用 Kotlin 编程语言一 基本数据类型
对于聚合物聚乙二醇PEG大家了解多少了?以及在生活中的应用
Keras 分割网络自定义评估函数 - mean iou
解决2022Visual Studio中scanf返回值被忽略问题
【着色器实现Flicker“DJ”闪烁效果_Shader效果第十五篇】
Wingide 快捷键
消息中间件汇总
Android development with Kotlin programming language three loop control