Compiler optimization (4): induction variables
2022-07-07 19:52:00 【openEuler】
0. Background
0.1 Loops
Definition
A loop (which LLVM understands as a natural loop) is a set of nodes L in the CFG with the following properties [1][2]:
There is a single entry node, called the header, which dominates all nodes in the loop.
There is a back edge that enters the header.
Related terms
entering block: a node outside the loop that has an edge into the loop. When there is only one entering block and its only outgoing edge goes to the header, it is called the preheader. The preheader is not a node of the loop, and it dominates the whole loop.
latch: a node in the loop that has an edge to the header.
backedge: an edge from a latch to the header.
exiting edge: an edge from inside the loop to outside the loop. The source of the edge is called the exiting block, and the target is called the exit block.

In the right-hand part of the figure above, the yellow region is a loop but the red region is not. Why?
Because in the red region both a and c are entry nodes, which violates the single-entry property.
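To make the terms above concrete, here is a minimal sketch of a counting loop written with explicit labels, one per CFG block. The function name `sum_below` and the label names are illustrative assumptions, not taken from LLVM or from the figures.

```c
/* Sums 0 + 1 + ... + (n - 1). Each label marks one block of the loop's CFG.
 * All names here are illustrative. */
int sum_below(int n) {
    int i, sum;

    /* preheader: the only block outside the loop that branches to the header */
    i = 0;
    sum = 0;
    goto header;

header:                            /* single entry; dominates every loop node */
    if (i >= n) goto exit_block;   /* exiting edge; header is the exiting block */
    sum += i;
    goto latch;

latch:
    ++i;
    goto header;                   /* backedge: latch -> header */

exit_block:                        /* exit block, outside the loop */
    return sum;
}
```

Calling `sum_below(5)` walks the header five times through the backedge and leaves through the exiting edge on the sixth visit.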
0.2 Scalar Evolution (SCEV)
Definition
SCEV is a compiler analysis of variables (usually only those of integer type). It mainly analyzes how variables are updated in a loop, so that optimizations can be driven by this information.
Chain of recurrences
As shown in the figure, the induction variable var in the loop starts at start, is updated by the operator ϕ, and advances by step on each iteration:

Its chain of recurrences (chrec, Chains of Recurrences) is as follows:
var = {start, ϕ , step}
// ϕ∈{+,∗}
// start: starting value
// step: step in each iteration
For example:
int m = 0;
for (int i = 0; i < n; i++) {
    m = m + n;
    *res = m;
}
Then the chain of recurrences of m is: m = {0,+,n}.
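The notation can be read as a closed-form expression in the iteration number. The sketch below is an illustration only: the helper names `chrec_add_at`, `chrec_mul_at`, and `m_at_iteration` are hypothetical, and it assumes iteration 0 denotes the value on loop entry and that no overflow occurs.

```c
#include <stdint.h>

/* Value of the add recurrence {start,+,step} at iteration k:
 * start + step * k (k = 0 is the value on loop entry). */
int64_t chrec_add_at(int64_t start, int64_t step, int64_t k) {
    return start + step * k;
}

/* Value of the multiply recurrence {start,*,step} at iteration k:
 * start * step^k, computed by repeated multiplication. */
int64_t chrec_mul_at(int64_t start, int64_t step, int64_t k) {
    int64_t value = start;
    for (int64_t i = 0; i < k; ++i) {
        value *= step;
    }
    return value;
}

/* The example loop, run for k iterations: returns m as seen at the top of
 * iteration k, before `m = m + n` executes in that iteration. */
int64_t m_at_iteration(int64_t n, int64_t k) {
    int64_t m = 0;
    for (int64_t i = 0; i < k; ++i) {
        m = m + n;
    }
    return m;
}
```

For any k within the trip count, `m_at_iteration(n, k)` and `chrec_add_at(0, n, k)` agree, which is what m = {0,+,n} expresses.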
1. Induction variables
1.1 Definition
An induction variable is a variable that increases or decreases by a fixed amount on each iteration of a loop, or a linear function of another induction variable.
For example [3], in the following loop i and j are both induction variables:
for (i = 0; i < 10; ++i) {
    j = 17 * i;
}
1.2 Benefits
Induction variable optimization has the following benefits, among others:
Replace the original computation with simpler instructions.
For example, once the induction variables in the example above are identified, the multiplication can be replaced with a cheaper addition.
j = -17;
for (i = 0; i < 10; ++i) {
    j = j + 17;
}
Reduce the number of induction variables, which lowers register pressure.
extern int sum;
int foo(int n) {
    int i, j;
    j = 5;
    for (i = 0; i < n; ++i) {
        j += 2;
        sum += j;
    }
    return sum;
}
The loop currently has two induction variables, i and j. After one of them is expressed in terms of the other, the code becomes:
extern int sum;
int foo(int n) {
    int i;
    for (i = 0; i < n; ++i) {
        sum += 5 + 2 * (i + 1);
    }
    return sum;
}
Induction variable substitution makes the relationship between variables and the loop index explicit, which helps other analyses (such as dependence analysis). In the example below, c is expressed as a function of the loop index:
int c, i;
c = 10;
for (i = 0; i < 10; i++) {
    c = c + 5; // c is incremented by 5 for each loop iteration
}
This is converted to:
int c, i;
c = 10;
for (i = 0; i < 10; i++) {
    c = 10 + 5 * (i + 1); // c is explicitly expressed as a function of loop index
}
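The rewrites in this section can be checked by running both forms side by side. The sketch below is an assumption-laden restatement, not compiler output: `sum` is passed as a parameter instead of being an `extern` global so the code is self-contained, and the function names are hypothetical.

```c
/* Original form of foo: two induction variables, i and j. */
int foo_two_ivs(int sum, int n) {
    int j = 5;
    for (int i = 0; i < n; ++i) {
        j += 2;
        sum += j;
    }
    return sum;
}

/* Rewritten form: j is replaced by its value as a function of i. */
int foo_one_iv(int sum, int n) {
    for (int i = 0; i < n; ++i) {
        sum += 5 + 2 * (i + 1);
    }
    return sum;
}

/* Strength reduction example, multiplication form: returns the last j. */
int last_j_mul(int trips) {
    int j = 0;
    for (int i = 0; i < trips; ++i) {
        j = 17 * i;
    }
    return j;
}

/* Strength reduction example, addition form. It matches last_j_mul only
 * when the loop runs at least once (trips >= 1), because j starts at -17. */
int last_j_add(int trips) {
    int j = -17;
    for (int i = 0; i < trips; ++i) {
        j = j + 17;
    }
    return j;
}
```

Both `foo` variants add 7, 9, 11, ... to `sum`, so they return the same value for any `n`; the two `last_j` variants agree whenever the loop body executes at least once.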
2. Practice
2.1 Related compilation options
| Compiler | Option |
|---|---|
| GCC | -fivopts |
| BiSheng | -indvars |
2.2 Optimization example
In LLVM, the induction variable optimization (ivs) is implemented in: llvm\lib\Transforms\Scalar\IndVarSimplify.cpp
Let's walk through a test case to see how the BiSheng compiler performs this optimization.
As shown in the figure below, suppose the body of the upper func is the code to be optimized, and the body of the lower func is the expected result:

Its IR test case, test.ll, is:

The compile command is :
opt test.ll -indvars -S
In this example, the header, latch, and exiting block are all the same basic block (BB), namely bb5.

Step one: following def-use relationships, visit the operands of each phi node in the loop's exit block, compute their final values, and replace the uses of the phi node with those values.
In the example, we compute %tmp2.lcssa. Its only operand is %tmp2 = add nuw nsw i32 %i.01.0, 3, and the loop containing this expression is bb5. The chain of recurrences of %tmp2 is
%tmp2 = {3,+,3}<nuw><nsw><%bb5>
The maximum number of times the loop continues without exiting (the backedge-taken count) is 199999, so on exit %tmp2 = add(3, mul(3, 199999)) = 600000. The pass then checks that this replacement is not expensive (the cost calculation varies by architecture) and substitutes the value into the users of the phi node. The result is as follows:

Step two: visit each exiting block, compute its branch condition, and delete the corresponding instructions according to def-use relationships.
In the example, %0 in br i1 %0, label %bb5, label %bb7 is computed to be false. After the branch instruction is rewritten, %0 = icmp ult i32 %tmp4,200000 has no users, so it is added to the list of dead instructions. The result is as follows:

Step three: delete all dead instructions, and check whether their operands should be deleted as well.
In the example, %tmp4, the operand of %0, still has another user, %x.03.0, so it cannot be deleted as a dead instruction. The result is as follows:

Step four: delete dead phi nodes in the header block.
In the example, %tmp4 and the phi node %x.03.0 form a cycle whose result is never used, so both are deleted. %tmp2 and %i.01.0 are deleted in the same way. The result is as follows:

References
[1] https://llvm.org/docs/LoopTerminology.html
[2] Compilers: Principles, Techniques, and Tools (Chinese edition, 《编译原理》), Alfred V. Aho, Monica S. Lam, Ravi Sethi, et al.; translated by Zhao Jianhua, Zheng Tao, et al.
[3] https://en.wikipedia.org/wiki/Induction_variable


Click on Read the original Start using Bisheng compiler
This article is from the openEuler WeChat official account (openEulercommunity).