当前位置:网站首页>Compiler optimization (4): inductive variables
Compiler optimization (4): inductive variables
2022-07-07 19:52:00 【openEuler】
0. Basic knowledge inventory
0.1 loop (loop)
Definition
loop(llvm It is understood as natural loop) Is defined in CFG A node set in L, And has the following properties [1][2]:
There is a single entry node ( be called header), This node governs loop All the nodes in ; There is a back edge that enters the loop head ;
Related terms
entering block: A non loop The inner node has an edge connected to loop. When there is only one entering block And only one side of it is connected to header, be called preheader; Act as not loop Nodal peheader Dominate the whole loop; latch: There is an edge connected to header; backedge: It's called back side , One from latch To header The edge of ; exiting edge: One side from loop Inward to loop Outside , The starting point of the edge is called exiting block, The target node is called exit block;

In the right picture above , The yellow area is a loop, The red area is not , Why? ?
Because the red area a and c Are all entry nodes , Does not satisfy the nature of a single entry node .
0.2 Scalar Evolution(SCEV)
Definition
SCEV It is the optimization of the compiler to analyze variables ( Often only for integer types ), It is mainly used to analyze how variables are updated in the loop , Then optimize according to this information .
Loop chain
As shown in the figure , Inductive variables in the loop var Starting at start, The way of iteration is ϕ, In steps of step;

Its circular chain (chrec,Chains of Recurrences) as follows :
var = {start, ϕ , step}
// ϕ∈{+,∗}
// start: starting value
// step: step in each iteration
for instance :
int m = 0;
for (int i = 0; i < n; i++) {
m = m + n;
*res = m;
}
that m The cycle chain of is :m = {0,+,n}.
1. Induction Variable( Inductive variables )
1.1 Definition
Each iteration of the loop increases or decreases a fixed amount of variables , Or another linear function of inductive variables .
for instance [3], In the following cycle i and j Are inductive variables :
for (i = 0; i < 10; ++i) {
j = 17 * i;
}
1.2 benefit
Summarize the benefits of variable optimization , There are but not limited to the following points :
Replace the original calculation method with simpler instructions .
such as , Inductive variables are identified in the above example , Replace the corresponding multiplication with a less expensive addition .j = -17;
for (i = 0; i < 10; ++i) {
j = j + 17;
}Reduce the number of inductive variables , Reduce register pressure .
extern int sum;
int foo(int n) {
int i, j;
j = 5;
for (i = 0; i < n; ++i) {
j += 2;
sum += j;
}
return sum;
}Current loop There are two inductive variables :i、j, Use one of the variables to express the other post , as follows :
extern int sum;
int foo(int n) {
int i;
for (i = 0; i < n; ++i) {
sum += 5 + 2 * (i + 1);
}
return sum;
}Inductive variable substitution , Make the relationship between variables and circular indexes clear , It is convenient for other optimization analysis ( Such as dependency analysis ). Examples are as follows , take c Expressed as a function related to circular index :
int c, i;
c = 10;
for (i = 0; i < 10; i++) {
c = c + 5; // c is incremented by 5 for each loop iteration
}Convert to :
int c, i;
c = 10;
for (i = 0; i < 10; i++) {
c = 10 + 5 * (i + 1); // c is explicitly expressed as a function of loop index
}
2. practice
2.1 Related compilation options
| compiler | option |
|---|---|
| gcc | -fivopt |
| Bi Sheng | -indvars |
2.2 Optimize use cases
Optimization of inductive variables (ivs) stay llvm The position in is :llvm\lib\Transforms\Scalar\IndVarSimplify.cpp
Let's pass a use case , Take a look at the optimization process of Bisheng compiler .
Here's the picture , Suppose that func The inner part is the code to be optimized , below func Inside is the expected result :

its IR Use cases test.ll yes :

The compile command is :
opt test.ll -indvars -S
In the current example ,header、latch and exiting block It's all the same BB, namely bb5.

Step one : basis def-use Relationship , Traverse loop Of ExitBlock in phi The source of the operand of the node , Calculate the final value and replace it , Then replace the phi Use of nodes .
In the example , Calculation %tmp2.lcssa , Its only operand is %tmp2 = add nuw nsw i32 %i.01.0, 3 , Where the expression is located loop yes bb5, here %tmp2 The cycle chain of is
%tmp2 = {3,+,3}<nuw><nsw><%bb5>
Get current loop The maximum value of not exiting the loop is 199999, Now %tmp2=add(3, mul(3,199999))=600000; Next, we will see that the current replacement is not expensive ( The calculation of cost will vary according to different architectures ), At the same time phi Nodal user Replace the value in . The optimization results are as follows :

Step two : Traverse ExitingBlock , Calculate the jump condition , basis def-use The relationship between , Delete the corresponding instruction .
In the example , To calculate the br i1 %0, label %bb5, label %bb7 Of %0 yes false, After the jump instruction is replaced ,%0 = icmp ult i32 %tmp4,200000 non-existent user, Add it to “ Dead order ” in . The optimization results are as follows :

Step three : Delete all “ Dead order ”, And see if his operands should be deleted .
In the example , As %0 Of operands %tmp4 And others user %x.03.0, So it can't be regarded as “ Dead order ” Be deleted . The optimization results are as follows :

Step four : Delete HeaderBlock Medium “ die ”phi node .
In the example , %tmp4 and phi node %x.03.0 It forms a cycle without results , Will delete them , Delete... In the same way %tmp2 and %i.01.0 . The optimization results are as follows :

Reference resources
[1] https://llvm.org/docs/LoopTerminology.html
[2] 《 Compiler principle 》 [ beautiful ]Alfred V.Aho,[ beautiful ]Monica S.Lam,[ beautiful ]Ravi Sethi Waiting , Zhao Jianhua , Translated by Zheng Tao, et al
[3] https://en.wikipedia.org/wiki/Induction_variable


Click on Read the original Start using Bisheng compiler
This article is from WeChat official account. - openEuler(openEulercommunity).
If there is any infringement , Please contact the [email protected] Delete .
Participation of this paper “OSC Source creation plan ”, You are welcome to join us , share .
边栏推荐
- L1-025 positive integer a+b (Lua)
- Time tools
- 【剑指offer】剑指 Offer II 012. 左右两边子数组的和相等
- Kunpeng developer summit 2022 | Kirin Xin'an and Kunpeng jointly build a new ecosystem of computing industry
- A pot of stew, a collection of common commands of NPM and yarn cnpm
- Numpy——2. Shape of array
- How to open an account for stock speculation? Excuse me, is it safe to open a stock account by mobile phone?
- What does "true" mean
- [Verilog advanced challenge of Niuke network question brushing series] ~ multi bit MUX synchronizer
- 强化学习-学习笔记8 | Q-learning
猜你喜欢

杰理之手动配对方式【篇】

模拟实现string类

648. 单词替换

Automatic classification of defective photovoltaic module cells in electroluminescence images-論文閱讀筆記

State mode - Unity (finite state machine)

转置卷积理论解释(输入输出大小分析)

openEuler 资源利用率提升之道 01:概论

PV static creation and dynamic creation

Netease Yunxin participated in the preparation of the standard "real time audio and video service (RTC) basic capability requirements and evaluation methods" issued by the Chinese Academy of Communica

Install mysql8 for Linux X ultra detailed graphic tutorial
随机推荐
IP 工具类
论文解读(ValidUtil)《Rethinking the Setting of Semi-supervised Learning on Graphs》
A pot of stew, a collection of common commands of NPM and yarn cnpm
R语言ggplot2可视化:使用ggpubr包的ggdensity函数可视化分组密度图、使用stat_overlay_normal_density函数为每个分组的密度图叠加正太分布曲线
Responsibility chain model - unity
Le PGR est - il utile au travail? Comment choisir une plate - forme fiable pour économiser le cœur et la main - d'œuvre lors de la préparation de l'examen!!!
干货分享|DevExpress v22.1原版帮助文档下载集合
Training IX basic configuration of network services
Empowering smart power construction | Kirin Xin'an high availability cluster management system to ensure the continuity of users' key businesses
Kirin Xin'an with heterogeneous integration cloud financial information and innovation solutions appeared at the 15th Hunan Financial Technology Exchange Conference
关于ssh登录时卡顿30s左右的问题调试处理
J ü rgen schmidhub reviews the 25th anniversary of LSTM papers: long short term memory All computable metaverses. Hierarchical reinforcement learning (RL). Meta-RL. Abstractions in generative adversar
模拟实现string类
8 CAS
Notes...
让这个 CRMEB 单商户微信商城系统火起来,太好用了!
My creation anniversary
爬虫实战(七):爬王者英雄图片
索引总结(突击版本)
LC:字符串转换整数 (atoi) + 外观数列 + 最长公共前缀