当前位置:网站首页>Compiler optimization (4): inductive variables
Compiler optimization (4): inductive variables
2022-07-07 19:52:00 【openEuler】
0. Basic knowledge inventory
0.1 loop (loop)
Definition
loop(llvm It is understood as natural loop) Is defined in CFG A node set in L, And has the following properties [1][2]:
There is a single entry node ( be called header), This node governs loop All the nodes in ; There is a back edge that enters the loop head ;
Related terms
entering block: A non loop The inner node has an edge connected to loop. When there is only one entering block And only one side of it is connected to header, be called preheader; Act as not loop Nodal peheader Dominate the whole loop; latch: There is an edge connected to header; backedge: It's called back side , One from latch To header The edge of ; exiting edge: One side from loop Inward to loop Outside , The starting point of the edge is called exiting block, The target node is called exit block;

In the right picture above , The yellow area is a loop, The red area is not , Why? ?
Because the red area a and c Are all entry nodes , Does not satisfy the nature of a single entry node .
0.2 Scalar Evolution(SCEV)
Definition
SCEV It is the optimization of the compiler to analyze variables ( Often only for integer types ), It is mainly used to analyze how variables are updated in the loop , Then optimize according to this information .
Loop chain
As shown in the figure , Inductive variables in the loop var Starting at start, The way of iteration is ϕ, In steps of step;

Its circular chain (chrec,Chains of Recurrences) as follows :
var = {start, ϕ , step}
// ϕ∈{+,∗}
// start: starting value
// step: step in each iteration
for instance :
int m = 0;
for (int i = 0; i < n; i++) {
m = m + n;
*res = m;
}
that m The cycle chain of is :m = {0,+,n}.
1. Induction Variable( Inductive variables )
1.1 Definition
Each iteration of the loop increases or decreases a fixed amount of variables , Or another linear function of inductive variables .
for instance [3], In the following cycle i and j Are inductive variables :
for (i = 0; i < 10; ++i) {
j = 17 * i;
}
1.2 benefit
Summarize the benefits of variable optimization , There are but not limited to the following points :
Replace the original calculation method with simpler instructions .
such as , Inductive variables are identified in the above example , Replace the corresponding multiplication with a less expensive addition .j = -17;
for (i = 0; i < 10; ++i) {
j = j + 17;
}Reduce the number of inductive variables , Reduce register pressure .
extern int sum;
int foo(int n) {
int i, j;
j = 5;
for (i = 0; i < n; ++i) {
j += 2;
sum += j;
}
return sum;
}Current loop There are two inductive variables :i、j, Use one of the variables to express the other post , as follows :
extern int sum;
int foo(int n) {
int i;
for (i = 0; i < n; ++i) {
sum += 5 + 2 * (i + 1);
}
return sum;
}Inductive variable substitution , Make the relationship between variables and circular indexes clear , It is convenient for other optimization analysis ( Such as dependency analysis ). Examples are as follows , take c Expressed as a function related to circular index :
int c, i;
c = 10;
for (i = 0; i < 10; i++) {
c = c + 5; // c is incremented by 5 for each loop iteration
}Convert to :
int c, i;
c = 10;
for (i = 0; i < 10; i++) {
c = 10 + 5 * (i + 1); // c is explicitly expressed as a function of loop index
}
2. practice
2.1 Related compilation options
| compiler | option |
|---|---|
| gcc | -fivopt |
| Bi Sheng | -indvars |
2.2 Optimize use cases
Optimization of inductive variables (ivs) stay llvm The position in is :llvm\lib\Transforms\Scalar\IndVarSimplify.cpp
Let's pass a use case , Take a look at the optimization process of Bisheng compiler .
Here's the picture , Suppose that func The inner part is the code to be optimized , below func Inside is the expected result :

its IR Use cases test.ll yes :

The compile command is :
opt test.ll -indvars -S
In the current example ,header、latch and exiting block It's all the same BB, namely bb5.

Step one : basis def-use Relationship , Traverse loop Of ExitBlock in phi The source of the operand of the node , Calculate the final value and replace it , Then replace the phi Use of nodes .
In the example , Calculation %tmp2.lcssa , Its only operand is %tmp2 = add nuw nsw i32 %i.01.0, 3 , Where the expression is located loop yes bb5, here %tmp2 The cycle chain of is
%tmp2 = {3,+,3}<nuw><nsw><%bb5>
Get current loop The maximum value of not exiting the loop is 199999, Now %tmp2=add(3, mul(3,199999))=600000; Next, we will see that the current replacement is not expensive ( The calculation of cost will vary according to different architectures ), At the same time phi Nodal user Replace the value in . The optimization results are as follows :

Step two : Traverse ExitingBlock , Calculate the jump condition , basis def-use The relationship between , Delete the corresponding instruction .
In the example , To calculate the br i1 %0, label %bb5, label %bb7 Of %0 yes false, After the jump instruction is replaced ,%0 = icmp ult i32 %tmp4,200000 non-existent user, Add it to “ Dead order ” in . The optimization results are as follows :

Step three : Delete all “ Dead order ”, And see if his operands should be deleted .
In the example , As %0 Of operands %tmp4 And others user %x.03.0, So it can't be regarded as “ Dead order ” Be deleted . The optimization results are as follows :

Step four : Delete HeaderBlock Medium “ die ”phi node .
In the example , %tmp4 and phi node %x.03.0 It forms a cycle without results , Will delete them , Delete... In the same way %tmp2 and %i.01.0 . The optimization results are as follows :

Reference resources
[1] https://llvm.org/docs/LoopTerminology.html
[2] 《 Compiler principle 》 [ beautiful ]Alfred V.Aho,[ beautiful ]Monica S.Lam,[ beautiful ]Ravi Sethi Waiting , Zhao Jianhua , Translated by Zheng Tao, et al
[3] https://en.wikipedia.org/wiki/Induction_variable


Click on Read the original Start using Bisheng compiler
This article is from WeChat official account. - openEuler(openEulercommunity).
If there is any infringement , Please contact the [email protected] Delete .
Participation of this paper “OSC Source creation plan ”, You are welcome to join us , share .
边栏推荐
- R language dplyr package select function, group_ The by function, filter function and do function obtain the third largest value of a specific numerical data column in a specified level in a specified
- L1-025 positive integer a+b (Lua)
- 位运算介绍
- Kunpeng developer summit 2022 | Kirin Xin'an and Kunpeng jointly build a new ecosystem of computing industry
- Numpy——axis
- Is PMP beneficial to work? How to choose a reliable platform to make it easier to prepare for the exam!!!
- 剑指 Offer II 013. 二维子矩阵的和
- Make this crmeb single merchant wechat mall system popular, so easy to use!
- 一锅乱炖,npm、yarn cnpm常用命令合集
- ASP. Net gymnasium integrated member management system source code, free sharing
猜你喜欢

编译器优化那些事儿(4):归纳变量
Make this crmeb single merchant wechat mall system popular, so easy to use!

Empowering smart power construction | Kirin Xin'an high availability cluster management system to ensure the continuity of users' key businesses

8 CAS

State mode - Unity (finite state machine)

Flink并行度和Slot详解

The strength index of specialized and new software development enterprises was released, and Kirin Xin'an was honored on the list

2022.07.04

一张图深入的理解FP/FN/Precision/Recall

8 CAS
随机推荐
线性基
Matplotlib drawing 3D graphics
Kirin Xin'an won the bid for the new generation dispatching project of State Grid!
Visual Studio 插件之CodeMaid自动整理代码
网信办公布《数据出境安全评估办法》,9 月 1 日起施行
Install mysql8 for Linux X ultra detailed graphic tutorial
Jürgen Schmidhuber回顾LSTM论文等发表25周年:Long Short-Term Memory. All computable metaverses. Hierarchical reinforcement learning (RL). Meta-RL. Abstractions in generative adversarial RL. Soccer learn
[Verilog advanced challenge of Niuke network question brushing series] ~ multi bit MUX synchronizer
Kirin Xin'an joins Ningxia commercial cipher Association
MySQL、sqlserver oracle数据库连接方式
Longest common prefix (leetcode question 14)
openEuler 有奖捉虫活动,来参与一下?
Netease Yunxin participated in the preparation of the standard "real time audio and video service (RTC) basic capability requirements and evaluation methods" issued by the Chinese Academy of Communica
Dynamic addition of El upload upload component; El upload dynamically uploads files; El upload distinguishes which component uploads the file.
8 CAS
A pot of stew, a collection of common commands of NPM and yarn cnpm
R language ggplot2 visualization: use the ggecdf function of ggpubr package to visualize the grouping experience cumulative density distribution function curve, and the linetype parameter to specify t
R语言ggplot2可视化:使用ggpubr包的ggstripchart函数可视化分组点状条带图(dot strip plot)、设置position参数配置不同分组数据点的分离程度
The strength index of specialized and new software development enterprises was released, and Kirin Xin'an was honored on the list
Tips and tricks of image segmentation summarized from 39 Kabul competitions