Loop Unrolling in Embedded C
2022-07-27 19:38:00 【WangLanguager】
1. Analysis:
Each iteration adds two instructions of loop overhead to the loop body: a subtraction (1 machine cycle) and a branch (3 machine cycles), 4 machine cycles in total.
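For reference, the loop overhead described above can be seen in a plain, non-unrolled checksum loop. This is only a sketch: the name checksum_baseline is an assumption and does not appear in the original article, and the cycle counts in the comments are the ones quoted in the analysis.

```c
/* Hypothetical baseline (the name is an assumption): one element per iteration.
 * Like any do-while loop, it assumes n >= 1. */
int checksum_baseline(const int *data, unsigned int n)
{
    int sum = 0;
    do {
        sum += *(data++);   /* useful work: one load and one add */
        n--;                /* overhead: subtraction (1 machine cycle) */
    } while (n != 0);       /* overhead: branch (3 machine cycles) */
    return sum;
}
```

Every element processed pays the full 4 cycles of overhead, which is what unrolling tries to amortize.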
2. Improvement:
(1) Repeat the loop body several times per iteration to reduce the proportion of overhead.
(2) More code in the loop body means fewer iterations.
3. Example of loop unrolling:
int checksum_v9(int *data, unsigned int N)
{
    int sum = 0;
    do
    {
        sum += *(data++);
        sum += *(data++);
        sum += *(data++);
        sum += *(data++);
        N -= 4; // Assumes the element count N is a nonzero multiple of 4
    } while (N != 0);
    return sum;
}
The assembly code for the above function is:
checksum_v9_s
    MOV  r2,#0            ;sum = 0
checksum_v9_loop
    LDR  r3,[r0],#4       ;r3 = *(data++)
    SUBS r1,r1,#4         ;N -= 4 and set flags
    ADD  r2,r3,r2         ;sum += r3
    LDR  r3,[r0],#4       ;r3 = *(data++)
    ADD  r2,r3,r2         ;sum += r3
    LDR  r3,[r0],#4       ;r3 = *(data++)
    ADD  r2,r3,r2         ;sum += r3
    LDR  r3,[r0],#4       ;r3 = *(data++)
    ADD  r2,r3,r2         ;sum += r3
    BNE  checksum_v9_loop ;if (N != 0) goto loop
    MOV  r0,r2            ;r0 = sum
    MOV  pc,r14           ;return r0
4. Discussion
After unrolling, the total loop overhead falls from 4N machine cycles to N machine cycles: each iteration still costs 4 cycles of overhead, but there are only a quarter as many iterations. The smaller the loop body, the more pronounced the effect; it can nearly double the speed.
Assuming N = 20, the code executes 83 instructions before optimization and 53 after.
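The two instruction counts for N = 20 can be reproduced with a little arithmetic. The sketch below assumes the non-unrolled loop compiles to four instructions per element (LDR, SUBS, ADD, BNE), which is consistent with the figure of 83 but is not shown in the article. The function names are hypothetical.

```c
/* Instructions executed by the one-element-per-iteration loop (assumed shape):
 * 1 setup MOV + 4 per element (LDR, SUBS, ADD, BNE) + 2 to return. */
unsigned int instr_count_rolled(unsigned int n)
{
    return 1u + 4u * n + 2u;
}

/* Instructions executed by the 4x unrolled loop shown in the listing:
 * 1 setup MOV + 10 per iteration (4 LDR, 4 ADD, SUBS, BNE) + 2 to return.
 * Assumes n is a multiple of 4. */
unsigned int instr_count_unrolled4(unsigned int n)
{
    return 1u + 10u * (n / 4u) + 2u;
}
```

For n = 20 these give 1 + 80 + 2 = 83 and 1 + 50 + 2 = 53, matching the numbers quoted above.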
5. Questions
(1) How many times should the loop be unrolled?
(2) What if the number of elements is not a multiple of 4?
6. Trade-offs of loop unrolling
(1) Disadvantages of loop unrolling
① The code grows and takes up more memory
② It occupies more cache space
(2) So each case needs to be analyzed on its own merits to find the right balance
7. Example:
Suppose the loop body takes 128 machine cycles to execute and the loop overhead is the usual 4 machine cycles, about 3% of the loop. If the loop accounts for 30% of the program's execution time, the loop overhead is only about 1% of the total. Unrolling such a loop brings little performance gain.
Unrolling can also evict useful contents from the cache and cause thrashing, which makes program performance drop sharply.
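The percentages in this example follow from a single division and a scaling step. As a rough sketch (the function name and parameters are assumptions introduced for illustration), the share of total run time spent on loop overhead can be estimated like this:

```c
/* Estimated share (in percent) of total run time spent on loop overhead.
 * body_cycles:        cycles spent in the loop body per iteration
 * overhead_cycles:    cycles spent on the subtract and branch per iteration
 * loop_share_percent: how much of the whole program's time the loop takes
 * The ratio overhead/body mirrors the approximation used in the text. */
double loop_overhead_percent(double body_cycles,
                             double overhead_cycles,
                             double loop_share_percent)
{
    return overhead_cycles / body_cycles * loop_share_percent;
}
```

With a 128-cycle body and 4 cycles of overhead, the result is 3.125 when the loop is the whole program (100) and 0.9375 when the loop takes 30 percent of the run time, which is where the "about 3%" and "about 1%" figures come from.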
8. When the number of elements is not a multiple of 4
int checksum_v10(int *data, unsigned int N)
{
    unsigned int i;
    int sum = 0;
    for (i = N / 4; i != 0; i--)
    {
        sum += *(data++);
        sum += *(data++);
        sum += *(data++);
        sum += *(data++);
    }
    for (i = N & 3; i != 0; i--)
    {
        sum += *(data++); // Handles the remaining N % 4 elements
    }
    return sum;
}
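The two loop bounds in checksum_v10 split N into full groups of four plus a remainder. The sketch below isolates that arithmetic; the struct and function names are hypothetical and not part of the original code.

```c
/* How checksum_v10 divides the work: full groups of four, then leftovers. */
struct unroll_split {
    unsigned int blocks; /* iterations of the unrolled loop: N / 4 */
    unsigned int tail;   /* iterations of the cleanup loop:  N & 3 */
};

struct unroll_split split_by_4(unsigned int n)
{
    struct unroll_split s;
    s.blocks = n / 4u;
    s.tail = n & 3u;     /* same as n % 4 for unsigned values */
    return s;
}
```

For example, N = 10 gives 2 unrolled iterations (8 elements) and 2 cleanup iterations, and N = 0 gives zero of each, so unlike the do-while version this form is safe for an empty array.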
9. Conclusion
(1) Loop counters should count down. When the counter is unsigned, terminate on (i != 0), not (i >= 0), which is always true for an unsigned value.
(2) If the loop is known to execute at least once, use a do {} while loop.
(3) Small loops can be unrolled to reduce loop overhead.
(4) Try to make the array size a multiple of the unroll factor.
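Points (1) and (4) can be illustrated with two small helpers. Both are sketches with hypothetical names. The second assumes a nonzero factor and that n + factor - 1 does not overflow.

```c
/* Point (1): count down with an unsigned counter and stop on i != 0.
 * Writing the condition as i >= 0 would never terminate, because an
 * unsigned value is always >= 0 and wraps around after reaching 0. */
unsigned int count_down_iterations(unsigned int n)
{
    unsigned int i;
    unsigned int iterations = 0;
    for (i = n; i != 0; i--) {
        iterations++;
    }
    return iterations;
}

/* Point (4): round an element count up to the next multiple of the
 * unroll factor, e.g. to size a padded buffer.
 * Assumes factor != 0 and no overflow in n + factor - 1. */
unsigned int round_up_to_multiple(unsigned int n, unsigned int factor)
{
    return (n + factor - 1u) / factor * factor;
}
```

If a buffer is padded this way, the extra elements have to be filled with a value that does not change the result (zero for a sum), and then the cleanup loop can be dropped.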