Recommended: how do you file a high-quality issue? Here is how a top contributor wrote one!
2022-06-26 23:56:00 [Shengsi MindSpore]

Introduction
This post recommends a high-quality issue from the Shengsi MindSpore community for reference and study. The original text is on Gitee. For more issues, see:
https://gitee.com/mindspore/mindspore/issues
Overview
The UB fusion process is as follows:
Match the operators to be fused through a Pass, and set the fusion id as an attribute of each matched node.
Fuse the matched small operators into a FusionOp.
Initialize the fusion information, including determining which nodes each fusion scope contains and which nodes are its inputs and outputs.
Check whether each fusion scope would form a cycle (by calling the CheckCircle function).
Compile the fusion operators. A fusion scope that fails to compile is excluded from UB fusion.
For each fusion scope, first check whether it forms a cycle (by calling the CheckCircle function). If it does not, create the fusion operator and substitute it into the original graph.
Measurements show that the CheckCircle function accounts for 86% of the time spent in UB fusion, so CheckCircle needs to be optimized.
The graph contains no cycle before UB fusion, but fusion can create one, as shown in the figure below:

After nodes A, B and C are fused into E, E and D form a cycle.
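The effect can be reproduced on a toy graph. The sketch below is only an illustration: the figure is not available here, so the edges (A->B->C plus A->D->C) are an assumption, and `fuse` and `has_cycle` are hypothetical helpers, not MindSpore functions. It contracts A, B and C into E and then runs a standard topological-sort cycle test.

```python
from collections import defaultdict


def fuse(edges, scope, fused_name):
    """Contract every node in `scope` into one node called `fused_name`."""
    fused_edges = set()
    for src, dst in edges:
        src = fused_name if src in scope else src
        dst = fused_name if dst in scope else dst
        if src != dst:  # edges internal to the scope disappear
            fused_edges.add((src, dst))
    return fused_edges


def has_cycle(edges):
    """Kahn's algorithm: a cycle exists if some node never reaches in-degree 0."""
    succ = defaultdict(set)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        nodes.update((src, dst))
        if dst not in succ[src]:
            succ[src].add(dst)
            indegree[dst] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    processed = 0
    while ready:
        node = ready.pop()
        processed += 1
        for nxt in succ[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return processed != len(nodes)


# Assumed example graph: D sits on a side path between A and C.
edges = {("A", "B"), ("B", "C"), ("A", "D"), ("D", "C")}
fused = fuse(edges, {"A", "B", "C"}, "E")

print(has_cycle(edges))  # False: the original graph is acyclic
print(has_cycle(fused))  # True: E -> D and D -> E
```

The cycle appears because D both consumes an output of the fused region (A->D) and feeds an input of it (D->C).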

Optimization points
Optimization point 1
The flow above contains two cycle checks:
The first check runs before the operators are compiled. Its purpose is to keep compilation from taking too long: the cycle check rules out unnecessary operator compilation in advance, which improves performance. In the test data, CheckCircle() was called 1719 times at this stage and found a cycle only 51 times, a cycle rate of about 3%, so this check can be deleted.
The second check runs before each FusionOp is substituted into the graph. This check cannot be deleted.
Optimization: delete the first cycle check.

Optimization point 2
The current cycle check starts from the input nodes of the fusion scope (C and D in the figure below) and traverses their predecessor nodes (in the figure below, the predecessor of both C and D is B).
As shown in the figure below, the check first traverses C->B->A from input C, and then traverses D->B->A from input D, so B and A are visited repeatedly.

So, when checking a fusion scope, the nodes already visited can be recorded to avoid repeated visits.
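A minimal sketch of the idea, assuming the check walks backwards from each input of the scope and reports a cycle when it reaches a node inside the scope. The function name `check_cycle`, the `preds` mapping and the node names `F1`/`F2` are hypothetical and do not come from the MindSpore source. The `share_visited` flag switches between the repeated traversal and the version that records visited nodes.

```python
def check_cycle(scope, inputs, preds, share_visited=True):
    """Walk backwards from each input of the fusion scope.

    Returns (forms_cycle, number_of_node_visits).
    """
    visits = 0
    visited = set()
    for start in inputs:
        if not share_visited:
            visited = set()  # forget earlier traversals (the repeated-visit behaviour)
        stack = [start]
        while stack:
            node = stack.pop()
            if node in visited:
                continue
            visited.add(node)
            visits += 1
            if node in scope:
                # A predecessor of an input lies inside the scope: fusing would close a loop.
                return True, visits
            stack.extend(preds.get(node, ()))
    return False, visits


# Assumed graph matching the text: A -> B, B -> C, B -> D; C and D feed the scope.
preds = {"C": ["B"], "D": ["B"], "B": ["A"], "A": []}
scope = {"F1", "F2"}

print(check_cycle(scope, ["C", "D"], preds, share_visited=False))  # (False, 6)
print(check_cycle(scope, ["C", "D"], preds))                       # (False, 4)
```

Sharing the visited set does not change the answer: a node that was already explored without reaching the scope cannot lead to it on a second visit.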
Optimization point 3
At present, UB fusion patterns are basically single-input, single-output structures:

This property allows some optimizations that avoid unnecessary checks.
If all inputs are passed to the first entry node in the fusion scope (node C in the two scenarios in the figure below), the fusion cannot form a cycle, and the cycle check can be skipped.

If the inputs are passed to different entry nodes (C and D in the left graph), or to a node that is not the first entry node, a cycle may form. The nodes inside a fusion scope are topologically sorted, and by that order C in the right graph is not the first entry node.

Considering the characteristics of UB fusion patterns and the implementation complexity, the following rule is used: if all inputs are passed to the first node in the fusion scope, skip the cycle check.
Because the nodes are already topologically sorted, the first node must be an entry node.
This condition is stricter than necessary: some other scenarios also need no check. Those scenarios make up a very small share of UB fusion, so they are not supported for now.
In the measurements, the fusion scopes that satisfy this condition account for 1616/1719 = 94%.
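The rule can be sketched as a cheap pre-check that runs before the full traversal. This is an illustration under assumptions: `all_inputs_enter_first_node` is a hypothetical name, `scope_nodes` is assumed to be the scope's node list in topological order, and the three small graphs stand in for the missing figures.

```python
def all_inputs_enter_first_node(scope_nodes, preds):
    """Return True if no node after the first one has a predecessor outside the scope.

    `scope_nodes` must be topologically sorted, so scope_nodes[0] is an entry node.
    """
    scope = set(scope_nodes)
    for node in scope_nodes[1:]:
        if any(pred not in scope for pred in preds.get(node, ())):
            return False  # an external input enters at a later node
    return True


# Single entry: the external input X feeds only the first node C.
print(all_inputs_enter_first_node(["C", "D"], {"C": ["X"], "D": ["C"]}))   # True

# Two entry nodes: X feeds C and Y feeds D.
print(all_inputs_enter_first_node(["C", "D"], {"C": ["X"], "D": ["Y"]}))   # False

# The external input X feeds C, which is not first in topological order.
print(all_inputs_enter_first_node(["B", "C"], {"B": [], "C": ["B", "X"]}))  # False
```

When the function returns True the full cycle check can be skipped. When it returns False the result is only "unknown", and the regular check still has to run.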
Optimization point 4
If all outputs are produced by the last exit node in the fusion scope (node B on the left and node C on the right in the figure below), the fusion cannot form a cycle, and the cycle check can be skipped.

If the outputs are produced by different exit nodes (B and D in the left graph) or by an intermediate node (B in the right graph), a cycle may form.

Considering the characteristics of UB fusion patterns and the implementation complexity, the following rule is used: if all outputs are produced by the last exit node in the fusion scope, skip the cycle check.
In the measurements, the fusion scopes that satisfy this condition account for 1617/1719 = 94%, essentially the same share as for optimization point 3.
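The output-side rule mirrors the input-side one. The sketch below is hypothetical in the same way: `all_outputs_leave_last_node` and the `succs` mapping are illustrative names, `scope_nodes` is assumed to be topologically sorted, and the example graphs are assumptions because the figures are not available.

```python
def all_outputs_leave_last_node(scope_nodes, succs):
    """Return True if no node before the last one has a successor outside the scope.

    `scope_nodes` must be topologically sorted, so scope_nodes[-1] is an exit node.
    """
    scope = set(scope_nodes)
    for node in scope_nodes[:-1]:
        if any(succ not in scope for succ in succs.get(node, ())):
            return False  # an output leaves the scope before the last node
    return True


# Single exit: only the last node B has a consumer (Y) outside the scope.
print(all_outputs_leave_last_node(["A", "B"], {"A": ["B"], "B": ["Y"]}))  # True

# Two exit nodes: both B and D have consumers outside the scope.
print(all_outputs_leave_last_node(["B", "D"], {"B": ["D", "Y"], "D": ["Z"]}))  # False

# The intermediate node B has a consumer (Y) outside the scope.
print(all_outputs_leave_last_node(
    ["A", "B", "C"], {"A": ["B"], "B": ["C", "Y"], "C": ["Z"]}))  # False
```

As with the input-side rule, a False result does not prove that a cycle exists. It only means the shortcut does not apply and the regular check is still needed.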
Verification results
Times in the following table are in seconds.



MindSpore official information
GitHub: https://github.com/mindspore-ai/mindspore
Gitee: https://gitee.com/mindspore/mindspore
Official QQ group: 486831414