PAN: an in-depth look at attention mechanisms in CV
2022-07-03 18:37:00 【Strawberry sauce toast】
Summary of Attention Mechanisms in CV (Part 3): PAN
PAN: Pyramid Attention Network
Paper: 《Pyramid Attention Network for Semantic Segmentation》
1. Abstract
[Figure: PAN network structure]
Pyramid Attention Network (PAN) combines an attention mechanism with a spatial pyramid to extract dense features more precisely and improve the accuracy of semantic segmentation. The problems this method sets out to solve (its motivation) and its main contributions are as follows:
1.1 The Feature Pyramid Attention (FPA) module
- Motivation
"The existence of objects at multiple scales cause difficulty in classification of categories."
Objects of the same category can appear at very different sizes, which makes classification harder. Existing algorithms use dilated convolution (ASPP) or pyramid pooling (PSPNet) to improve classification accuracy, but they suffer from gridding artifacts and from loss of pixel-level localization information, respectively.
- FPA module
"FPA module performs spatial pyramid attention structure on high-level output and combine global pooling to learn a better feature representation."
(The module combines a spatial pyramid with global pooling and is applied to the output of the deep layers of the network.)
1.2 The Global Attention Upsample (GAU) module
- Motivation
"High-level features are skilled in making category classification, while weak in restructuring original resolution binary prediction."
Deep layers extract more accurate category information but lose localization information, which hurts prediction of where objects are.
- GAU module
"GAU module acts on each decoder layer to provide global context as a guidance of low-level features to select category localization details."
(Deep layers have a larger receptive field and carry more accurate category information, while shallow layers carry more accurate localization information. Combining the two improves segmentation.)
2. Module details
2.1 Feature Pyramid Attention
① Extract local features (pixel-wise)
The FPA module applies convolution kernels of three sizes, $3\times 3$, $5\times 5$ and $7\times 7$, to further extract the features contained in the feature map, then adds the feature maps from the three scales together step by step to fuse them.
② Extract global features (channel-wise)
A global pooling branch is added to extract channel-wise features. (The idea is similar to the SE module; the difference is that only one convolution layer is used here, whereas SE uses two.)
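The two branches described above can be sketched in PyTorch roughly as follows. This is a minimal sketch under stated assumptions, not the reference implementation: the names `FPASketch`, `conv_bn_relu` and `resize_to`, the stride-2 downsampling, the bilinear upsampling, and keeping the channel count unchanged are all illustrative choices. Multiplying the fused pyramid map with a $1\times 1$-convolved copy of the input follows the FPA figure in the paper as I read it, not the summary above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_bn_relu(in_ch, out_ch, kernel_size, stride=1):
    """Hypothetical helper: Conv2d + BatchNorm2d + ReLU with kernel_size // 2 padding."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size, stride=stride,
                  padding=kernel_size // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


def resize_to(x, ref):
    """Bilinearly resize x to the spatial size of ref (an assumed upsampling choice)."""
    return F.interpolate(x, size=ref.shape[-2:], mode="bilinear", align_corners=False)


class FPASketch(nn.Module):
    """Simplified Feature Pyramid Attention block (illustrative only)."""

    def __init__(self, channels):
        super().__init__()
        # Pyramid branch: 7x7, 5x5 and 3x3 kernels, each halving the resolution.
        self.conv7 = conv_bn_relu(channels, channels, 7, stride=2)
        self.conv5 = conv_bn_relu(channels, channels, 5, stride=2)
        self.conv3 = conv_bn_relu(channels, channels, 3, stride=2)
        # 1x1 convolution on the input; its output is weighted by the pyramid map.
        self.conv1 = conv_bn_relu(channels, channels, 1)
        # Global branch: a single 1x1 convolution after global average pooling.
        self.global_conv = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        p7 = self.conv7(x)
        p5 = self.conv5(p7)
        p3 = self.conv3(p5)
        # Add and fuse step by step, from the coarsest scale to the finest.
        fused = p5 + resize_to(p3, p5)
        fused = p7 + resize_to(fused, p7)
        attention = resize_to(fused, x)
        # Channel-wise global context, shape (N, C, 1, 1), broadcast over H and W.
        global_context = self.global_conv(F.adaptive_avg_pool2d(x, 1))
        return self.conv1(x) * attention + global_context


# Usage: the block keeps the shape of its input.
fpa = FPASketch(channels=8).eval()
x = torch.randn(2, 8, 32, 32)
with torch.no_grad():
    y = fpa(x)
print(y.shape)  # torch.Size([2, 8, 32, 32])
```

Because every resize targets the size of an existing tensor, the sketch also works when the input height or width is odd.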
2.2 Global Attention Upsample
Main idea:
"The main character of decoder module is to repair category pixel localization. Furthermore, high-level features with abundant category information can be used to weight low-level information to select precise resolution details."
The purpose of decoding is to recover which pixels belong to each category, that is, to achieve pixel-level classification. The output of the deep layers carries richer category information, while the output of the shallow layers carries more precise localization information. We can therefore use the category features extracted by the deep layers to guide the shallow layers toward more accurate pixel-level classification.
Implementation details:
- Apply a $3\times 3$ convolution to the low-level feature map to reduce its number of channels;
- Use global average pooling to extract the global information of the high-level features, followed by a $1\times 1$ convolution with a BN layer and a ReLU activation, and use the result to weight the low-level features;
- Upsample the high-level output and add it pixel by pixel to the weighted low-level feature map.
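The three steps above can be sketched as a small PyTorch module. This is a minimal sketch, not the code from the repository referenced in this post: the class name `GAUSketch`, the choice to reduce the low-level map to the same channel count as the high-level map (so the two can be added), and bilinear upsampling are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GAUSketch(nn.Module):
    """Simplified Global Attention Upsample block (illustrative only)."""

    def __init__(self, low_channels, high_channels):
        super().__init__()
        # Step 1: 3x3 convolution that changes the low-level channel count
        # to match the high-level features (assumed target width).
        self.low_conv = nn.Conv2d(low_channels, high_channels, 3, padding=1, bias=False)
        # Step 2: global average pooling, then 1x1 convolution + BN + ReLU.
        # Note: in training mode BatchNorm on a 1x1 map needs batch size > 1.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(high_channels, high_channels, 1, bias=False),
            nn.BatchNorm2d(high_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, low, high):
        low = self.low_conv(low)        # (N, C_high, H_low, W_low)
        weights = self.gate(high)       # (N, C_high, 1, 1), one weight per channel
        weighted_low = low * weights    # global context weights the low-level map
        # Step 3: upsample the high-level map (bilinear is an assumption)
        # and add it pixel by pixel to the weighted low-level map.
        high_up = F.interpolate(high, size=low.shape[-2:], mode="bilinear",
                                align_corners=False)
        return high_up + weighted_low


# Usage: the output has the high-level channel count at the low-level resolution.
gau = GAUSketch(low_channels=4, high_channels=8).eval()
low = torch.randn(2, 4, 16, 16)
high = torch.randn(2, 8, 8, 8)
with torch.no_grad():
    out = gau(low, high)
print(out.shape)  # torch.Size([2, 8, 16, 16])
```

In a decoder, one such block would sit at each stage, taking the previous stage's output as `high` and the matching encoder feature map as `low`.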
3. PyTorch code implementation
Reference: https://github.com/JaveyWang/Pyramid-Attention-Networks-pytorch