PAN: an in-depth understanding of the attention mechanism in CV
2022-07-03 18:37:00 【Strawberry sauce toast】
Summary of attention mechanisms in CV (3): PAN
PAN: Pyramid Attention Network
Paper link: "Pyramid Attention Network for Semantic Segmentation"
1. Abstract
PAN network structure:
Pyramid Attention Network (PAN) combines the attention mechanism with a spatial pyramid to extract dense features more precisely and improve the accuracy of semantic segmentation. The problems this method addresses (its motivation) and its main contributions are as follows:
1.1 The Feature Pyramid Attention (FPA) module
- Motivation
The existence of objects at multiple scales cause difficulty in classification of categories.
Objects of the same category can appear at very different sizes, which makes classification harder. Existing methods use dilated convolution (ASPP) or pyramid pooling (PSPNet) to improve classification accuracy, but they suffer from gridding artifacts and from loss of pixel-level localization information, respectively.
- FPA module
FPA module performs spatial pyramid attention structure on high-level output and combine global pooling to learn a better feature representation. (It combines a spatial pyramid with global pooling and is applied to the output of the deep layers of the network.)
1.2 The Global Attention Upsample (GAU) module
- Motivation
High-level features are skilled in making category classification, while weak in restructuring original resolution binary prediction.
Deep layers extract more accurate category information but lose location information, which hurts the prediction of where objects are.
- GAU module
GAU module acts on each decoder layer to provide global context as a guidance of low-level features to select category localization details. (Deep layers have a larger receptive field and carry more accurate category information, while shallow layers carry more accurate location information; combining the features of both improves segmentation.)
2. Module details
2.1 Feature Pyramid Attention

① Extract local features (pixel-wise)
The FPA module applies convolution kernels of three sizes, $3\times 3$, $5\times 5$, and $7\times 7$, to further extract the features contained in the feature map, and then fuses the feature maps of the three scales by adding them step by step;
② Extract global features (channel-wise)
Add a global pooling branch to extract channel-wise features. (The idea is similar to the SE module; the difference is that only one convolution layer is used here, whereas SE uses two layers.)
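The two steps above can be sketched in PyTorch roughly as follows. This is a simplified illustration, not the code from the paper or the repository referenced in this post. The class name `FPASketch` and the helper `conv_bn_relu` are hypothetical. The sketch also makes several assumptions: stride-2 convolutions for downsampling, bilinear upsampling, a single convolution per pyramid level, and multiplying the pyramid output with a $1\times 1$ convolution of the input (my reading of the paper's figure).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_bn_relu(in_ch, out_ch, kernel_size, stride=1):
    """Hypothetical helper: convolution + BatchNorm + ReLU with 'same'-style padding."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size, stride=stride,
                  padding=kernel_size // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
    )


class FPASketch(nn.Module):
    """Simplified Feature Pyramid Attention (illustrative sketch only)."""

    def __init__(self, channels):
        super().__init__()
        # Pyramid branch: each stage halves the resolution (assumed stride-2 convs).
        self.down7 = conv_bn_relu(channels, channels, 7, stride=2)
        self.down5 = conv_bn_relu(channels, channels, 5, stride=2)
        self.down3 = conv_bn_relu(channels, channels, 3, stride=2)
        # Middle branch: 1x1 convolution on the input, later weighted by the pyramid.
        self.mid = conv_bn_relu(channels, channels, 1)
        # Global branch: a single 1x1 convolution after global average pooling.
        self.glob = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        h, w = x.shape[2:]
        # (1) Local, pixel-wise features: 7x7 -> 5x5 -> 3x3 at decreasing resolution.
        p7 = self.down7(x)
        p5 = self.down5(p7)
        p3 = self.down3(p5)
        # Fuse step by step from coarse to fine: upsample, then add.
        p5 = p5 + F.interpolate(p3, size=p5.shape[2:], mode="bilinear", align_corners=False)
        p7 = p7 + F.interpolate(p5, size=p7.shape[2:], mode="bilinear", align_corners=False)
        attn = F.interpolate(p7, size=(h, w), mode="bilinear", align_corners=False)
        out = self.mid(x) * attn
        # (2) Global, channel-wise features: one value per channel, broadcast over H x W.
        g = self.glob(F.adaptive_avg_pool2d(x, 1))
        return out + g


if __name__ == "__main__":
    fpa = FPASketch(channels=8).eval()
    x = torch.randn(2, 8, 32, 32)
    with torch.no_grad():
        y = fpa(x)
    print(y.shape)  # same shape as the input
```

The output keeps the input's shape, so a module like this can be placed directly on top of the backbone's deepest feature map.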
2.2 Global Attention Upsample
Main idea:
The main character of decoder module is to repair category pixel localization. Furthermore, high-level features with abundant category information can be used to weight low-level information to select precise resolution details.
The purpose of decoding is to recover which pixels belong to each category, that is, to achieve pixel-level classification. The output of the deep layers contains richer category information, while the output of the shallow layers contains more precise location information. We can therefore use the category features extracted by the deep layers to guide the shallow layers toward more accurate pixel-level classification.
Implementation details:
- Use a $3\times 3$ convolution to reduce the number of channels of the low-level feature map;
- Use global average pooling to extract the global information of the high-level features, followed by a $1\times 1$ convolution with a BN layer and a ReLU activation; the result is used to weight the low-level features;
- The upsampled high-level output is added pixel by pixel to the weighted low-level feature map.
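The three steps listed above can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the reference implementation. The class name `GAUSketch` is hypothetical. Bilinear interpolation is assumed for upsampling, and the extra $1\times 1$ projection `high_proj` is an assumption added here so that the high-level map has the same number of channels as the weighted low-level map before the addition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GAUSketch(nn.Module):
    """Simplified Global Attention Upsample (illustrative sketch only)."""

    def __init__(self, low_ch, high_ch, out_ch):
        super().__init__()
        # Step 1: 3x3 convolution reduces the channels of the low-level features.
        self.low_conv = nn.Sequential(
            nn.Conv2d(low_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # Step 2: 1x1 conv + BN + ReLU applied to the globally pooled high-level features.
        # Note: in training mode, BatchNorm on a 1x1 map needs a batch size larger than 1.
        self.high_gate = nn.Sequential(
            nn.Conv2d(high_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
        )
        # Assumption: project high-level features to out_ch so they can be added.
        self.high_proj = nn.Conv2d(high_ch, out_ch, 1, bias=False)

    def forward(self, low, high):
        low = self.low_conv(low)
        # Global context: one weight per channel, shape (B, out_ch, 1, 1).
        gate = self.high_gate(F.adaptive_avg_pool2d(high, 1))
        weighted_low = low * gate
        # Step 3: upsample the high-level features and add them pixel by pixel.
        high_up = F.interpolate(self.high_proj(high), size=low.shape[2:],
                                mode="bilinear", align_corners=False)
        return high_up + weighted_low


if __name__ == "__main__":
    gau = GAUSketch(low_ch=16, high_ch=32, out_ch=8).eval()
    low = torch.randn(2, 16, 64, 64)   # shallow layer: higher resolution
    high = torch.randn(2, 32, 32, 32)  # deep layer: lower resolution
    with torch.no_grad():
        out = gau(low, high)
    print(out.shape)
```

The output has the spatial size of the low-level input, so it can serve as the high-level input of the next decoder stage.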
3. PyTorch code implementation
Reference: https://github.com/JaveyWang/Pyramid-Attention-Networks-pytorch