In-depth understanding of the attention mechanism in CV: PAN
2022-07-03 18:37:00 【Strawberry sauce toast】
Summary of attention mechanisms in CV (3): PAN
PAN: Pyramid Attention Network
Paper link: 《Pyramid Attention Network for Semantic Segmentation》
1. Abstract
PAN network structure:
Pyramid Attention Network (PAN) combines the attention mechanism with a spatial pyramid to extract dense features more precisely and improve the accuracy of semantic segmentation. The problems this method addresses (its motivation) and its main contributions are:
1.1 Proposes the Feature Pyramid Attention (FPA) module
- Motivation
The existence of objects at multiple scales cause difficulty in classification of categories.
Objects of the same category can appear at very different sizes, which makes classification harder. Existing algorithms use dilated convolution (ASPP) or pyramid pooling (PSPNet) to improve classification accuracy, but they suffer from the gridding effect and from the loss of pixel-level localization information, respectively.
- FPA module
FPA module performs spatial pyramid attention structure on high-level output and combine global pooling to learn a better feature representation. (It combines a spatial pyramid with global pooling and is applied to the output of the deep layers of the network.)
1.2 Proposes the Global Attention Upsample (GAU) module
- Motivation
high-level features are skilled in making category classification, while weak in restructuring original resolution binary prediction.
Deep layers extract more accurate category information but lose location information, which hurts the prediction of where objects actually are.
- GAU module
GAU module acts on each decoder layer to provide global context as a guidance of low-level features to select category localization details. (Deep layers have a larger receptive field and carry more accurate category information, while shallow layers carry more accurate location information; combining the two improves the segmentation result.)
2. Module details
2.1 Feature Pyramid Attention

① Extract local features (pixel-wise)
The FPA module applies convolution kernels of three sizes, $3\times 3$, $5\times 5$ and $7\times 7$, to further extract the features contained in the feature map, and then fuses the feature maps of the three scales step by step through addition;
② Extract global features (channel-wise)
A global pooling branch is added to extract channel-wise features. (The idea is similar to the SE module; the difference is that only one convolution layer is used here, whereas SE uses two layers.)
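The two steps above can be sketched in PyTorch roughly as follows. This is only a minimal illustration, not the paper's or the reference repository's implementation: the class name `FPASketch`, the helper `conv_bn_relu`, the channel reduction to `channels // 4`, the stride-2 downsampling between pyramid levels, the bilinear upsampling, and the way the pyramid output is multiplied onto a $1\times 1$ convolution of the input are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_bn_relu(in_ch, out_ch, kernel_size, stride=1):
    """Hypothetical helper: Conv2d + BatchNorm + ReLU with kernel_size // 2 padding."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size, stride=stride,
                  padding=kernel_size // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class FPASketch(nn.Module):
    """Simplified Feature Pyramid Attention block (illustrative only)."""

    def __init__(self, channels):
        super().__init__()
        mid = max(channels // 4, 1)  # assumed channel reduction inside the pyramid
        self.main = conv_bn_relu(channels, channels, 1)
        # Global (channel-wise) branch: a single 1x1 convolution after global pooling.
        self.global_conv = nn.Conv2d(channels, channels, 1)
        # Pyramid (pixel-wise) branch: 7x7 -> 5x5 -> 3x3, each halving the resolution.
        self.down7 = conv_bn_relu(channels, mid, 7, stride=2)
        self.down5 = conv_bn_relu(mid, mid, 5, stride=2)
        self.down3 = conv_bn_relu(mid, mid, 3, stride=2)
        self.refine7 = conv_bn_relu(mid, channels, 7)
        self.refine5 = conv_bn_relu(mid, channels, 5)
        self.refine3 = conv_bn_relu(mid, channels, 3)

    @staticmethod
    def _up(x, size):
        return F.interpolate(x, size=size, mode="bilinear", align_corners=False)

    def forward(self, x):
        main = self.main(x)
        global_ctx = self.global_conv(F.adaptive_avg_pool2d(x, 1))  # (N, C, 1, 1)

        d7 = self.down7(x)
        d5 = self.down5(d7)
        d3 = self.down3(d5)

        # Fuse the three scales step by step, from coarse to fine, by addition.
        p3 = self.refine3(d3)
        p5 = self.refine5(d5) + self._up(p3, d5.shape[2:])
        p7 = self.refine7(d7) + self._up(p5, d7.shape[2:])
        attention = self._up(p7, x.shape[2:])

        # Pixel-wise attention on the main branch, plus broadcast global context.
        return main * attention + global_ctx


if __name__ == "__main__":
    fpa = FPASketch(channels=64).eval()
    with torch.no_grad():
        out = fpa(torch.randn(2, 64, 32, 32))
    print(out.shape)
```

The output keeps the input's shape, so under these assumptions the block can be dropped onto the last encoder stage without changing the rest of the network.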
2.2 Global Attention Upsample
Main idea:
The main character of decoder module is to repair category pixel localization. Furthermore, high-level features with abundant category information can be used to weight low-level information to select precise resolution details.
The purpose of "decoding" is to recover which pixels belong to each category, that is, to achieve pixel-level classification. The output of the deep layers contains richer category information, while the output of the shallow layers contains more precise location information. We can therefore use the category features extracted by the deep layers to guide the shallow layers toward more accurate pixel-level classification.
Implementation details:
- Use a $3\times 3$ convolution to adjust the number of channels of the low-level feature map;
- Use global average pooling to extract the global information of the high-level features, followed by a $1\times 1$ convolution with a BN layer and a ReLU activation; the result is used to weight the low-level features;
- The upsampled high-level output and the weighted low-level feature map are added pixel by pixel.
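The three steps listed above can be sketched in PyTorch as follows. This is a hedged illustration rather than the reference implementation: the class name `GAUSketch`, the choice to project the low-level features to the same channel count as the high-level features (so the final addition is well defined), and the use of bilinear interpolation for upsampling are assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GAUSketch(nn.Module):
    """Simplified Global Attention Upsample block (illustrative only)."""

    def __init__(self, low_channels, high_channels):
        super().__init__()
        # Step 1: 3x3 convolution that changes the low-level channel count.
        # Assumption: it outputs `high_channels` so the two paths can be added.
        self.low_conv = nn.Conv2d(low_channels, high_channels, 3,
                                  padding=1, bias=False)
        # Step 2: 1x1 convolution + BN + ReLU on the globally pooled high-level features.
        self.gate = nn.Sequential(
            nn.Conv2d(high_channels, high_channels, 1, bias=False),
            nn.BatchNorm2d(high_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, low, high):
        low = self.low_conv(low)                               # (N, C, H, W)
        weights = self.gate(F.adaptive_avg_pool2d(high, 1))    # (N, C, 1, 1)
        weighted_low = low * weights                           # channel-wise weighting
        # Step 3: upsample the high-level features and add them pixel by pixel.
        high_up = F.interpolate(high, size=low.shape[2:],
                                mode="bilinear", align_corners=False)
        return high_up + weighted_low


if __name__ == "__main__":
    gau = GAUSketch(low_channels=16, high_channels=32).eval()
    with torch.no_grad():
        out = gau(torch.randn(2, 16, 16, 16), torch.randn(2, 32, 8, 8))
    print(out.shape)
```

Because BatchNorm is applied to a $1\times 1$ spatial map here, training with a batch size of 1 would fail in PyTorch; the usage above runs the module in `eval()` mode with a batch of two.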
3. PyTorch code implementation
Reference: https://github.com/JaveyWang/Pyramid-Attention-Networks-pytorch
