Paper reproduction - AC-FPN: Attention-guided Context Feature Pyramid Network for Object Detection
2022-06-29 12:29:00 【RooKiChen】
The paper's code has been open-sourced, but it depends on the Detectron library, whereas most papers now use mmdetection (to install mmdetection, see my previous article: Ubuntu install mmdetection). If you find configuring Detectron troublesome, you can refer to my reproduction code.
AC-FPN original paper: https://arxiv.org/pdf/2005.11475.pdf
AC-FPN official code: https://github.com/Caojunxu/AC-FPN
Note that the official source code does not implement CxAM and CnAM! The reason is that the authors ran many experiments and found that the CEM module on its own already works well.
Some implementation details in the paper are not clear, so I reproduced them according to my own understanding. If you have a different approach, feel free to discuss it in the comments.
1. Overall structure of AC-FPN

AC-FPN addresses the conflict between receptive field and feature map resolution in high-resolution images. Intuitively, high-resolution images need a larger receptive field, but a large receptive field detects small objects poorly and tends to misclassify them as background. To resolve this conflict, AC-FPN proposes two modules: the Context Extraction Module (CEM) and the Attention-guided Module (AM). AM in turn has two sub-modules: the Context Attention Module (CxAM) and the Content Attention Module (CnAM). The main novelty is extracting features with dilated (atrous) convolutions at different rates; the CxAM and CnAM sub-modules of AM follow an idea similar to self-attention.
2. CEM: Context Extraction Module
In the original paper, the authors describe CEM as applying dilated convolutions with five different rates to the feature map F5 and connecting them densely (see the original paper for the dense connections). The figure also shows arrows linking each feature map to the others. A deformable convolution is then applied to each feature map, and finally the feature maps are concatenated and passed through a 1x1 convolution to reduce the number of channels. However, the source code contains no deformable convolution, and the paper only mentions it in passing, so I did not add deformable convolution in my implementation. The structure diagram also shows an upsampling operation below the dilated convolutions, which the source code does not implement either. My guess is that this branch compresses the features into a 1x1xC vector whose role is similar to spatial attention; perhaps it did not help in the authors' experiments, so the structure was dropped from the source code.
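To make the dense dilated structure described above more concrete, here is a small bookkeeping sketch in plain Python. It only computes the area covered by each dilated 3x3 kernel, the padding that keeps the spatial size unchanged, and the input width of each branch when every branch receives F5 plus all earlier branch outputs. The five rates (3, 6, 12, 18, 24), the channel widths (2048 and 256), and all function names are assumptions made for illustration; check the official code for the actual values.

```python
# Hypothetical bookkeeping for a densely connected stack of dilated 3x3 convs.
# Rates and channel widths are assumed values, not taken from this post.

def effective_kernel(kernel_size, dilation):
    """Side length of the area covered by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def same_padding(kernel_size, dilation):
    """Padding that keeps the spatial size unchanged at stride 1."""
    return dilation * (kernel_size - 1) // 2

def dense_branch_inputs(in_channels, branch_channels, num_branches):
    """Input width of each branch when branch i sees F5 plus all earlier outputs."""
    return [in_channels + i * branch_channels for i in range(num_branches)]

RATES = (3, 6, 12, 18, 24)  # assumed dilation rates
widths = dense_branch_inputs(2048, 256, len(RATES))
for rate, width in zip(RATES, widths):
    # rate, covered area side, padding, input channels of this branch
    print(rate, effective_kernel(3, rate), same_padding(3, rate), width)
```

The growing input width shows why dense connections become expensive: each later branch has to process everything produced before it, which is one reason the concatenated result is reduced by a 1x1 convolution at the end.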
Personally, I think CEM works so well because a GroupNorm operation is added after each dilated convolution, which the original paper does not mention. Unlike BatchNorm, GroupNorm does not need a large batch_size (when training on the COCO dataset, batch_size is usually 2).
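The reason GroupNorm does not care about batch size is that its statistics come from groups of channels inside a single sample. The following is a minimal plain-Python sketch of that computation, not the layer used in the repository: the function name and the toy data are hypothetical, and the learnable scale and shift are left out.

```python
import math

def group_norm(sample, num_groups, eps=1e-5):
    """Normalize one sample given as a list of channels (each a flat list).

    Mean and variance are computed per group of channels within this single
    sample, so the result does not depend on the rest of the batch.
    Learnable scale and shift are omitted for brevity.
    """
    num_channels = len(sample)
    assert num_channels % num_groups == 0, "channels must split evenly into groups"
    per_group = num_channels // num_groups
    out = []
    for g in range(num_groups):
        channels = sample[g * per_group:(g + 1) * per_group]
        values = [v for ch in channels for v in ch]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        out.extend([[(v - mean) * scale for v in ch] for ch in channels])
    return out

# Toy sample: 4 channels with 2 spatial positions each, split into 2 groups.
sample = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
normalized = group_norm(sample, num_groups=2)
```

BatchNorm, by contrast, averages over the batch dimension, so with only a couple of images per GPU its estimates of mean and variance become noisy.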
3. AM: Attention-guided Module
3.1 CxAM

This module is an ordinary self-attention, except that an average pooling operation is added after the feature map R. F is the output feature of CEM; it contains multi-scale receptive field information and is fed into the CxAM module. Based on this information, CxAM adaptively attends to the relationships between related sub-regions. As a result, the output features of CxAM have clear semantics and capture contextual dependencies among surrounding objects.
3.2 CnAM

CnAM has the same structure as CxAM. The original paper explains that because CEM uses deformable convolution, the geometric properties of the given image are severely disrupted, which causes position shifts. The authors therefore designed a new attention module, the Content Attention Module (CnAM), to maintain accurate location information for each object.
The difference is that CnAM takes the F5 feature map as an input to compensate for the damaged localization information.
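The two attention modules can be illustrated with one toy function. This is a rough plain-Python sketch under several assumptions, not the paper's exact formulation: the learned 1x1 projections for query, key and value are replaced by the identity, the relation matrix R is passed through a sigmoid and then averaged per position (the order and the pooled axis are my guess), and in the CnAM case F5 is assumed to supply the query/key side while F supplies the values. The name attention_gate and the toy inputs are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_gate(query_src, value_src):
    """Scale each position of value_src by a weight derived from query_src.

    Both inputs are lists of N positions (flattened H*W), each a list of
    channel values. Learned projections are replaced by the identity here,
    which is a simplification for illustration only.
    """
    n = len(query_src)
    # Relation matrix R: dot product between every pair of positions.
    relation = [[sum(a * b for a, b in zip(query_src[i], query_src[j]))
                 for j in range(n)] for i in range(n)]
    # Sigmoid followed by average pooling gives one weight per position.
    weights = [sum(sigmoid(r) for r in row) / n for row in relation]
    return [[w * v for v in pos] for w, pos in zip(weights, value_src)]

# Toy features: 2 positions, 2 channels.
F = [[1.0, 0.0], [0.0, 1.0]]    # stands in for the CEM output
F5 = [[2.0, 1.0], [1.0, 2.0]]   # stands in for the backbone feature map F5

cxam_out = attention_gate(F, F)    # CxAM-style: attention computed from F itself
cnam_out = attention_gate(F5, F)   # CnAM-style: attention computed from F5
```

Because every weight lies between 0 and 1, both calls only rescale F position by position; what changes between them is where the weights come from.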
4. Training strategy
Here is my own training strategy: the backbone network is ResNet50, trained on the COCO dataset for 12 epochs on 8 GPUs with 40 GB of memory each, with 2 images per GPU. The initial learning rate is 0.02 and is multiplied by 0.1 at epoch 8 and epoch 11. The results are not bad.
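The schedule above can be written down as a small step function. This is only a sketch in plain Python: the function name is hypothetical, epochs are counted from zero, and I assume that a milestone of 8 means the first 8 epochs run at the full learning rate before the drop.

```python
def lr_at_epoch(epoch, base_lr=0.02, milestones=(8, 11), gamma=0.1):
    """Learning rate for a zero-based epoch index under a step schedule."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops

NUM_GPUS = 8
IMAGES_PER_GPU = 2
TOTAL_BATCH_SIZE = NUM_GPUS * IMAGES_PER_GPU  # images processed per iteration

# One learning rate per epoch for the 12-epoch run.
schedule = [lr_at_epoch(epoch) for epoch in range(12)]
```

If you train with fewer GPUs or fewer images per GPU, a common rule of thumb is to scale the initial learning rate in proportion to the total batch size.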
5. Reproduction code
The code is available on GitHub: https://github.com/RooKichenn/AC-FPN