DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution
2022-07-23 16:48:00 【TJMtaotao】
Abstract
Many modern object detectors achieve outstanding performance by using the mechanism of looking and thinking twice. This paper explores this mechanism in the backbone design for object detection. At the macro level, we propose the Recursive Feature Pyramid, which incorporates extra feedback connections from the Feature Pyramid Network into the bottom-up backbone layers. At the micro level, we propose Switchable Atrous Convolution, which convolves the features with different atrous rates and gathers the results using switch functions. Combining them yields DetectoRS, which significantly improves object detection performance. On COCO test-dev, DetectoRS achieves a state-of-the-art 54.7% box AP for object detection, 47.1% mask AP for instance segmentation, and 49.6% PQ for panoptic segmentation. Code: https://github.com/joe-siyuan-qiao/DetectoRS
1. Introduction
To detect objects, human visual perception selectively enhances and suppresses neuron activation by passing high-level semantic information through feedback connections [2,19,20]. Inspired by the human visual system, the mechanism of looking and thinking twice has been instantiated in computer vision and has demonstrated outstanding performance [5,6,58]. Many popular two-stage object detectors, such as Faster R-CNN [58], first output object proposals and then extract regional features based on those proposals to detect objects. Along the same direction, Cascade R-CNN [5] develops a multi-stage detector in which subsequent detector heads are trained with more selective examples. The success of this design philosophy motivates us to explore it in the neural network backbone design for object detection. In particular, we deploy the mechanism at both the macro and micro levels, resulting in our proposed DetectoRS, which significantly improves the performance of the state-of-the-art object detector HTC [7] while maintaining a similar inference speed, as shown in Table 1.

At the macro level, we propose the Recursive Feature Pyramid (RFP), which builds on the Feature Pyramid Network (FPN) [44] by incorporating extra feedback connections from the FPN layers into the bottom-up backbone layers, as shown in Fig. 1a. Unrolling the recursive structure into a sequential implementation, we obtain a backbone for object detection that looks at the image twice or more. Similar to the cascaded detector heads in Cascade R-CNN, our RFP recursively enhances FPN to generate increasingly powerful representations. Resembling Deeply-Supervised Nets [36], the feedback connections bring features that receive gradients directly from the detector heads back to the low levels of the bottom-up backbone, which speeds up training and boosts performance. Our proposed RFP implements a sequential design of looking and thinking twice: the bottom-up backbone and FPN are run multiple times, and their output features depend on those of the previous steps.
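The unrolled structure described above can be sketched as a plain loop. This is only an illustration with scalar stand-ins for feature maps: `rfp_unrolled`, `backbone_fpn`, `transform`, and the toy functions are hypothetical names, and the way the toy network mixes in the feedback is an assumption made for the example, not the actual architecture.

```python
def rfp_unrolled(image, backbone_fpn, transform, num_steps=2):
    """Run the backbone + FPN `num_steps` times, feeding features back.

    backbone_fpn(image, feedback) returns the features of one pass;
    transform maps those features to the feedback for the next pass.
    """
    feedback = 0  # no feedback is available in the first pass
    features = None
    for _ in range(num_steps):
        features = backbone_fpn(image, feedback)
        feedback = transform(features)
    return features


# Toy scalar stand-ins (assumptions): the "network" adds half of the
# feedback to the image, and the feedback transform is the identity.
toy_net = lambda image, feedback: image + 0.5 * feedback
identity = lambda f: f

once = rfp_unrolled(4.0, toy_net, identity, num_steps=1)
twice = rfp_unrolled(4.0, toy_net, identity, num_steps=2)
```

With one step the loop reduces to an ordinary single forward pass; with two steps the second pass sees the features produced by the first.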
At the micro level, we propose Switchable Atrous Convolution (SAC), which convolves the same input feature with different atrous rates [11,30,53] and gathers the results using switch functions. Fig. 1b illustrates the concept of SAC. The switch functions are spatially dependent, i.e., each location of the feature map may have different switches to control the outputs of SAC. To use SAC in the detector, we convert all the standard 3x3 convolutional layers in the bottom-up backbone to SAC, which improves the detector performance by a large margin. Some previous methods adopt conditional convolution, e.g., [39,74], which also combines the results of different convolutions into a single output. Unlike those methods, whose architectures must be trained from scratch, SAC provides a mechanism to easily convert pretrained standard convolutional networks (e.g., ImageNet-pretrained [59] checkpoints). In addition, SAC uses a new weight locking mechanism in which the weights of the different atrous convolutions are the same except for a trainable difference.
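A minimal 1-D sketch of this idea follows. It assumes the blended form y = S(x) * Conv(x, w, 1) + (1 - S(x)) * Conv(x, w + dw, r), uses a hand-supplied switch map instead of a learned one, and works on a list of numbers rather than a 2-D feature map. The helper names `atrous_conv1d` and `switchable_atrous_conv1d` and the default rate of 3 are illustrative choices only.

```python
def atrous_conv1d(x, w, rate):
    """Same-size 1-D cross-correlation with zero padding and dilation `rate`."""
    half = len(w) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, wk in enumerate(w):
            j = i + (k - half) * rate  # taps are spaced `rate` samples apart
            if 0 <= j < len(x):
                acc += wk * x[j]
        out.append(acc)
    return out


def switchable_atrous_conv1d(x, w, delta_w, switch, rate=3):
    """Blend a rate-1 branch and a rate-`rate` branch at every position.

    switch[i] in [0, 1] is the share given to the rate-1 branch at position i.
    The second branch reuses w plus a difference delta_w (weight locking).
    """
    w_locked = [a + b for a, b in zip(w, delta_w)]
    small = atrous_conv1d(x, w, 1)
    large = atrous_conv1d(x, w_locked, rate)
    return [s * a + (1.0 - s) * b for s, a, b in zip(switch, small, large)]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
w = [1.0, 0.0, -1.0]
delta_w = [0.0, 0.0, 0.0]  # zero difference: both branches share the weights

all_small = switchable_atrous_conv1d(x, w, delta_w, [1.0] * 7)
all_large = switchable_atrous_conv1d(x, w, delta_w, [0.0] * 7)
```

Setting the switch to 1 everywhere recovers the ordinary convolution, which is why a pretrained standard layer can serve as the starting point; intermediate switch values mix the two receptive fields position by position.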
Combining the proposed RFP and SAC results in our DetectoRS. To demonstrate its effectiveness, we incorporate DetectoRS into the state-of-the-art HTC [7] on the challenging COCO dataset [47]. On COCO test-dev, we report box AP for object detection [22], mask AP for instance segmentation [26], and PQ for panoptic segmentation [34]. DetectoRS with ResNet-50 [28] as the backbone improves HTC [7] by 7.7% box AP and 5.9% mask AP. Furthermore, equipping DetectoRS with ResNeXt-101-32x4d [71] achieves a state-of-the-art 54.7% box AP and 47.1% mask AP. Together with the stuff prediction from DeepLabv3+ [14] with Wide-ResNet-41 [10] as the backbone, DetectoRS sets a new record of 49.6% PQ for panoptic segmentation.

2. Related Works
Object Detection. There are two main categories of object detection methods: one-stage methods, e.g., [45, 50, 56, 60, 80, 81], and multi-stage methods, e.g., [5, 7, 9, 25, 27, 58]. Multi-stage detectors are usually more flexible and more accurate than one-stage detectors, but also more complex. In this paper, we use the multi-stage detector HTC [7] as our baseline and show comparisons with both categories.
Multi-Scale Features. Our Recursive Feature Pyramid is based on the Feature Pyramid Network (FPN) [44], an effective object detection system that exploits multi-scale features. Previously, many object detectors directly used the multi-scale features extracted from the backbone [4,50], whereas FPN incorporates a top-down path to sequentially combine features at different scales. PANet [49] adds another bottom-up path on top of FPN. STDL [82] proposes to exploit cross-scale features with a scale-transfer module. G-FRNet [1] adds feedback with gating units. NAS-FPN [24] and Auto-FPN [73] use neural architecture search [87] to find the optimal FPN structure. EfficientDet [66] proposes to repeat a simple BiFPN layer. Unlike them, our proposed Recursive Feature Pyramid goes through the bottom-up backbone repeatedly to enrich the representation power of FPN. In addition, we incorporate Atrous Spatial Pyramid Pooling (ASPP) [13,14] into FPN to enrich features, similar to the mini-DeepLab design in Seamless [55].
Recursive Convolutional Network. Many recursive methods have been proposed to address different types of computer vision problems, e.g., [32,42,65]. Recently, CBNet [51] proposed a recursive method for object detection that cascades multiple backbones and outputs their features as the input of FPN. By contrast, our RFP performs recursive computation with an ASPP-enriched FPN together with effective fusion modules.
Conditional Convolution. Conditional convolutional networks adopt dynamic kernels, widths, or depths, e.g., [16,39,43,48,74,77]. Unlike them, our proposed Switchable Atrous Convolution (SAC) provides an effective mechanism for converting standard convolutions to conditional convolutions without changing any pretrained models. SAC is therefore a plug-and-play module for many pretrained backbones. Moreover, SAC uses global context information and a novel weight locking mechanism to make it more effective.
3. Recursive Feature Pyramid
3.1 Feature Pyramid Networks


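Let B_i denote the i-th stage of the bottom-up backbone and F_i the i-th top-down FPN operation. For i = 1, ..., S, the output feature f_i is defined by f_i = F_i(f_{i+1}, x_i) with x_i = B_i(x_{i-1}), where x_0 is the input image and f_{S+1} = 0. Object detectors built on FPN use f_i for the detection computations.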
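The top-down recursion behind f_i can be sketched with scalar stand-ins for feature maps. This assumes the usual FPN form, x_i = B_i(x_{i-1}) followed by f_i = F_i(f_{i+1}, x_i) starting from f_{S+1} = 0; the function `fpn_forward` and the toy stage functions are made-up placeholders, not the real ResNet stages or FPN operations.

```python
def fpn_forward(x0, bottom_up, top_down):
    """Run S bottom-up stages, then S top-down FPN operations.

    bottom_up[i-1] plays the role of B_i and top_down[i-1] of F_i.
    Returns [f_1, ..., f_S].
    """
    # Bottom-up pass: x_i = B_i(x_{i-1}).
    xs = []
    x = x0
    for stage in bottom_up:
        x = stage(x)
        xs.append(x)

    # Top-down pass: f_i = F_i(f_{i+1}, x_i), starting from f_{S+1} = 0.
    fs = [None] * len(xs)
    f_above = 0
    for i in reversed(range(len(xs))):
        f_above = top_down[i](f_above, xs[i])
        fs[i] = f_above
    return fs


# Toy placeholders (assumptions): each backbone stage doubles its input,
# and each FPN operation adds the lateral feature to the one from above.
stages = [lambda x: 2 * x] * 3
fpn_ops = [lambda f_up, x: f_up + x] * 3
features = fpn_forward(1, stages, fpn_ops)
```

The bottom-up features here are 2, 4, 8, and each f_i accumulates everything above it, which mirrors how the top-down path combines scales sequentially.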
3.2 Recursive Feature Pyramid

We modify the ResNet [28] backbone B so that it accepts both x and the feedback feature R(f) as its input. ResNet has four stages, each composed of several similar blocks. We only change the first block of each stage, as shown in Fig. 3. This block computes a 3-layer feature and adds it to a feature computed by a shortcut. To use the feature R(f), we add another convolutional layer with its kernel size set to 1. The weight of this layer is initialized to 0, so that it has no real effect when the weights are loaded from a pretrained checkpoint.
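The effect of the zero initialization can be checked with a small sketch. It assumes that the feedback R(f) enters the block as an extra additive term after the 1x1 convolution, and it models a 1x1 convolution at a single spatial position as a matrix-vector product. `conv1x1`, `block_output`, and all numbers are illustrative only.

```python
def conv1x1(weight, feat):
    """1x1 convolution at one spatial position: a matrix-vector product."""
    return [sum(w * v for w, v in zip(row, feat)) for row in weight]


def block_output(residual, shortcut, feedback, feedback_weight):
    """Sum of the 3-layer branch, the shortcut, and the transformed feedback."""
    fb = conv1x1(feedback_weight, feedback)
    return [r + s + f for r, s, f in zip(residual, shortcut, fb)]


residual = [0.5, -1.0]  # output of the block's three conv layers (made up)
shortcut = [1.0, 2.0]   # shortcut branch (made up)
feedback = [3.0, 4.0]   # R(f) from the previous FPN pass (made up)

# What the unmodified, pretrained block would output.
pretrained = [r + s for r, s in zip(residual, shortcut)]

# Zero-initialized feedback layer: the added branch contributes nothing.
zero_weight = [[0.0, 0.0], [0.0, 0.0]]
at_init = block_output(residual, shortcut, feedback, zero_weight)

# Hypothetical weights after some training: the feedback now matters.
trained_weight = [[0.1, 0.0], [0.0, 0.5]]
later = block_output(residual, shortcut, feedback, trained_weight)
```

At initialization the modified block reproduces the pretrained block exactly, so loading a checkpoint is safe; the feedback path only starts to influence the output once training moves the new weights away from zero.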
3.3. ASPP as the Connecting Module

We do not place a convolutional layer after the concatenated features, because here R does not generate the final output used in dense prediction tasks. Note that each of the four branches yields a feature whose channel count is 1/4 that of the input feature, so concatenating them produces a feature of the same size as the input feature of R. In Sec. 5, we show the performance of RFP with and without the ASPP module.
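The channel bookkeeping in this paragraph is simple enough to verify directly. The helper below is a hypothetical illustration, and the 256-channel input is an assumed example value rather than a figure stated in the text.

```python
def aspp_channels(in_channels, num_branches=4):
    """Return (channels per branch, channels after concatenation)."""
    if in_channels % num_branches != 0:
        raise ValueError("in_channels must be divisible by num_branches")
    per_branch = in_channels // num_branches
    return per_branch, per_branch * num_branches


# Assumed example: a 256-channel FPN feature fed into the connecting module.
per_branch, total = aspp_channels(256)
```

Because the concatenated output has as many channels as the input, no extra projection layer is needed to restore the size.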
3.4 Output Update by the Fusion Module

4. Switchable Atrous Convolution

