Introduction to anchor-free detection
2022-06-26 00:09:00 【Invincible Zhang Dadao】
1. Background
Object detection has moved from the two-stage era to the one-stage era, and from anchor-based to anchor-free methods, becoming steadily more refined. Starting with CornerNet in 2018, anchor-free papers appeared in a burst, marking the beginning of the anchor-free era.
2. Network
2.1 DenseBox
Contributions of the paper:
- It shows that a simple FCN, if designed carefully, can detect objects across different scales and under severe occlusion.
- It proposes a new FCN model, DenseBox, which needs no region proposals and can be trained end to end.
- Multi-task learning with landmark localization further improves DenseBox's accuracy.

Network architecture: with VGGNet as the backbone, input image (m * n * 3) -> conv -> bilinear upsample -> output map (m/4 * n/4 * 5) -> threshold and NMS.
1) First, an end-to-end multi-task fully convolutional model directly regresses the confidence that an object is present and its relative position.
2) To better handle heavily occluded objects and improve recall on small objects, an upsampling layer is introduced into the detection network and fused with features from shallow layers, giving a larger output layer.
3) To screen training samples, keep positive and negative samples balanced, and reduce false detections, it was also among the first to use online hard negative mining, i.e. mining of hard examples.
4) Each output pixel is converted into a confidence score and four distances to the target bounding box (bbox), and NMS is then applied.
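Step 4 can be sketched roughly as follows. This is only an illustration under my own assumptions, not the paper's code: the four distances are taken to be (left, top, right, bottom) in input-image pixels, the stride of 4 comes from the m/4 * n/4 output size, and the function names and thresholds are hypothetical.

```python
def decode_densebox(score_map, dist_map, stride=4, threshold=0.5):
    """Turn per-pixel predictions into (x1, y1, x2, y2, score) boxes.

    score_map[i][j]: confidence at output pixel (row i, column j)
    dist_map[i][j]:  assumed (left, top, right, bottom) distances in input pixels
    """
    boxes = []
    for i, row in enumerate(score_map):
        for j, score in enumerate(row):
            if score < threshold:
                continue  # drop low-confidence pixels before NMS
            cx, cy = j * stride, i * stride  # output pixel mapped back to the input image
            left, top, right, bottom = dist_map[i][j]
            boxes.append((cx - left, cy - top, cx + right, cy + bottom, score))
    return boxes


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much."""
    kept = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= iou_threshold for k in kept):
            kept.append(box)
    return kept
```

Neighbouring pixels inside the same object tend to regress almost the same box, which is why the NMS step is needed after decoding.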
Adding a few layers to the FCN structure enables landmark localization, and fusing the landmark heatmaps with the score map further improves the detection results.
2.2 YOLOV1
YOLOv1 is a representative anchor-free work (YOLOv2 and YOLOv3 both use anchor boxes).
- The input is a 448x448x3 color image; the first 20 convolutional layers of a GoogLeNet-style network produce a 14x14x1024 feature map.
- It then passes through four more convolutional layers and 2 fully connected layers, and the result is reshaped into a 7x7x30 tensor.

After the stacked convolutions, the 7x7x30 output is equivalent to dividing the original image into 7x7=49 grid cells, each described by a 30-dimensional vector. The first 10 values predict two bboxes (bounding-box regression boxes), loosely one wide box and one tall box, with 5 values per box. After the boxes come the class values: 20 classes, giving the probability that the target belongs to each class. The 5 values of each box are:
(1) the x coordinate of the bbox center relative to the grid cell
(2) the y coordinate of the bbox center relative to the grid cell
(3) the width of the bbox
(4) the height of the bbox
(5) the confidence that a target is present: 1 x IoU if a target exists, otherwise 0 x IoU
The predicted x, y, w, h are all normalized to the range 0 to 1.
Each of the 7*7 = 49 cells always outputs one 30-dimensional vector.
Each cell is responsible for a single target: if the centers of two or more targets fall in the same cell, only the class with the highest probability is kept for that cell.
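To make the 30-dimensional layout concrete, here is a small sketch of decoding one cell. The ordering (two boxes of 5 values, then 20 class probabilities) follows the description above; the conversion back to pixels (center relative to the cell, width and height relative to the whole image) is my assumption of the usual YOLOv1 convention, and `decode_cell` is a hypothetical helper name.

```python
S, B, C = 7, 2, 20  # grid size, boxes per cell, number of classes


def decode_cell(vec, row, col, img_w=448, img_h=448):
    """Decode the 30-dim vector of grid cell (row, col) into pixel-space boxes."""
    assert len(vec) == B * 5 + C
    class_probs = vec[B * 5:]
    # One class per cell: keep only the most probable class.
    best_class = max(range(C), key=lambda k: class_probs[k])
    boxes = []
    for b in range(B):
        x, y, w, h, conf = vec[b * 5:(b + 1) * 5]
        cx = (col + x) / S * img_w   # assumed: x, y are offsets inside the cell
        cy = (row + y) / S * img_h
        bw, bh = w * img_w, h * img_h  # assumed: w, h are fractions of the image size
        boxes.append({
            "box": (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2),
            "confidence": conf,
            "class": best_class,
            "score": conf * class_probs[best_class],
        })
    return boxes
```

For example, a box with x = y = 0.5 in the center cell (row 3, col 3) lands at the image center (224, 224).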
- Loss calculation

2.3 CornerNet
- Overall network architecture: the backbone is an hourglass network, followed by two prediction modules.

In brief:
1.1 Hourglass network
The principle is similar to ResNet: the features are first downsampled by convolution and then fused with the later upsampled features to obtain feature maps at different scales, which are then passed on to the pooling stage.
1.2 Corner pooling
Take two feature maps and the same position on each. On the first feature map, take the maximum over the pixels in that column from this position downward (max pooling). On the second feature map, take the maximum over the pixels in that row from this position to the right (max pooling). Add the two maxima to get the output at this position. Doing this for every position gives the complete output of top-left corner pooling. Likewise, bottom-right corner pooling takes the maximum looking upward and the maximum looking left, then adds them.
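The top-left corner pooling described above can be sketched in plain Python. This is a minimal illustration on nested lists, not the CornerNet implementation; `top_left_corner_pool` is a hypothetical name.

```python
def top_left_corner_pool(fmap_a, fmap_b):
    """Top-left corner pooling on two H x W feature maps (lists of lists).

    fmap_a: each position takes the max of its column from that row downward.
    fmap_b: each position takes the max of its row from that column rightward.
    The two results are added position by position.
    """
    h, w = len(fmap_a), len(fmap_a[0])

    # Scan bottom-to-top so each cell already knows the max of everything below it.
    down = [row[:] for row in fmap_a]
    for i in range(h - 2, -1, -1):
        for j in range(w):
            down[i][j] = max(down[i][j], down[i + 1][j])

    # Scan right-to-left so each cell already knows the max of everything to its right.
    right = [row[:] for row in fmap_b]
    for i in range(h):
        for j in range(w - 2, -1, -1):
            right[i][j] = max(right[i][j], right[i][j + 1])

    return [[down[i][j] + right[i][j] for j in range(w)] for i in range(h)]
```

Scanning in reverse keeps the cost linear in the number of pixels instead of recomputing a maximum for every position. Bottom-right pooling is the same idea with both scan directions flipped.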
1.3 Prediction module
The output of each prediction module has three parts: heatmaps, embeddings, and offsets. Their respective jobs are locating the corners, pairing the corners, and correcting the position error.
heatmap: has C channels, where C is the number of target categories. There is no background channel. Each channel is a binary mask indicating the corner positions; finding these points is the ultimate goal.
Embedding: used for corner pairing. You have a pile of top-left corners and another pile of bottom-right corners, so how do you know which belongs with which? The idea is borrowed from pairing joints in human pose estimation: assign each corner an embedding and think of it as an identity card. Each object's identity card has a different color, and corners holding cards of the same color belong to the same family. The top-left corner and bottom-right corner whose embedding values are closest form a pair and are used to draw a box.
offset: the offset. Why compute it? In the authors' experiments the input is 511*511 (if I remember correctly), but the heatmap is 128*128. A point (x, y) on the input maps to ([x*128/511], [y*128/511]) on the heatmap, where [.] means rounding down. Whatever the exact value, the rounding loses precision, so a position found on the heatmap will be inaccurate when mapped back to the input. Hence the offsets (128*128*2: one offset each for the x and y directions).
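A few lines of arithmetic show what the offset recovers. This is a sketch that assumes the 511 and 128 sizes quoted above and rounding down; the function names are made up for illustration.

```python
import math


def to_heatmap(x, y, in_size=511, out_size=128):
    """Map an input-image point to its heatmap cell and the fraction lost by rounding down."""
    fx = x * out_size / in_size
    fy = y * out_size / in_size
    ix, iy = math.floor(fx), math.floor(fy)
    return (ix, iy), (fx - ix, fy - iy)


def to_input(cell, offset, in_size=511, out_size=128):
    """Map a heatmap cell plus its offset back to input-image coordinates."""
    return ((cell[0] + offset[0]) * in_size / out_size,
            (cell[1] + offset[1]) * in_size / out_size)
```

For the point (100, 300), the cell is (25, 75). Mapping the cell back without the offset gives about (99.8, 299.4), while adding the offset recovers (100, 300) up to floating-point error.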
The concrete procedure: first apply non-maximum suppression to the heatmaps, then take the top 100 top-left corners and the top 100 bottom-right corners, and use the offsets to correct their positions. Next compute the L1 distance between the embeddings of each top-left and bottom-right corner. Pairs whose distance is greater than 0.5, or whose corners belong to different categories, are not allowed to become a couple. The pairs that remain are matched, and each matched pair is used to draw a box.
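The pairing rule can be sketched as follows, assuming NMS, top-100 selection, and offset correction have already been done. The corner dictionaries, the scalar embedding, the averaged score, and the geometric check are my own assumptions for illustration; only the 0.5 threshold and the same-category rule come from the text.

```python
def pair_corners(top_lefts, bottom_rights, max_dist=0.5):
    """Pair corners into boxes (x1, y1, x2, y2, score, cls).

    Each corner is a dict with keys x, y, score, cls, emb (emb assumed scalar).
    """
    boxes = []
    for tl in top_lefts:
        for br in bottom_rights:
            if tl["cls"] != br["cls"]:
                continue  # corners of different categories never pair
            if abs(tl["emb"] - br["emb"]) > max_dist:
                continue  # L1 distance between embeddings is too large
            if br["x"] <= tl["x"] or br["y"] <= tl["y"]:
                continue  # assumption, not stated in the text: box must have positive size
            score = (tl["score"] + br["score"]) / 2  # assumption: average the corner scores
            boxes.append((tl["x"], tl["y"], br["x"], br["y"], score, tl["cls"]))
    return sorted(boxes, key=lambda b: b[4], reverse=True)
```

With at most 100 corners on each side, the double loop checks at most 10,000 candidate pairs, so brute force is cheap here.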
1.4 Loss function
To be continued.
2.4 FSAF
2.5 FCOS
2.6 FoveaBox