
[Object detection] Dive Deeper into Box for Object Detection: a new FCOS-based training method

2022-06-12 21:26:00 【rrr2】

ECCV 2020 paper
Paper: https://arxiv.org/abs/2007.14350

Problem description

In object detection, anchor-free detectors have been successful, but their localization accuracy is still insufficient.

Analysis of the causes

1 The semantics of the center keypoint and the target are inconsistent. In current anchor-free methods the center keypoint is crucial, but as Figure 1 shows, the center region of a target can consist largely of irrelevant background, so noisy pixels inevitably get treated as positive samples. Defining positive pixels with this simple strategy leads to obvious semantic inconsistency, which in turn lowers regression accuracy.
2 Regression from local features is limited. Because convolution kernels have a limited size, the effective receptive field of each center keypoint probably covers only part of the target, so regressing the bbox from the center keypoint alone degrades performance. As Figure 2 shows, the dashed boxes are the predictions made from center points, and none of them is perfectly aligned with the target.


Method

The paper proposes DDBNet, a new FCOS-based object detection algorithm. Its main innovations are the box decomposition and recombination (D&R) module and the semantic consistency module. They address, respectively, the inaccurate regression at the center keypoint and the semantic inconsistency between the center keypoint and the target. The result is shown by the solid boxes in Figure 2.
D&R module: it decomposes multiple predicted boxes into individual boundaries and recombines them into new predicted boxes. The module is attached after the regression branch. It is trained together with the original predicted boxes and is removed at inference time. In the training phase, once a bounding box has been regressed at each pixel, the D&R module decomposes each box into its four boundaries. Boundaries of the same kind are then ranked by their actual deviation from the corresponding ground-truth boundary. Recombining the ranked boundaries is expected to yield more accurate box predictions, which are then optimized with the IoU loss [30].
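The decompose, rank and recombine steps can be sketched in plain Python. This is a simplified illustration, not the paper's implementation: boxes are assumed to be `(x1, y1, x2, y2)` tuples that all target the same ground-truth box, the function names `iou`, `iou_loss` and `decompose_and_recombine` are hypothetical, and `-log(IoU)` is assumed here as one common form of IoU loss.

```python
import math


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    # Clamp widths/heights so a degenerate recombined box gets zero area.
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, gt, eps=1e-7):
    """Assumed form of the IoU loss: -log(IoU), clamped to avoid log(0)."""
    return -math.log(max(iou(pred, gt), eps))


def decompose_and_recombine(pred_boxes, gt_box):
    """Return recombined boxes; index 0 holds the best-ranked boundaries."""
    ranked_sides = []
    for k in range(4):  # 0: left, 1: top, 2: right, 3: bottom
        # Decompose: gather the k-th boundary of every predicted box.
        side = [box[k] for box in pred_boxes]
        # Rank: smallest absolute deviation from the GT boundary first.
        side.sort(key=lambda v: abs(v - gt_box[k]))
        ranked_sides.append(side)
    # Recombine: boundaries with the same rank form one new box.
    return [
        tuple(ranked_sides[k][r] for k in range(4))
        for r in range(len(pred_boxes))
    ]


gt_box = (10.0, 10.0, 50.0, 50.0)
pred_boxes = [
    (8.0, 12.0, 49.0, 60.0),
    (10.0, 20.0, 55.0, 51.0),
    (15.0, 10.0, 40.0, 50.0),
]
recombined = decompose_and_recombine(pred_boxes, gt_box)
print("best original IoU:", max(iou(p, gt_box) for p in pred_boxes))
print("rank-0 recombined IoU:", iou(recombined[0], gt_box))
```

In this toy input the rank-0 recombined box takes its left, top, right and bottom boundaries from different predictions, so it overlaps the ground truth better than any single original box. Lower-ranked recombinations can be worse than the originals, which is one reason to train on the recombined boxes together with the original predictions.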
Semantic consistency module: based on the classification score and intrinsic importance of each pixel, it adaptively labels pixels as positive or negative samples. The framework introduces a new branch that estimates semantic consistency instead of centerness, and this branch is optimized under the supervision of the semantic consistency module. The module uses an adaptive filtering strategy based on the outputs of the classification and regression branches.
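An adaptive filtering step of this kind might look like the following Python sketch. The text above does not spell out the exact rule, so the rule used here is an assumption: the thresholds are the per-instance mean classification score and mean IoU, and a pixel is labeled negative only when it falls below both. The function name `split_pixels` and its inputs are hypothetical.

```python
from statistics import mean


def split_pixels(cls_scores, ious):
    """Split the pixels of one instance into positive and negative indices.

    cls_scores[i]: classification score at pixel i (classification branch).
    ious[i]: IoU between the box regressed at pixel i and the ground truth.
    Assumed rule: negative only if both values are below the instance mean.
    """
    if len(cls_scores) != len(ious):
        raise ValueError("cls_scores and ious must have the same length")
    if not cls_scores:
        return [], []
    score_thr = mean(cls_scores)  # adaptive threshold, no fixed constant
    iou_thr = mean(ious)
    positives, negatives = [], []
    for idx, (score, quality) in enumerate(zip(cls_scores, ious)):
        if score < score_thr and quality < iou_thr:
            negatives.append(idx)
        else:
            positives.append(idx)
    return positives, negatives


positives, negatives = split_pixels(
    cls_scores=[0.9, 0.2, 0.6, 0.1],
    ious=[0.8, 0.3, 0.2, 0.7],
)
print(positives, negatives)
```

Because both thresholds are computed per instance, the split adapts to how well the network currently classifies and localizes that instance, rather than depending on the pixel's distance to the box center.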

Main contributions:

It proposes DDBNet, a new object detection algorithm built on an anchor-free architecture, which addresses both the regression problem at the center keypoint and the semantic consistency of the center keypoint.
It verifies that semantic consistency between the center keypoint and the GT helps improve the convergence of the detection network.
DDBNet achieves SOTA accuracy (45.5% AP) and can be efficiently extended to other anchor-free detectors.