
Oriented bounding boxes | Aligning deep features for oriented object detection (with source code)

2022-07-03 00:06:00 【Tom Hardy】


Author: Edison_G

Source: Institute of computer vision

Over the past decade, object detection has made significant progress. In aerial images, however, the targets usually vary widely in scale and are oriented arbitrarily. Most existing methods rely on heuristically defined anchors with different scales, angles and aspect ratios, and usually suffer from severe misalignment between the anchor boxes and the axis-aligned convolutional features. This leads to a common inconsistency between the classification score and the localization accuracy.

One. Overview

To address this problem, researchers proposed the Single-shot Alignment Network (S2A-Net), which consists of two modules: a Feature Alignment Module (FAM) and an Oriented Detection Module (ODM). FAM generates high-quality anchors with an anchor refinement network and adaptively aligns the convolutional features to the anchor boxes using a newly proposed alignment convolution. ODM first uses active rotating filters to encode orientation information, then produces orientation-sensitive and orientation-invariant features to alleviate the inconsistency between classification score and localization accuracy. The researchers also explore detecting objects directly in large-size images, which gives a better trade-off between speed and accuracy.

Extensive experiments show that the new method achieves state-of-the-art performance on two commonly used datasets (DOTA and HRSC2016) while remaining efficient.

Two. Background

Compared with R-CNN-based detectors, one-stage detectors regress the bounding boxes and classify them directly using regular, densely sampled anchors. This architecture is computationally efficient, but its accuracy often lags behind [G.-S. Xia, X. Bai, J. Ding, Z. Zhu, S. Belongie, J. Luo, M. Datcu, M. Pelillo, and L. Zhang, “DOTA: A large-scale dataset for object detection in aerial images,” in CVPR, 2018]. As the figure below shows, the authors argue that one-stage detectors suffer from severe misalignment.

Heuristically defined anchors are of low quality and cannot cover the targets, which causes misalignment between targets and anchors. For example, the aspect ratio of a bridge usually ranges from 1/3 to 1/30, so only a few anchors, or even none, can be assigned to it. This misalignment usually aggravates the foreground-background imbalance and hurts performance.
The convolutional features of the backbone network are usually axis-aligned with a fixed receptive field, whereas the targets in aerial images appear in arbitrary orientations and with varying appearance. Even when an anchor box is assigned to an instance with high confidence (that is, IoU), it is still misaligned with the convolutional features. In other words, the features corresponding to an anchor box can hardly represent the whole target. As a result, the final classification score cannot accurately reflect the localization accuracy, which also hurts detection performance in post-processing steps such as NMS.
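To make the first point concrete, here is a minimal sketch of how poorly a set of heuristic anchors can match an elongated target. The bridge size (300×10, aspect ratio 1/30), the anchor ratios (1/2, 1, 2) and the helper names are illustrative assumptions, not values from the paper, and the boxes are simplified to be axis-aligned and concentric.

```python
import math

def centered_iou(w1, h1, w2, h2):
    """IoU of two axis-aligned boxes that share the same center."""
    inter = min(w1, w2) * min(h1, h2)
    return inter / (w1 * h1 + w2 * h2 - inter)

def anchor_shapes(area, ratios):
    """Return (w, h) pairs of equal area, where each ratio is h / w."""
    return [(math.sqrt(area / r), math.sqrt(area * r)) for r in ratios]

# Hypothetical bridge: 300 wide, 10 high, i.e. aspect ratio 1/30.
bridge_w, bridge_h = 300.0, 10.0
anchors = anchor_shapes(bridge_w * bridge_h, ratios=(0.5, 1.0, 2.0))
ious = [centered_iou(bridge_w, bridge_h, w, h) for w, h in anchors]
best_iou = max(ious)
print(best_iou)  # far below a typical positive-assignment threshold of 0.5
```

Even with a perfectly centered anchor of the same area, the best IoU in this sketch stays well under 0.5, so the target would receive no positive anchor under a typical assignment rule.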

Performance comparison of different methods under the same setting

Three. Analysis of the new framework

RetinaNet as Baseline

Note that RetinaNet is designed for generic object detection and outputs horizontal bounding boxes (panel (a) of the figure below), expressed as {(x, w, h)}, where x = (x1, x2) is the center of the box and w and h are its width and height.

To be compatible with oriented object detection, the researchers replaced the regression output of RetinaNet with oriented bounding boxes (panel (b) of the figure), expressed as {(x, w, h, θ)}.

In effect, a single angle parameter is added, with the angle θ in the range [-π/4, 3π/4].
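The oriented box parameterization can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the angle is taken as a counter-clockwise rotation in a standard x/y frame, and the range is treated as half-open, [-π/4, 3π/4), so that every box has exactly one representation.

```python
import math

def normalize_angle(theta):
    """Wrap an angle into [-pi/4, 3*pi/4).

    A rectangle rotated by pi coincides with itself, so adding or
    removing multiples of pi does not change the box.
    """
    return (theta + math.pi / 4) % math.pi - math.pi / 4

def obb_to_corners(cx, cy, w, h, theta):
    """Convert (center, width, height, angle) to four corner points."""
    c, s = math.cos(theta), math.sin(theta)
    half_extents = [(-w / 2, -h / 2), (w / 2, -h / 2),
                    (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each half-extent by theta, then translate to the center.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c)
            for dx, dy in half_extents]

corners = obb_to_corners(0.0, 0.0, 4.0, 2.0, math.pi / 2)
print(corners)
```

With θ = 0 the function returns the corners of the horizontal box, which is why the oriented form is a strict extension of the RetinaNet output.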

Alignment Convolution

Let us go straight to the figure.

As the figure shows, standard convolution samples the feature map on a regular grid. DeformConv learns an offset field to augment the spatial sampling locations. However, it may sample from the wrong places, especially for densely packed objects. The researchers propose AlignConv, which adds an extra offset field guided by the anchor boxes so that features are extracted on a grid distributed over each box. Unlike DeformConv, the offset field in AlignConv is inferred directly from the anchor boxes. Panels (c) and (d) illustrate that AlignConv can extract accurate features inside the anchor boxes.

(a) is a standard 2D convolution with regular sampling locations (green dots). (b) is deformable convolution with deformable sampling locations (blue dots). (c) and (d) are two examples of the proposed AlignConv with a horizontal and a rotated anchor box (AB, orange rectangles); the blue arrows indicate the offset field.
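The idea that the offset field is inferred directly from the anchor box can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function name, the argument layout and the rotation sign convention are assumptions. For a feature-map location p and a k×k kernel, each regular sampling point p + r is moved to a point on a k×k grid spread over the anchor box, and the offset is the difference between the two.

```python
import math

def alignconv_offsets(p, anchor, stride, k=3):
    """Offsets that move a regular k x k grid at p onto the anchor box.

    p      -- (px, py) location on the feature map
    anchor -- (cx, cy, w, h, theta) in image coordinates
    stride -- downsampling factor of the feature map
    """
    px, py = p
    cx, cy, w, h, theta = anchor
    c, s = math.cos(theta), math.sin(theta)
    half = k // 2
    offsets = []
    for ry in range(-half, half + 1):
        for rx in range(-half, half + 1):
            # Grid point inside the anchor, in the anchor's own frame.
            dx, dy = w * rx / k, h * ry / k
            # Rotate by theta, move to the anchor center, then map to
            # feature-map coordinates by dividing by the stride.
            lx = (cx + dx * c - dy * s) / stride
            ly = (cy + dx * s + dy * c) / stride
            # Offset relative to the regular sampling location p + r.
            offsets.append((lx - px - rx, ly - py - ry))
    return offsets

# An axis-aligned anchor of size (k * stride) centered on p gives zero offsets,
# i.e. AlignConv falls back to a standard convolution in this sketch.
zero = alignconv_offsets((5, 5), (40.0, 40.0, 24.0, 24.0, 0.0), stride=8)
wide = alignconv_offsets((5, 5), (40.0, 40.0, 48.0, 24.0, 0.0), stride=8)
print(wide[8])  # offset of the corner sample r = (1, 1)
```

In a real detector these offsets would be passed to a deformable convolution operator in place of learned offsets; that wiring is omitted here.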

Feature Alignment Module (FAM)

The module takes the input features and the anchor prediction map as input and generates aligned features.

Oriented Detection Module (ODM)

ODM alleviates the inconsistency between classification score and localization accuracy, and then performs accurate object detection.
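One way to picture the difference between orientation-sensitive and orientation-invariant features is the toy sketch below. It assumes the response at one location has N orientation channels (N = 8 here is an assumption), that rotating the object by a multiple of 360/N degrees cyclically shifts those channels, and that the invariant feature is obtained by keeping the strongest channel. The function names are hypothetical and this is not the ODM implementation.

```python
def rotate_channels(responses, steps):
    """Cyclically shift orientation channels, mimicking an object rotation."""
    steps %= len(responses)
    return responses[-steps:] + responses[:-steps]

def orientation_pool(responses):
    """Orientation-invariant feature: keep the strongest orientation response."""
    return max(responses)

# Orientation-sensitive responses at one location, one value per orientation.
sensitive = [0.1, 0.7, 0.2, 0.0, 0.3, 0.9, 0.4, 0.5]
rotated = rotate_channels(sensitive, 3)

print(sensitive != rotated)                                     # channels changed
print(orientation_pool(sensitive) == orientation_pool(rotated)) # pooled value did not
```

The sensitive channels carry the information needed to regress the angle, while the pooled value is a natural input for classification, which should not depend on orientation.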

Four. Experiments and visualization

Results of different RetinaNet variants on the DOTA dataset

The researchers crop each large image into 1024×1024 chip images with a stride of 824. The large image and the chip images are fed into the same network, which produces detection results without resizing (for example, the planes in the red boxes).

Results of different methods on the DOTA dataset

Detection results on the HRSC2016 dataset

The detection results in the figure above suggest promising applications in military and port settings: ports can reduce the manual scheduling workload, and the military can strike and counterstrike more precisely. An earlier post on a related technique:

Seeing the Suez Canal blockage through computer vision (ship detection)

Copyright notice
This article was written by [Tom Hardy]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/183/202207022245388588.html
