Oriented bounding boxes | Aligned deep features for object detection (with implementation source code)
2022-07-03 00:06:00 【Tom Hardy】
Author: Edison_G
Source: Institute of computer vision
Over the past decade, significant progress has been made in detecting objects in aerial images, where objects often appear with large scale variations and arbitrary orientations. However, most existing methods rely on heuristically defined anchors with different scales, angles and aspect ratios, and they usually suffer from severe misalignment between the anchor boxes and the axis-aligned convolutional features. This leads to a common inconsistency between the classification score and the localization accuracy.
One 、 Overview
To solve this problem, researchers have proposed the Single-shot Alignment Network (S2A-Net), which consists of two modules: a Feature Alignment Module (FAM) and an Oriented Detection Module (ODM). The FAM generates high-quality anchors with an anchor refinement network and adaptively aligns the convolutional features to the anchor boxes using a newly proposed alignment convolution. The ODM first uses active rotating filters to encode orientation information, then produces orientation-sensitive and orientation-invariant features to alleviate the inconsistency between classification score and localization accuracy. In addition, the researchers explore how to detect objects in large images, which gives a better trade-off between speed and accuracy.
Extensive experiments show that the new method achieves state-of-the-art performance on two commonly used datasets (DOTA and HRSC2016) while remaining efficient.
Two 、 Background
Compared with R-CNN-based detectors, one-stage detectors regress the bounding boxes and classify them directly using regular, densely sampled anchors. This architecture is computationally efficient, but its accuracy often lags behind [G.-S. Xia, X. Bai, J. Ding, Z. Zhu, S. Belongie, J. Luo, M. Datcu, M. Pelillo, and L. Zhang, “DOTA: A large-scale dataset for object detection in aerial images,” in CVPR, 2018]. As shown in the figure below, the researchers argue that one-stage detectors suffer from severe misalignment.
Heuristically defined anchors are of low quality and cannot cover the objects, which causes misalignment between objects and anchors. For example, the aspect ratio of a bridge usually ranges from 1/3 to 1/30, so only a few anchors, or even none, can be assigned to it. This misalignment usually aggravates the foreground-background imbalance and hinders performance.
The convolutional features of the backbone network are usually axis-aligned with a fixed receptive field, whereas objects in aerial images appear with arbitrary orientations and varying appearances. Even when an anchor box is assigned to an instance with high confidence (i.e. IoU), the anchor box is still misaligned with the convolutional features. In other words, the feature corresponding to an anchor box can hardly represent the whole object. As a result, the final classification score cannot accurately reflect the localization accuracy, which also hinders detection performance in the post-processing stage (e.g. NMS).
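The anchor mismatch described above can be illustrated with a small calculation. This is a minimal sketch, not taken from the paper: the anchor scales (32, 64, 128) and aspect ratios (1:2, 1:1, 2:1) are illustrative assumptions, and the helper names are hypothetical. It compares the best IoU that any centered anchor reaches for a long thin object against a near-square one.

```python
import math

def centered_iou(w1, h1, w2, h2):
    """IoU of two axis-aligned boxes that share the same centre."""
    inter = min(w1, w2) * min(h1, h2)
    return inter / (w1 * h1 + w2 * h2 - inter)

def anchor_shapes(scales=(32, 64, 128), ratios=(0.5, 1.0, 2.0)):
    """(w, h) of each anchor with area scale**2 and aspect ratio h / w.

    The scales and ratios are assumed values for illustration only.
    """
    return [(s / math.sqrt(r), s * math.sqrt(r)) for s in scales for r in ratios]

def best_anchor_iou(obj_w, obj_h):
    """Highest IoU any assumed anchor shape reaches for the given object size."""
    return max(centered_iou(obj_w, obj_h, w, h) for w, h in anchor_shapes())

# A long thin bridge (300 x 10 px, aspect ratio 1/30) versus a near-square object.
bridge_iou = best_anchor_iou(300, 10)
square_iou = best_anchor_iou(60, 60)
print(f"bridge: {bridge_iou:.3f}, square: {square_iou:.3f}")
```

Under these assumed anchor shapes the bridge stays well below a typical 0.5 matching threshold while the near-square object matches easily, which is the imbalance the paragraph describes.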
Performance comparison of different methods under the same setting
Three 、 Analysis of the new framework
RetinaNet as Baseline
Note that RetinaNet is designed for general object detection and outputs horizontal bounding boxes (figure (a) below), expressed as (x, y, w, h), where (x, y) is the box center and w and h are its width and height.
To be compatible with oriented object detection, the researchers replace the regression output of RetinaNet with oriented bounding boxes, shown in figure (b) above and expressed as (x, y, w, h, θ).
In effect, a single angle parameter is added, with the angle θ in the range [-π/4, 3π/4].
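To make the five-parameter box concrete, the following sketch converts a box (x, y, w, h, θ) into its four corner points and wraps an arbitrary angle back into the stated range. It is an illustration under assumptions rather than the authors' code: the function names are hypothetical, the rotation is taken about the box center, and the angle range is treated as half-open, which works because a rectangle is unchanged by a rotation of π.

```python
import math

def normalize_angle(theta):
    """Wrap an angle into [-pi/4, 3*pi/4); rotating a rectangle by pi leaves it unchanged."""
    return (theta + math.pi / 4) % math.pi - math.pi / 4

def obb_corners(x, y, w, h, theta):
    """Corner points of an oriented box centred at (x, y), rotated by theta radians."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corners in the box's own frame, before rotation.
    local = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each corner about the centre, then translate.
    return [(x + dx * cos_t - dy * sin_t, y + dx * sin_t + dy * cos_t)
            for dx, dy in local]

corners = obb_corners(0.0, 0.0, 4.0, 2.0, 0.0)
```

With θ = 0 the corners reduce to those of the horizontal box (x, y, w, h), so the oriented form is a strict generalization of the RetinaNet output.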
Alignment Convolution
The figure illustrates the idea.
As shown in the figure, a standard convolution samples the feature map over a regular grid. DeformConv learns an offset field to augment the spatial sampling locations. However, it may sample from the wrong places, especially for densely packed objects. The researchers propose AlignConv, which adds an additional offset field guided by the anchor boxes so that features are extracted on a grid distributed over each box. Unlike DeformConv, the offset field in AlignConv is inferred directly from the anchor boxes. Examples (c) and (d) illustrate that AlignConv can extract accurate features within the anchor boxes.
(a) Standard 2D convolution with regular sampling locations (green dots). (b) Deformable convolution with deformable sampling locations (blue dots). (c) and (d) Two examples of the proposed AlignConv with horizontal and rotated anchor boxes (AB, orange rectangles); the blue arrows indicate the offset field.
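The idea of inferring the offset field directly from an anchor box can be sketched as follows. This is a simplified illustration, not the reference implementation: it assumes the k × k sampling points are spread evenly over the anchor box, rotated by θ about the box center and divided by the feature-map stride, and the function name and argument layout are hypothetical.

```python
import math

def align_conv_offsets(anchor, p, stride, k=3):
    """Offsets that move a k x k kernel at feature-map position p onto an anchor box.

    anchor = (x, y, w, h, theta) in image pixels; p = (px, py) in feature-map cells.
    Returns k*k (dx, dy) offsets in feature-map units, in row-major kernel order.
    """
    x, y, w, h, theta = anchor
    px, py = p
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half = k // 2
    offsets = []
    for ry in range(-half, half + 1):
        for rx in range(-half, half + 1):
            # Point of a regular k x k grid inside the box, in the box's own frame.
            dx, dy = w * rx / k, h * ry / k
            # Rotate by theta, move to the box centre, convert to feature-map units.
            sx = (x + dx * cos_t - dy * sin_t) / stride
            sy = (y + dx * sin_t + dy * cos_t) / stride
            # Offset relative to the regular sampling location p + r.
            offsets.append((sx - (px + rx), sy - (py + ry)))
    return offsets

# An axis-aligned anchor that exactly matches the 3 x 3 kernel footprint at stride 8.
zero_offsets = align_conv_offsets((80.0, 80.0, 24.0, 24.0, 0.0), (10, 10), stride=8)
```

When the anchor coincides with the regular kernel footprint, every offset is zero and the operation reduces to a standard convolution; a larger or rotated anchor yields non-zero offsets that could then be passed to a deformable-convolution operator.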
Feature Alignment Module (FAM)
The FAM takes the input features and the anchor prediction map as input and generates aligned features.
Oriented Detection Module (ODM)
The ODM alleviates the inconsistency between classification score and localization accuracy, and then performs accurate object detection.
Four 、 Experiments and visualization
Results of different RetinaNet variants on the DOTA dataset
The researchers crop each large image into 1024×1024 chip images with a stride of 824. Both the large image and the chip images are fed into the same network, and the detection results can be generated without resizing (for example, the plane in the red box).
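The cropping step can be sketched as a sliding window. The chip size of 1024 and stride of 824 come from the text (giving a 200-pixel overlap); how the image border is handled is not stated, so shifting the last chip back to end at the border is an assumption here, and the function names are hypothetical.

```python
def chip_origins(length, chip=1024, stride=824):
    """Start coordinates of chips along one image axis."""
    if length <= chip:
        return [0]
    starts = list(range(0, length - chip + 1, stride))
    # Assumption: add one final chip flush with the border if pixels are left uncovered.
    if starts[-1] + chip < length:
        starts.append(length - chip)
    return starts

def chip_windows(width, height, chip=1024, stride=824):
    """(x0, y0, x1, y1) crop windows covering a width x height image."""
    return [(x, y, min(x + chip, width), min(y + chip, height))
            for y in chip_origins(height, chip, stride)
            for x in chip_origins(width, chip, stride)]

windows = chip_windows(2048, 800)
```

Detections from each chip would then be shifted by the window origin (x0, y0) and merged in the coordinates of the large image.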
Results of different methods on the DOTA dataset
Detection results on the HRSC2016 dataset
The detection results in the figure above suggest promising applications in port and military settings: ports could reduce manual scheduling work, and the military could attack and counterattack more accurately. An earlier article on a related technique:
Computer vision looks at the Suez Canal blockage (ship detection)