Deep learning - bounding box prediction
2022-06-30 07:44:00 【Hair will grow again without it】
Bounding Box
Why predict?
The previous post covered the convolutional implementation of the sliding-window method. That algorithm is more efficient, but it still has a problem: it cannot output the most accurate bounding box. With sliding windows you take a discrete set of positions and run the classifier on each of them, and in that situation it may be that none of those boxes matches the car's position exactly.
One algorithm that produces more accurate bounding boxes is the YOLO algorithm. YOLO stands for "You Only Look Once", and it was proposed by Joseph Redmon, Santosh Divvala, Ross Girshick and Ali Farhadi.
Here is what it does. Say your input image is 100×100. You place a grid over the image. To keep the explanation simple I use a 3×3 grid; a real implementation would use a finer grid, perhaps 19×19. The basic idea is to take the image classification and localization algorithm and apply it to each of the 9 grid cells. More specifically, you define the training labels like this: each of the 9 cells gets a label 𝑦, and 𝑦 is 8-dimensional, as you saw before: 𝑦 = [𝑝𝑐, 𝑏𝑥, 𝑏𝑦, 𝑏ℎ, 𝑏𝑤, 𝑐1, 𝑐2, 𝑐3].
Look at the upper-left cell. There is nothing in it, so its label vector is 𝑦 = [0, ?, ?, ?, ?, ?, ?, ?]: 𝑝𝑐 = 0, and the remaining entries are "don't care" values. The label for this cell (number 3) is the same, and so is the label for every other cell that contains nothing.
More specifically, this image contains two objects. What YOLO does is take the midpoint of each object and assign the object to the grid cell that contains that midpoint. So even though the central cell (number 5) contains parts of both cars, we pretend that it holds no object of interest. Its label 𝑦 therefore looks like the no-object vector, 𝑦 = [0, ?, ?, ?, ?, ?, ?, ?].
The cell outlined in green and the cell outlined in orange each contain the midpoint of an object. Their label vectors are the ones written on the far right in green and blue ink: 𝑝𝑐 = 1, then 𝑏𝑥, 𝑏𝑦, 𝑏ℎ and 𝑏𝑤 specify the bounding box position. Class 1 is pedestrian, so 𝑐1 = 0; class 2 is car, so 𝑐2 = 1; class 3 is motorcycle, so 𝑐3 = 0.
So each of the 9 cells gets an 8-dimensional output vector. Because the grid is 3×3, there are 9 cells, and the total target output has size 3×3×8.
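The labeling scheme above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `encode_labels`, the car positions, and the choice to express 𝑏𝑥, 𝑏𝑦, 𝑏ℎ, 𝑏𝑤 relative to the cell size are not taken from this post, and the "don't care" entries are simply stored as 0.

```python
import numpy as np

CLASSES = ["pedestrian", "car", "motorcycle"]  # c1, c2, c3 as in the text


def encode_labels(objects, grid=3, image_size=100):
    """Build a grid x grid x 8 target tensor.

    Each object is (class_name, x_mid, y_mid, height, width) in pixels.
    Assumption: box values are stored relative to the cell size.
    """
    y = np.zeros((grid, grid, 5 + len(CLASSES)))  # "don't care" entries left at 0
    cell = image_size / grid
    for name, x_mid, y_mid, height, width in objects:
        # The cell that contains the midpoint owns the object.
        col = min(int(x_mid // cell), grid - 1)
        row = min(int(y_mid // cell), grid - 1)
        y[row, col, 0] = 1.0                                # p_c
        y[row, col, 1] = x_mid / cell - col                 # b_x
        y[row, col, 2] = y_mid / cell - row                 # b_y
        y[row, col, 3] = height / cell                      # b_h
        y[row, col, 4] = width / cell                       # b_w
        y[row, col, 5 + CLASSES.index(name)] = 1.0          # one-hot class
    return y


# Two hypothetical cars; the coordinates are made up for illustration.
cars = [("car", 20, 60, 30, 40), ("car", 80, 65, 25, 35)]
y = encode_labels(cars)
print(y.shape)   # (3, 3, 8)
print(y[1, 0])   # cell that holds the first car's midpoint
print(y[1, 1])   # central cell: no midpoint falls here, so p_c stays 0
```

Both cars overlap the central column, yet only the two cells that contain a midpoint get 𝑝𝑐 = 1.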
The advantage of this algorithm is that the neural network can output accurate bounding boxes. At test time, you feed in the input image 𝑥 and run forward propagation until you get the output 𝑦.
How an object is assigned to a grid cell
To assign an object to a cell, you look at the midpoint of the object and assign the object to the cell that contains that midpoint. So even if an object spans more than one cell, it is assigned to only one of the 9 cells of the 3×3 grid, or to one of the cells of a 19×19 grid. With a 19×19 grid, the probability that the midpoints of two objects (the blue dots in the picture) fall in the same cell is lower.
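As a rough sketch of the test-time step, the snippet below reads detections out of a 3×3×8 output. The function name `decode_predictions`, the 0.5 threshold on 𝑝𝑐, and the example values are assumptions for illustration; the post does not say how the output is thresholded.

```python
import numpy as np


def decode_predictions(y_hat, threshold=0.5):
    """Return (row, col, p_c, box, class_index) for each cell with p_c above threshold.

    Assumption: y_hat has layout [p_c, b_x, b_y, b_h, b_w, c1, c2, c3] per cell.
    """
    detections = []
    rows, cols, _ = y_hat.shape
    for row in range(rows):
        for col in range(cols):
            p_c = float(y_hat[row, col, 0])
            if p_c > threshold:
                box = y_hat[row, col, 1:5].tolist()
                class_index = int(np.argmax(y_hat[row, col, 5:]))
                detections.append((row, col, p_c, box, class_index))
    return detections


# Hypothetical network output: one confident cell, one weak cell.
y_hat = np.zeros((3, 3, 8))
y_hat[1, 0] = [0.9, 0.4, 0.5, 0.9, 1.2, 0.1, 0.8, 0.1]
y_hat[1, 1, 0] = 0.2
detections = decode_predictions(y_hat)
print(detections)
```

Only the cell with 𝑝𝑐 = 0.9 survives, and its class index 1 corresponds to 𝑐2 (car).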
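The effect of grid resolution can be checked with a small sketch. The helper `cell_of` and the two midpoints are hypothetical, chosen so that they collide on the coarse grid.

```python
def cell_of(x_mid, y_mid, grid, image_size=100):
    """Return the (row, col) of the grid cell that contains the midpoint."""
    cell = image_size / grid
    row = min(int(y_mid // cell), grid - 1)
    col = min(int(x_mid // cell), grid - 1)
    return row, col


# Two hypothetical object midpoints in a 100x100 image.
first, second = (40, 40), (60, 60)

for grid in (3, 19):
    a = cell_of(*first, grid)
    b = cell_of(*second, grid)
    print(grid, a, b, "same cell" if a == b else "different cells")
```

On the 3×3 grid both midpoints land in cell (1, 1), so one label vector would have to describe two objects. On the 19×19 grid they land in cells (7, 7) and (11, 11).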
Advantages
- It outputs the bounding box coordinates explicitly, so the neural network can produce boxes of any aspect ratio and more accurate coordinates, without being limited by the stride of a sliding-window classifier.
- It is a convolutional implementation. You do not run the algorithm 9 times on a 3×3 grid, or 361 times (19 squared) on a 19×19 grid. Instead, it is a single pass through one convolutional network, and many of the computation steps are shared across the 3×3 (or 19×19) cells, so the algorithm is very efficient.
- Because it is a convolutional implementation, it runs very fast and can achieve real-time object detection.
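To make the "single pass" point concrete, here is a minimal sketch of only the final layer, written as a 1×1 convolution over a 3×3 feature map. The 16 feature channels and the random weights are assumptions for illustration. A real network has many convolutional layers before this one, and most of the shared computation happens there.

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(3, 3, 16))  # hypothetical 3x3 feature map, 16 channels
weights = rng.normal(size=(16, 8))      # 1x1 convolution: 16 channels -> 8 outputs
bias = rng.normal(size=8)

# One pass: the same weights are applied to every cell at once.
one_pass = features @ weights + bias    # shape (3, 3, 8)

# Equivalent per-cell loop, i.e. running the predictor 9 separate times.
looped = np.empty((3, 3, 8))
for row in range(3):
    for col in range(3):
        looped[row, col] = features[row, col] @ weights + bias

print(one_pass.shape)                   # (3, 3, 8)
print(np.allclose(one_pass, looped))    # True
```

The two results agree, but the first form produces all 9 cell vectors in a single operation.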