Deep learning - object localization
2022-06-30 07:43:00 【Hair will grow again without it】
Object localization
We are already familiar with image classification: the algorithm looks through the image and decides whether the object in it is a car. In this lesson we study another problem in building neural networks, classification with localization. Here the algorithm must not only decide whether the picture shows a car but also mark the car's position in the picture, for example by drawing a border or red box around it. "Localization" means determining exactly where in the picture the car is.
Image classification should be familiar by now: a picture is fed into a multilayer convolutional neural network, which outputs a feature vector that is passed to a softmax unit to predict the picture's class. If you are building a self-driving car system, the classes might be pedestrian, car, motorcycle, and background. Background means that none of the first three objects appears in the picture, that is, there is no pedestrian, car, or motorcycle. These four classes are the possible outputs of the softmax function.
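As a rough illustration of the final classification step, the sketch below applies a softmax to four raw scores and picks the most probable class. The class ordering, the function names, and the example scores are assumptions made for this sketch; the convolutional layers that would produce the scores are not shown.

```python
import math

# Class names come from the text; their ordering here is an assumption.
CLASSES = ["pedestrian", "car", "motorcycle", "background"]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(scores):
    """Return the most probable class name and the full distribution."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs

# Hypothetical scores, invented for illustration only.
label, probs = predict([0.5, 3.0, 1.0, -1.0])
print(label, [round(p, 3) for p in probs])
```

With these made-up scores the second entry is the largest, so the predicted class is the car.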
That is the standard classification pipeline. What if you also want to locate the car in the picture? We can give the neural network a few more output units so that it also outputs a bounding box. Specifically, the network outputs four more numbers, denoted 𝑏𝑥, 𝑏𝑦, 𝑏ℎ and 𝑏𝑤, which parameterize the bounding box of the detected object.
Let's first agree on the notation used in this week's course. The upper left corner of the picture has coordinates (0,0) and the lower right corner has coordinates (1,1). To pin down the bounding box (the red box), specify its center point **(𝑏𝑥,𝑏𝑦)**, its height 𝑏ℎ, and its width 𝑏𝑤. The training set therefore contains not only the class label the neural network has to predict but also the four numbers describing the bounding box. A supervised learning algorithm then outputs a class label together with the four parameter values, which give the position of the detected object's bounding box.
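To make the coordinate convention concrete, here is a minimal sketch that converts a normalized box (𝑏𝑥, 𝑏𝑦, 𝑏ℎ, 𝑏𝑤) into pixel corner coordinates. The function name, the image size, and the example box values are hypothetical and chosen only for illustration.

```python
def box_to_pixel_corners(bx, by, bh, bw, img_w, img_h):
    """Convert a normalized center/size box to pixel corner coordinates.

    (bx, by) is the box center, bh its height and bw its width, all given
    as fractions of the image, with (0, 0) at the upper left corner and
    (1, 1) at the lower right corner.
    """
    x_min = (bx - bw / 2) * img_w
    x_max = (bx + bw / 2) * img_w
    y_min = (by - bh / 2) * img_h
    y_max = (by + bh / 2) * img_h
    return x_min, y_min, x_max, y_max

# Hypothetical box and image size, invented for illustration.
corners = box_to_pixel_corners(0.5, 0.7, 0.3, 0.4, img_w=640, img_h=480)
print(corners)
```

For this made-up 640 by 480 image, the box spans approximately x from 192 to 448 and y from 264 to 408 pixels.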
Q: How do we define the target label 𝑦 for this supervised learning task?
A: Note that there are four classes, and the neural network outputs the four bounding box numbers plus a class label, or the probabilities of the class labels. The target label 𝑦 is defined as follows:
It is a vector with eight components, 𝑦 = (𝑝𝑐, 𝑏𝑥, 𝑏𝑦, 𝑏ℎ, 𝑏𝑤, 𝑐1, 𝑐2, 𝑐3). The first component 𝑝𝑐 indicates whether the picture contains an object. If the object belongs to one of the first three classes (pedestrian, car, motorcycle), then 𝑝𝑐 = 1. If the picture is background, there is no object to detect and 𝑝𝑐 = 0. We can think of 𝑝𝑐 as the probability that the picture contains an object of some class other than background. If an object is detected, the network outputs its bounding box parameters 𝑏𝑥, 𝑏𝑦, 𝑏ℎ and 𝑏𝑤. Finally, if there is an object, so that 𝑝𝑐 = 1, the network also outputs 𝑐1, 𝑐2 and 𝑐3, which indicate which of classes 1 to 3 the object belongs to: pedestrian, car, or motorcycle.
Example
Let's assume that the picture contains only one object, so in this classification with localization problem at most one of these objects appears in a picture.
Suppose this is a training set picture, denoted 𝑥, showing a car. In 𝑦, the first element is 𝑝𝑐 = 1 because the picture contains a car, and 𝑏𝑥, 𝑏𝑦, 𝑏ℎ and 𝑏𝑤 give the position of the bounding box, so labeling the training set requires annotating bounding boxes. The object is a car, not a pedestrian or a motorcycle, so it belongs to class 2: 𝑐1 = 0, 𝑐2 = 1, 𝑐3 = 0. At most one of 𝑐1, 𝑐2 and 𝑐3 equals 1.
That covers the case where the picture contains exactly one object to detect. What if the picture contains no such object? What does the label look like for that kind of training example?
In this case 𝑝𝑐 = 0 and the other components of 𝑦 become meaningless. They are all written as question marks here to show that they carry no meaning: because the picture contains no object to detect, it does not matter what bounding box the network outputs, nor which of 𝑐1, 𝑐2 and 𝑐3 it assigns to the picture.
For any labeled training example, whether or not the picture contains an object to localize, the input picture 𝑥 and the label 𝑦 are constructed in the same way. Together these pairs define the training set.
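The two labeling cases can be sketched as follows. This is only an illustration: the helper name `make_label`, the use of `None` for the "question mark" entries, and the example box values are assumptions, not part of the original course material.

```python
CLASSES = ["pedestrian", "car", "motorcycle"]  # c1, c2, c3 in the text

def make_label(class_name=None, box=None):
    """Build the 8-element target [pc, bx, by, bh, bw, c1, c2, c3].

    class_name=None stands for a background picture. The entries the text
    writes as "?" are stored as None, which is an illustrative choice.
    """
    if class_name is None:
        return [0] + [None] * 7
    if class_name not in CLASSES:
        raise ValueError(f"unknown class: {class_name}")
    bx, by, bh, bw = box
    one_hot = [1 if name == class_name else 0 for name in CLASSES]
    return [1, bx, by, bh, bw] + one_hot

# Hypothetical box values for a car picture, invented for illustration.
car_y = make_label("car", box=(0.5, 0.7, 0.3, 0.4))
background_y = make_label()
print(car_y)
print(background_y)
```

Both labels have the same length, which matches the point above that 𝑥 and 𝑦 are built the same way whether or not an object is present.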
Loss function of neural network
The loss function takes the true label 𝑦 and the network output ŷ as arguments. If squared error is used, then $L(\hat{y}, y) = (\hat{y}_1 - y_1)^2 + (\hat{y}_2 - y_2)^2 + \cdots + (\hat{y}_8 - y_8)^2$, so the loss equals the sum of the squared differences of corresponding elements.
If the picture contains an object to localize, then 𝑦1 = 𝑝𝑐 = 1 and the loss is the sum of the squared differences over all eight elements.
In the other case, 𝑦1 = 𝑝𝑐 = 0 and the loss is just $(\hat{y}_1 - y_1)^2$. Here the other elements can be ignored, and all that matters is how accurately the neural network outputs 𝑝𝑐.
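The two cases of the squared-error loss can be written as a short function. This is a minimal sketch under the assumption that labels are 8-element lists ordered as 𝑝𝑐, 𝑏𝑥, 𝑏𝑦, 𝑏ℎ, 𝑏𝑤, 𝑐1, 𝑐2, 𝑐3; the function name and the example numbers are hypothetical.

```python
def localization_loss(y_hat, y):
    """Squared-error loss for y = [pc, bx, by, bh, bw, c1, c2, c3].

    If an object is present (y[0] == 1), all eight elements count.
    Otherwise only the first element, pc, contributes to the loss.
    """
    if y[0] == 1:
        return sum((p - t) ** 2 for p, t in zip(y_hat, y))
    return (y_hat[0] - y[0]) ** 2

# Hypothetical predictions and labels, invented for illustration.
y_car = [1, 0.5, 0.7, 0.3, 0.4, 0, 1, 0]
y_hat_car = [0.9, 0.5, 0.6, 0.3, 0.4, 0.1, 0.8, 0.1]
print(localization_loss(y_hat_car, y_car))

y_background = [0, None, None, None, None, None, None, None]
y_hat_background = [0.2, 0.4, 0.4, 0.1, 0.1, 0.3, 0.3, 0.4]
print(localization_loss(y_hat_background, y_background))
```

In the background case the remaining seven outputs are never read, so the placeholder entries in the label do not affect the result.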