CV learning notes cnn-vgg
2022-06-09 07:56:00 【Moresweet】
CNN-VGG
1. The process of image recognition
**Acquiring the raw information:** External information captured by a sensor (such as an image) is converted into signals that a computer can process.
**Preprocessing:** Operations such as translating, rotating, and denoising the image, whose purpose is to enhance the features of interest in the image.
**Feature extraction and feature selection:** Pattern recognition requires extracting and selecting features. This is one of the key techniques in image recognition.
**Classifier design:** A recognition rule is obtained through training, and this rule yields a way of classifying features so that image recognition can achieve a high recognition rate. The classification decision then places the object being recognized in feature space and determines which class it belongs to.
2. Classification and detection
Classification: given an image, decide which category it belongs to.
Detection: find the objects of the given categories in an image and report the region each one occupies.

3. VGG neural network
Common CNNs (convolutional neural networks)
![](/img/f1/9e443df010557ec59f68ea34365e06.png)
In my last blog post I mentioned that AlexNet is the most classic CNN model. VGG is a network model that improves on AlexNet. The branches that developed later (each color block in the figure is a different branch) each have their own optimizations. VGG is characterized by its depth, reaching 16 to 19 layers, and by its use of small 3x3 convolution kernels.
VGG19 (the number is the layer count; the network is built from convolution, pooling, fully connected, and other operations) is shown below:
![](/img/0e/4ce3b9c0e4be37fe1e92be393fab84.png)
Let's take VGG16 as an example for analysis:
The VGG16 network structure has 16 layers with parameters: 13 convolutional layers and 3 fully connected layers. It also has 5 pooling layers, and the activation layers are not counted. (Note: pooling layers contain no parameters, so the 16 in VGG16 excludes layers without parameters and counts only the convolutional and fully connected layers.)
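The layer count above can be checked with a short sketch. It assumes the usual VGG16 channel widths (64, 128, 256, 512, 512), 3x3 kernels, a 1000-class output, and a bias on every layer; the names `CONV_BLOCKS`, `FC_LAYERS`, and `count_vgg16` are made up for this illustration.

```python
# Sketch: count the weight layers and parameters of VGG16.
# Assumptions: standard channel widths, 3x3 kernels, biases on every layer.

# (number of 3x3 conv layers, output channels) for the five conv blocks
CONV_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
# (input features, output features) for the three fully connected layers
FC_LAYERS = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]


def count_vgg16():
    """Return (number of weight layers, total number of parameters)."""
    layers = 0
    params = 0
    in_channels = 3  # RGB input
    for n_convs, out_channels in CONV_BLOCKS:
        for _ in range(n_convs):
            # 3x3 kernel weights plus one bias per output channel
            params += 3 * 3 * in_channels * out_channels + out_channels
            in_channels = out_channels
            layers += 1
        # the 2x2 max pooling after each block adds no parameters
    for n_in, n_out in FC_LAYERS:
        # weight matrix plus one bias per output feature
        params += n_in * n_out + n_out
        layers += 1
    return layers, params


layers, params = count_vgg16()
print(layers, params)  # 16 138357544
```

Most of the parameters sit in the first fully connected layer (7x7x512 inputs to 4096 outputs), not in the convolutional layers.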
![](/img/c9/96a951f3595c2ef9c8cc386695b572.png)
Example:
![](https://img-blog.csdnimg.cn/e0a56fe6daa9442bb17c6d08c03286bf.png)
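Layer-by-layer analysis:
The input image is (224,224,3), that is, a three-channel image with a resolution of 224x224. Images of any other size need to be resized first.
Output size formula: N=(W-F+2P)/S + 1, where W is the input width or height, F is the kernel size, P is the padding, and S is the stride. This gives the spatial size of the output; the number of output channels equals the number of convolution kernels.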
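The formula above can be tried out with a small helper. This is a minimal sketch: the function name `conv_output_size` is made up here, floor division is assumed when the stride does not divide evenly, and the padding of 1 for the 3x3 convolutions is an assumption that matches the unchanged 224x224 output described in these notes.

```python
def conv_output_size(w, f, p, s):
    """Spatial output size N = (W - F + 2P) / S + 1 (floor division assumed)."""
    return (w - f + 2 * p) // s + 1


# 3x3 convolution, padding 1 (assumed), stride 1: spatial size is unchanged
print(conv_output_size(224, 3, 1, 1))  # 224
# 2x2 max pooling, no padding, stride 2: spatial size is halved
print(conv_output_size(224, 2, 0, 2))  # 112
# 7x7 convolution on a 7x7 input, no padding, stride 1: a single position
print(conv_output_size(7, 7, 0, 1))    # 1
```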
conv1: two [3,3] convolutions with 64 output channels give an output of (224,224,64); 2x2 max pooling then gives (112,112,64).
conv2: two [3,3] convolutions with 128 output channels give (112,112,128); 2x2 max pooling then gives (56,56,128).
conv3: three [3,3] convolutions with 256 output channels give (56,56,256); 2x2 max pooling then gives (28,28,256).
conv4: three [3,3] convolutions with 512 output channels give (28,28,512); 2x2 max pooling then gives (14,14,512).
conv5: three [3,3] convolutions with 512 output channels give (14,14,512); 2x2 max pooling then gives (7,7,512).
A convolution is used to simulate a fully connected layer, with equivalent effect, and the output is (1,1,4096). This is done twice.
A convolution is used to simulate the last fully connected layer, with equivalent effect, and the output is (1,1,1000). The final output is the prediction for each class.
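The shapes listed above can be reproduced by applying the output size formula stage by stage. The sketch below assumes padding 1 and stride 1 for every 3x3 convolution and stride 2 for every 2x2 pooling; the stage labels (`conv1`, `pool1`, `fc6`, and so on) and the function names are only labels chosen for this illustration.

```python
def out_size(w, f, p, s):
    """Spatial output size N = (W - F + 2P) / S + 1."""
    return (w - f + 2 * p) // s + 1


def trace_vgg16(size=224):
    """Return a list of (stage label, (height, width, channels)) for VGG16."""
    shapes = []
    channels = 3
    # (number of 3x3 convolutions, output channels) per block
    blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
    for i, (n_convs, out_channels) in enumerate(blocks, start=1):
        for _ in range(n_convs):
            size = out_size(size, 3, 1, 1)  # 3x3 conv, padding 1, stride 1
            channels = out_channels
        shapes.append((f"conv{i}", (size, size, channels)))
        size = out_size(size, 2, 0, 2)      # 2x2 max pooling, stride 2
        shapes.append((f"pool{i}", (size, size, channels)))
    # Fully connected layers written as convolutions
    size, channels = out_size(size, 7, 0, 1), 4096  # 7x7 kernel, no padding
    shapes.append(("fc6", (size, size, channels)))
    size = out_size(size, 1, 0, 1)                  # 1x1 kernel
    shapes.append(("fc7", (size, size, channels)))
    size, channels = out_size(size, 1, 0, 1), 1000  # 1x1 kernel, 1000 classes
    shapes.append(("fc8", (size, size, channels)))
    return shapes


for name, shape in trace_vgg16():
    print(name, shape)
```

The last three lines printed are `fc6 (1, 1, 4096)`, `fc7 (1, 1, 4096)`, and `fc8 (1, 1, 1000)`.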
VGG optimization strategies
Using convolutional layers instead of fully connected layers
A fully connected layer computes the same thing as a convolution whose kernel is the size of the feature map, so a fully connected layer can be replaced by a convolutional layer whose kernel size is set to the spatial size of the input.
Why do this? Convolution is heavily optimized on many computing platforms (such as CUDA) and can be accelerated, while FC layers do not get the same treatment, so the replacement can improve efficiency.
For example, the first fully connected layer of VGG16 has an input of 7x7x512 and an output of 1x1x4096. It can be represented equivalently by a convolutional layer with a 7x7 kernel, a stride of 1, no padding, and 4096 output channels, which produces a 1x1x4096 output, the same as the fully connected layer. The fully connected layers that follow can be replaced equivalently by 1x1 convolutions.
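The equivalence can be checked numerically on a toy case. The sketch below is pure Python and uses a 3x3x2 input with 4 outputs instead of 7x7x512 with 4096, so it runs instantly; the sizes, the random integer weights, and the function names are all chosen for illustration. The only real requirement is that the fully connected weights and the convolution kernels hold the same numbers in the same (row, column, channel) order.

```python
import random

random.seed(0)
H, W, C, K = 3, 3, 2, 4  # toy sizes: input HxWxC, K output units


def fully_connected(x_flat, weights, biases):
    """Plain FC layer: one dot product per output unit."""
    return [sum(w * v for w, v in zip(row, x_flat)) + b
            for row, b in zip(weights, biases)]


def conv_valid(x, kernels, biases):
    """Stride-1, no-padding convolution.

    x: [H][W][C]; kernels: [K][F][F][C]; returns [H-F+1][W-F+1][K].
    """
    h, w, c = len(x), len(x[0]), len(x[0][0])
    f = len(kernels[0])
    out = []
    for i in range(h - f + 1):
        row = []
        for j in range(w - f + 1):
            pixel = []
            for kern, b in zip(kernels, biases):
                acc = b
                for di in range(f):
                    for dj in range(f):
                        for ch in range(c):
                            acc += kern[di][dj][ch] * x[i + di][j + dj][ch]
                pixel.append(acc)
            row.append(pixel)
        out.append(row)
    return out


# Integer data so the comparison is exact
x = [[[random.randint(-5, 5) for _ in range(C)] for _ in range(W)] for _ in range(H)]
fc_weights = [[random.randint(-5, 5) for _ in range(H * W * C)] for _ in range(K)]
biases = [random.randint(-5, 5) for _ in range(K)]

# Flatten the input in (row, column, channel) order for the FC layer
x_flat = [x[i][j][ch] for i in range(H) for j in range(W) for ch in range(C)]
# Reshape each FC weight row into an HxWxC kernel using the same order
kernels = [[[[fc_weights[k][(i * W + j) * C + ch] for ch in range(C)]
             for j in range(W)] for i in range(H)] for k in range(K)]

fc_out = fully_connected(x_flat, fc_weights, biases)
conv_out = conv_valid(x, kernels, biases)  # shape 1x1xK
print(fc_out == conv_out[0][0])  # True
```

Because the kernel covers the whole input, the convolution has exactly one output position, and each of its K channels is the same dot product as one FC output unit.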
1x1 convolution
A 1x1 convolution can increase or reduce the number of feature channels, because the number of convolution kernels is the number of output channels. A pooling layer, by contrast, cannot change the number of channels; it can only reduce the spatial size of the feature map.
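The contrast between the two operations shows up directly in the output shapes. The following pure-Python sketch uses a toy (4,4,8) feature map and all-ones weights; the function names and sizes are assumptions made for this illustration, not part of VGG itself.

```python
def conv1x1(x, weights):
    """1x1 convolution. x: [H][W][C_in]; weights: [C_out][C_in]."""
    return [[[sum(w * v for w, v in zip(row, pixel)) for row in weights]
             for pixel in line] for line in x]


def max_pool2x2(x):
    """2x2 max pooling with stride 2, applied to each channel separately."""
    h, w, c = len(x), len(x[0]), len(x[0][0])
    return [[[max(x[i + di][j + dj][ch] for di in (0, 1) for dj in (0, 1))
              for ch in range(c)]
             for j in range(0, w - 1, 2)]
            for i in range(0, h - 1, 2)]


def shape(x):
    return (len(x), len(x[0]), len(x[0][0]))


# Toy feature map of shape (4, 4, 8)
x = [[[i + j + ch for ch in range(8)] for j in range(4)] for i in range(4)]

reduce_w = [[1] * 8 for _ in range(2)]   # 2 kernels: 8 channels -> 2
expand_w = [[1] * 8 for _ in range(16)]  # 16 kernels: 8 channels -> 16

print(shape(conv1x1(x, reduce_w)))  # (4, 4, 2)
print(shape(conv1x1(x, expand_w)))  # (4, 4, 16)
print(shape(max_pool2x2(x)))        # (2, 2, 8)
```

The 1x1 convolution keeps the spatial size and changes only the channel count, while pooling does the opposite.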
These are personal study notes, shared for learning and exchange only. Please credit the source when reposting!