Deep learning: implementing sliding windows with convolution
2022-06-30 07:44:00 【Hair will grow again without it】
Convolutional sliding windows
Converting fully connected layers into convolutional layers
To build a convolutional implementation of sliding windows, we first need to know how to turn the fully connected layers of a neural network into convolutional layers.
Suppose the object detection algorithm takes a 14×14×3 image as input. With 16 filters of size 5×5, the 14×14×3 image is mapped to 10×10×16. A 2×2 max pooling operation then reduces it to 5×5×16. Next comes a fully connected layer with 400 units, then another fully connected layer, and finally a softmax unit that outputs 𝑦.
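The sizes above follow from the usual output-size rule for an unpadded convolution or pooling window. The sketch below assumes no padding, a convolution stride of 1 and a pooling stride of 2 (the text does not state these explicitly, but they are the values that reproduce its numbers); the helper name `valid_out` is made up for illustration.

```python
def valid_out(n, f, stride=1):
    """Output width of an unpadded ("valid") convolution or pooling window."""
    return (n - f) // stride + 1

n = 14
after_conv = valid_out(n, f=5)                     # 5x5 convolution: 14 -> 10
after_pool = valid_out(after_conv, f=2, stride=2)  # 2x2 max pooling: 10 -> 5
flattened = after_pool * after_pool * 16           # values fed to the first fully connected layer

print(after_conv, after_pool, flattened)
```

Under these assumptions the 5×5×16 volume holds 400 values, which are the inputs to the 400-unit fully connected layer.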
Now let's see how to convert these fully connected layers into convolutional layers.
Draw the same convolutional network, with its first layers unchanged. The next layer, the first fully connected layer, can be implemented with 400 filters of size 5×5. The input to this layer is 5×5×16, and each 5×5 filter is really 5×5×16, because during convolution a filter spans all 16 channels, so the channel counts of the input and the filter must match. Each filter produces a 1×1 output, so applying 400 of these 5×5×16 filters gives an output of dimension 1×1×400. We no longer view it as a set of 400 nodes but as a 1×1×400 volume. Mathematically this is the same as the fully connected layer: each of the 400 nodes has its own 5×5×16 filter, so each value is an arbitrary linear function of the 5×5×16 activations of the previous layer.
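A small numerical check can make this equivalence concrete. The sketch below uses random weights, ignores biases, and uses only the Python standard library; the function names are invented for illustration. It compares a 5×5 unpadded convolution over a 5×5×16 input (which has exactly one output position) with a fully connected layer that uses the same weights on the flattened input.

```python
import random

random.seed(0)
H = W = 5      # spatial size of the pooled activations
C = 16         # channels
UNITS = 400    # number of filters, i.e. fully connected units

# Activations from the pooling layer, indexed x[i][j][c].
x = [[[random.uniform(-1, 1) for _ in range(C)] for _ in range(W)] for _ in range(H)]
# One 5x5x16 filter per output unit, indexed w[u][i][j][c].
w = [[[[random.uniform(-1, 1) for _ in range(C)] for _ in range(W)]
      for _ in range(H)] for _ in range(UNITS)]

def conv_single_position(x, w):
    """5x5 unpadded convolution over a 5x5 input: one output position per filter."""
    return [sum(w_u[i][j][c] * x[i][j][c]
                for i in range(H) for j in range(W) for c in range(C))
            for w_u in w]

def fully_connected(x, w):
    """The same weights viewed as a 400x400 matrix applied to the flattened input."""
    x_flat = [x[i][j][c] for i in range(H) for j in range(W) for c in range(C)]
    out = []
    for w_u in w:
        w_flat = [w_u[i][j][c] for i in range(H) for j in range(W) for c in range(C)]
        out.append(sum(a * b for a, b in zip(w_flat, x_flat)))
    return out

conv_out = conv_single_position(x, w)   # the 1x1x400 volume, stored as a list
fc_out = fully_connected(x, w)          # the 400 fully connected activations
max_diff = max(abs(a - b) for a, b in zip(conv_out, fc_out))
print(len(conv_out), max_diff)
```

Both views use 5×5×16×400 weights; the only difference is whether the output is read as 400 nodes or as a 1×1×400 volume.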
We then add another convolutional layer, this time a 1×1 convolution. With 400 filters of size 1×1, the next layer again has dimension 1×1×400, which corresponds to the second fully connected layer of the previous network. Finally, a last layer of 1×1 filters followed by the softmax activation gives a 1×1×4 output volume, in place of the 4 numbers that the original network output.
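One way to see why a 1×1 convolution can stand in for a fully connected layer is that it applies the same weight matrix independently at every spatial position. The toy sketch below uses a 2×2 map with 2 input channels and 3 output channels rather than the 400 channels of the text, and omits biases; `conv1x1` and `dense` are illustrative names, not part of any library.

```python
def dense(v, w):
    """Fully connected layer without bias: w[c_out][c_in] applied to vector v."""
    return [sum(wk * vk for wk, vk in zip(w_row, v)) for w_row in w]

def conv1x1(x, w):
    """1x1 convolution: x[i][j][c_in] -> y[i][j][c_out], same weights at every (i, j)."""
    return [[dense(pixel, w) for pixel in row] for row in x]

x = [[[1.0, 2.0], [0.0, -1.0]],
     [[3.0, 1.0], [2.0, 2.0]]]                 # 2x2 map, 2 channels
w = [[1.0, 0.0], [0.5, 0.5], [-1.0, 2.0]]      # 3 output channels

y = conv1x1(x, w)
print(y[0][0], y[1][1])
```

On a 1×1 input this is exactly one fully connected layer; on a larger map the same layer is evaluated once per position, which is what lets the network accept bigger images.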
Implementing sliding-window object detection with convolution
Suppose we feed a 14×14×3 image into the convolutional sliding-window network. As before, the last layer of the network, the softmax output, is 1×1×4.
Suppose the network takes 14×14×3 images as input and a test image is 16×16×3 (drawn as the original input with a yellow strip added). In the original sliding-window algorithm, you would feed the blue region (marked in red pen) into the network to get a 0 or 1 classification. Then you slide the window to the right with a stride of 2 pixels, feed the green-boxed region into the network, run the whole network again, and get another 0 or 1 label. Next you feed in the orange region to get another label, and finally run the network one last time on the purple region at the bottom right. Sliding the window over this small 16×16×3 image runs the network 4 times and outputs 4 labels.
If we instead run the same convolutional network once on the whole 16×16×3 image, we get 12×12×16 after the first convolution, 6×6×16 after pooling, then 2×2×400, 2×2×400, and finally a 2×2×4 output. This output layer has 4 cells. The blue, upper-left cell is the output for the upper-left 14×14 region of the image (red arrow); the upper-right cell corresponds to the upper-right region (green arrow); the lower-left cell corresponds to the lower-left region of the input (orange arrow); and likewise the lower-right cell is the result of the network processing the lower-right 14×14 region (purple arrow).
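The correspondence between output cells and input windows can be written down directly. The sketch below assumes a 14×14 window and a stride of 2 pixels, as in the example; the function `output_grid` is hypothetical and returns window corners as (top, left, bottom, right) with bottom and right exclusive.

```python
def output_grid(n, window=14, stride=2):
    """Map each output cell (row, col) to the input window it scores."""
    cells = (n - window) // stride + 1   # windows per side
    return {
        (r, c): (r * stride, c * stride, r * stride + window, c * stride + window)
        for r in range(cells)
        for c in range(cells)
    }

grid = output_grid(16)
for cell, box in sorted(grid.items()):
    print(cell, "->", box)
```

For a 16×16 image this gives four cells, with the upper-left cell covering rows and columns 0 to 13 and the lower-right cell covering rows and columns 2 to 15.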
So the point of the convolutional implementation is that we do not need to split the input image into four subsets and run forward propagation on each one separately. Instead, we feed the whole image into the convolutional network as a single input, and the regions that the windows have in common share much of the computation, as with the four 14×14 boxes here.
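To get a rough feel for the sharing, one can count multiply-accumulate operations for the example network. This is only a back-of-the-envelope sketch: it ignores biases, pooling and softmax, assumes the second fully connected layer has 400 units (as the 1×1 convolution in the text suggests), and the function name `macs` is made up.

```python
def macs(n):
    """Multiply-accumulates for one forward pass on an n x n x 3 input."""
    n1 = n - 5 + 1            # width after the 5x5 convolution (16 filters)
    n2 = (n1 // 2) - 5 + 1    # width after 2x2 pooling and the 5x5x16 convolution
    conv1 = n1 * n1 * (5 * 5 * 3) * 16
    # Per output position: 5x5x16 conv (400 filters), 1x1 conv (400), 1x1 conv (4).
    head = n2 * n2 * ((5 * 5 * 16) * 400 + 400 * 400 + 400 * 4)
    return conv1 + head

separate = 4 * macs(14)   # four independent 14x14 crops
shared = macs(16)         # one pass over the 16x16 image

print(separate, shared)
```

In this small network the saving comes from the first convolution and pooling stage, which the overlapping windows have in common; the later layers are still evaluated once per window position.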
Consider a larger image. If we apply the sliding-window operation to a 28×28×3 image and run forward propagation in the same way, we end up with an 8×8×4 result. Because the max pooling size is 2, this is equivalent to applying the network to the original image with a stride of 2.
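The 8×8×4 figure can be checked by tracing shapes through the network. The sketch below assumes unpadded convolutions with stride 1 and a pooling stride of 2; `trace` is an illustrative helper, not a library function.

```python
def trace(n):
    """Shapes after each layer of the convolutional network for an n x n x 3 input."""
    shapes = [(n, n, 3)]
    n = n - 5 + 1
    shapes.append((n, n, 16))     # 5x5 convolution, 16 filters
    n = (n - 2) // 2 + 1
    shapes.append((n, n, 16))     # 2x2 max pooling, stride 2
    n = n - 5 + 1
    shapes.append((n, n, 400))    # 5x5 convolution replacing the first fully connected layer
    shapes.append((n, n, 400))    # 1x1 convolution replacing the second fully connected layer
    shapes.append((n, n, 4))      # 1x1 convolution feeding the softmax
    return shapes

for size in (14, 16, 28):
    print(size, trace(size)[-1])

# Number of 14x14 windows per side when sliding with stride 2 over a 28x28 image.
windows_per_side = (28 - 14) // 2 + 1
print(windows_per_side)
```

The output width matches the number of stride-2 window positions per side, which is the sense in which the pooling layer fixes the effective stride at 2.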
Summary:
Cut a region out of the image, say 14×14, and feed it into the convolutional network. Move on to the next region, also 14×14, and repeat until some region is recognized as a car. But as shown earlier, we do not have to run these windows through the network one after another. For example, we can run the convolution over the whole 28×28 image and get all the predictions at once, and if you're lucky, the network can identify the location of the car.
This is the sliding-window algorithm implemented convolutionally, which makes the whole algorithm more efficient. It still has one drawback: the position of the bounding box may not be accurate enough.