CNN convolutional neural network learning process (weight update)
2022-07-28 20:22:00 【LifeBackwards】
A convolutional neural network learns its parameters with the BP (back-propagation) algorithm, which updates them according to the gradient-descent principle. In a convolutional neural network, the parameters to be optimized are the convolution kernels k, the subsampling-layer weights β, the fully connected layer weights w, and the bias b of each layer. We take the mean squared error between the desired output and the actual output of the network as the cost function; the goal is to minimize this cost so that the network's output accurately predicts the label of each input. The cost function is:

$$E = \frac{1}{2}\sum_{n=1}^{N}\left\|\mathbf{t}^{(n)} - \mathbf{y}^{(n)}\right\|_{2}^{2}$$

where N is the number of training samples, $\mathbf{t}^{(n)}$ is the true category label of the n-th training sample, and $\mathbf{y}^{(n)}$ is the category prediction produced by the network for the n-th training sample.
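As a concrete illustration, here is a minimal NumPy sketch of this cost; the one-hot label encoding and all variable names are illustrative assumptions, not from the original article:

```python
import numpy as np

# N = 3 training samples, 4 classes; rows of `targets` are one-hot labels t^(n)
targets = np.array([[1, 0, 0, 0],
                    [0, 1, 0, 0],
                    [0, 0, 1, 0]], dtype=float)

# Rows of `preds` are the network outputs y^(n) for the same samples
preds = np.array([[0.80, 0.10, 0.05, 0.05],
                  [0.20, 0.60, 0.10, 0.10],
                  [0.10, 0.20, 0.60, 0.10]])

# E = 1/2 * sum_n ||t^(n) - y^(n)||^2
E = 0.5 * np.sum((targets - preds) ** 2)
print(E)
```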
Gradient calculation for the convolution layer
Generally, every convolution layer l is followed by a subsampling layer l+1. According to the back-propagation method, to obtain the gradients of the weights of convolution layer l, we first need the residual (sensitivity) δ of each neuron in layer l, i.e. the partial derivative of the cost function with respect to that neuron's input. The residual is found by summing the residuals of the connected nodes in the next layer weighted by the corresponding connection weights W, and then multiplying element-wise by the derivative of the activation function f evaluated at the neuron's input u in layer l:

$$\delta^{l} = \left(W^{l+1}\right)^{T}\delta^{l+1}\circ f'\!\left(u^{l}\right)$$

Because of downsampling, each neuron of the subsampling layer corresponds to a window-sized region of the previous layer's output feature map, so each neuron of a feature map in layer l is connected to exactly one neuron of the corresponding feature map in layer l+1. Computing the residual of every pixel of a feature map yields a residual map. To compute the residuals of layer l, the residual map of the subsampling layer must therefore be upsampled back to the size of layer l's feature map; the derivative of the activation function over layer l's feature map is multiplied element-wise with this upsampled residual map, and the result is scaled by the subsampling weight β. The residual map of feature map j in convolution layer l is thus:

$$\delta_{j}^{l} = \beta_{j}^{l+1}\left(f'\!\left(u_{j}^{l}\right)\circ \mathrm{up}\!\left(\delta_{j}^{l+1}\right)\right)$$

where the symbol ∘ denotes element-wise multiplication and up(·) denotes the upsampling operation. If the downsampling factor is n, then up(·) copies each element n times horizontally and vertically, which can be implemented with the Kronecker product:

$$\mathrm{up}(x) = x \otimes \mathbf{1}_{n\times n}$$
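A minimal NumPy sketch of this upsampling and residual computation; the 2×2 pooling window, the tanh activation, and all variable names are illustrative assumptions:

```python
import numpy as np

n = 2  # downsampling factor of the subsampling layer

# Residual map delta^{l+1} of the subsampling layer (2x2) and the
# pre-activation input u^l of the convolution layer's feature map (4x4)
delta_next = np.array([[0.1, -0.2],
                       [0.3,  0.4]])
u = np.random.randn(4, 4)
beta = 0.5  # multiplicative weight beta of the subsampling layer

def up(x, n):
    """up(x) = x Kronecker ones(n, n): copy each element n times both ways."""
    return np.kron(x, np.ones((n, n)))

# delta^l = beta * ( f'(u^l) o up(delta^{l+1}) ), here with f = tanh
f_prime = 1.0 - np.tanh(u) ** 2
delta = beta * f_prime * up(delta_next, n)
print(delta.shape)  # (4, 4)
```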
From this residual map, the gradients of the bias and of the convolution kernel belonging to the feature map can be computed:

$$\frac{\partial E}{\partial b_{j}} = \sum_{u,v}\left(\delta_{j}^{l}\right)_{uv}$$

$$\frac{\partial E}{\partial k_{ij}^{l}} = \sum_{u,v}\left(\delta_{j}^{l}\right)_{uv}\left(p_{i}^{l-1}\right)_{uv}$$

where (u, v) are the coordinates of a pixel in the feature map, and $\left(p_{i}^{l-1}\right)_{uv}$ is the patch of the input feature map $x_{i}^{l-1}$ that was multiplied element-wise with the kernel $k_{ij}^{l}$ when computing the element at (u, v) of the output feature map during the forward convolution.
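Summing over all patches this way is exactly a valid cross-correlation of the input feature map with the residual map. A minimal NumPy sketch; the 6×6/3×3 shapes and all variable names are illustrative assumptions:

```python
import numpy as np

x_prev = np.random.randn(6, 6)  # input feature map x_i^{l-1}
delta = np.random.randn(4, 4)   # residual map delta_j^l (valid conv, 3x3 kernel)

# Bias gradient: sum of all entries of the residual map
grad_b = delta.sum()

# Kernel gradient: for each kernel position (p, q), accumulate over all output
# positions (u, v) the product delta[u, v] * (patch at (u, v))[p, q]
kh = kw = 3
grad_k = np.zeros((kh, kw))
for p in range(kh):
    for q in range(kw):
        grad_k[p, q] = np.sum(delta * x_prev[p:p + 4, q:q + 4])

print(grad_b, grad_k.shape)  # scalar, (3, 3)
```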
Gradient calculation for the subsampling layer
The parameters involved in the forward pass of the subsampling layer are a multiplicative weight β and a bias b for each feature map. To obtain the gradients of subsampling layer l, we first need its residual map: for each pixel of the current layer we must find the corresponding region in the residual map of the next (convolution) layer and propagate those residuals back, multiplying by the weights between the input and output feature maps, which are exactly the convolution kernel parameters. This back-propagation amounts to a full convolution of the next layer's residual map with the 180°-rotated kernel:

$$\delta_{j}^{l} = f'\!\left(u_{j}^{l}\right)\circ \mathrm{conv2}\!\left(\delta_{j}^{l+1},\ \mathrm{rot180}\!\left(k_{j}^{l+1}\right),\ \mathrm{'full'}\right)$$
The gradients of the multiplicative weight β and the bias b are computed as follows, where $d_{j}^{l} = \mathrm{down}\!\left(x_{j}^{l-1}\right)$ is the downsampled input feature map saved from the forward pass:

$$\frac{\partial E}{\partial b_{j}} = \sum_{u,v}\left(\delta_{j}^{l}\right)_{uv}$$

$$\frac{\partial E}{\partial \beta_{j}} = \sum_{u,v}\left(\delta_{j}^{l}\circ d_{j}^{l}\right)_{uv}$$
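A minimal sketch of this step using SciPy's full 2-D convolution; the layer shapes, the tanh activation, and the 2×2 mean-pooling forward pass are illustrative assumptions:

```python
import numpy as np
from scipy.signal import convolve2d

delta_next = np.random.randn(4, 4)  # residual map of the following conv layer
k = np.random.randn(3, 3)           # kernel connecting layer l to layer l+1
u = np.random.randn(6, 6)           # pre-activation input of subsampling layer l
x_prev = np.random.randn(12, 12)    # input x_j^{l-1} pooled in the forward pass

# delta^l = f'(u^l) o conv2(delta^{l+1}, rot180(k), 'full'), with f = tanh
delta = (1.0 - np.tanh(u) ** 2) * convolve2d(delta_next, np.rot90(k, 2),
                                             mode='full')

# d^l = down(x^{l-1}): 2x2 mean pooling of the forward-pass input
d = x_prev.reshape(6, 2, 6, 2).mean(axis=(1, 3))

grad_b = delta.sum()           # dE/db_j
grad_beta = np.sum(delta * d)  # dE/dbeta_j
print(delta.shape, grad_b, grad_beta)  # (6, 6), scalar, scalar
```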
Gradient calculation for the fully connected layer
The computation for the fully connected layer is similar to that of the subsampling layer. The residual is:

$$\delta^{l} = \left(W^{l+1}\right)^{T}\delta^{l+1}\circ f'\!\left(u^{l}\right)$$
The partial derivative of the cost function with respect to the bias is simply the residual itself:

$$\frac{\partial E}{\partial b^{l}} = \delta^{l}$$
The gradient of the fully connected layer's weights is the outer product of the residual with the layer's input:

$$\frac{\partial E}{\partial W^{l}} = \delta^{l}\left(x^{l-1}\right)^{T}$$
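A minimal sketch of one fully connected backward step; the layer sizes, the tanh activation, and all variable names are illustrative assumptions:

```python
import numpy as np

# Layer l maps 5 -> 4 units; layer l+1 maps 4 -> 3 units
x_prev = np.random.randn(5, 1)      # input x^{l-1} to layer l
W = np.random.randn(4, 5)           # weights W^l:  u^l = W^l x^{l-1} (+ b^l)
W_next = np.random.randn(3, 4)      # weights W^{l+1} of the next layer
u = W @ x_prev                      # pre-activation of layer l (bias omitted)
delta_next = np.random.randn(3, 1)  # residual delta^{l+1} of the next layer

# delta^l = (W^{l+1})^T delta^{l+1} o f'(u^l), with f = tanh
delta = (W_next.T @ delta_next) * (1.0 - np.tanh(u) ** 2)

grad_b = delta             # dE/db^l = delta^l
grad_W = delta @ x_prev.T  # dE/dW^l = delta^l (x^{l-1})^T
print(grad_W.shape)        # (4, 5)
```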