LRN (Local Response Normalization)
2022-06-30 18:08:00 【*Yuanzai】
Local Response Normalization (LRN) is a technique used during deep-learning training to improve accuracy. It is widely available in frameworks such as Caffe and TensorFlow. Unlike an activation function, LRN is usually applied after activation and pooling.
AlexNet carried the ideas of LeNet forward, applying the basic principles of CNNs to a much deeper and wider network. The main new techniques AlexNet used are as follows.

It successfully used ReLU as the CNN's activation function and verified that in deeper networks it outperforms Sigmoid, solving Sigmoid's vanishing-gradient problem in deep networks. Although ReLU had been proposed long before, it was the emergence of AlexNet that popularized it.

It used Dropout during training to randomly ignore some neurons and avoid overfitting. Dropout has its own dedicated paper, but AlexNet put it into practice and its effectiveness was demonstrated there; in AlexNet, Dropout is mainly applied to the last few fully connected layers.

It used overlapping max pooling in the CNN. Earlier CNNs commonly used average pooling; AlexNet uses max pooling throughout, avoiding the blurring effect of average pooling. AlexNet also sets the stride smaller than the pooling kernel, so the outputs of adjacent pooling windows overlap, enriching the features.

It proposed the LRN layer, which creates a competition mechanism among the activities of local neurons: values with larger responses become relatively larger, while neurons with smaller responses are suppressed, improving the model's generalization ability.
The following description of the function is quoted from the TensorFlow official documentation:
https://www.tensorflow.org/api_docs/python/tf/nn/local_response_normalization
The 4-D input tensor is treated as a 3-D array of 1-D vectors (along the last dimension), and each vector is normalized independently. Within a given vector, each component is divided by the weighted, squared sum of inputs within depth_radius. In detail,
sqr_sum[a, b, c, d] =
sum(input[a, b, c, d - depth_radius : d + depth_radius + 1] ** 2)
output = input / (bias + alpha * sqr_sum) ** beta
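This pseudocode can be sketched in NumPy. The function name lrn_numpy and its default parameter values below are my own illustration (they only approximately mirror tf.nn.lrn's defaults), not part of TensorFlow's API:

```python
import numpy as np

def lrn_numpy(x, depth_radius=5, bias=1.0, alpha=1.0, beta=0.5):
    """Local response normalization over the last (channel) dimension,
    following the pseudocode above: each component is divided by
    (bias + alpha * sum of squares within depth_radius) ** beta."""
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    depth = x.shape[-1]
    for d in range(depth):
        lo = max(0, d - depth_radius)           # window is clipped at the channel edges
        hi = min(depth, d + depth_radius + 1)
        sqr_sum = np.sum(x[..., lo:hi] ** 2, axis=-1)
        out[..., d] = x[..., d] / (bias + alpha * sqr_sum) ** beta
    return out

# A tiny check on one pixel with 4 channels.
x = np.array([[[[1.0, 2.0, 3.0, 4.0]]]])
y = lrn_numpy(x, depth_radius=1, bias=1.0, alpha=1.0, beta=0.5)
print(y[0, 0, 0, 1])  # channel 1's window is channels 0..2: 2 / sqrt(1 + 1**2 + 2**2 + 3**2)
```

Note that the loop runs over the channel (last) dimension only; every other dimension is handled by broadcasting, which matches the docstring's view of the input as a 3-D array of 1-D channel vectors.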
In the AlexNet paper the normalization is written as

b^i_{x,y} = a^i_{x,y} / ( k + α · Σ_{j=max(0, i−n/2)}^{min(N−1, i+n/2)} (a^j_{x,y})² )^β

The formula looks complicated, but it is simple to understand. Here a^i_{x,y} is the output of the i-th kernel at position (x, y) after the ReLU activation, n is the number of adjacent kernel maps considered at the same position, and N is the total number of kernels. The parameters k, n, α, β are all hyperparameters; typical settings are k = 2, n = 5, α = 1e-4, β = 0.75.
In this formula, a represents the output of a convolutional layer (including the convolution and pooling operations). That output is a four-dimensional array [batch, height, width, channel]. Briefly: batch is the number of images in the batch (each entry is one image), height is the image height, width is the image width, and channel is the number of channels, which can be understood as the number of feature maps produced for each image in the batch after convolution (or as the depth of the processed image).
a^i_{x,y} corresponds to one position [a, b, c, d] in this output structure, i.e. a point at a certain height and width under a certain channel of a certain image: the point of image a, at channel d, height b, and width c. In the paper's formula, N is the number of channels (channel).
a, n/2, k, α, β correspond to the function arguments input, depth_radius, bias, alpha, beta respectively, where n/2, k, α, β are user-specified. Pay special attention to the direction of the summation Σ: it runs along the channel direction, i.e. the 3rd axis of a. For each point, the squares are summed over the n/2 channels before it (bounded below by channel 0) and the n/2 channels after it (bounded above by channel d−1), for a total of up to n+1 points. The function's English docstring explains the same thing: the input is treated as d three-dimensional matrices — in plain terms, the number of channels is taken as the number of 3-D matrices, stacked along the channel direction.
import tensorflow as tf
import numpy as np

# TF 1.x-style session API; tf.nn.lrn requires a float input.
x = np.arange(1, 33, dtype=np.float32).reshape([2, 2, 2, 4])
y = tf.nn.lrn(input=x, depth_radius=2, bias=0, alpha=1, beta=1)
with tf.Session() as sess:
    print(x)
    print('#############')
    print(y.eval())
Interpretation of the results :
Pay attention here: if this matrix is viewed as images, the last dimension holds the channel values of each pixel, so for example the values 25, 26, 27, 28 are the four channels of one pixel.
Following the description above, take the value 26 as an example. Its output 0.00923952 is computed as 26 / (0 + 1·(25² + 26² + 27² + 28²))¹ = 26 / 2814 ≈ 0.00923952.
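The arithmetic can be checked directly, without TensorFlow (a quick sketch of the worked example above):

```python
# Denominator for the value 26: squares of all four channels of its pixel,
# since depth_radius = 2 covers channels d-2 .. d+2, clipped here to 0 .. 3.
sqr_sum = 25**2 + 26**2 + 27**2 + 28**2   # 2814
print(26 / (0 + 1 * sqr_sum) ** 1)        # ≈ 0.00923952
```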