LRN local response normalization
2022-06-30 18:08:00 【*Yuanzai】
LRN (local response normalization) is a technique mainly used to improve accuracy in deep-learning training. It is very common in frameworks such as Caffe and TensorFlow. It differs from an activation function: LRN is usually applied after the activation and pooling steps.
AlexNet carried the ideas of LeNet forward, applying the basic principles of CNNs in a much deeper and wider network. The main new techniques used in AlexNet are as follows.
It successfully used ReLU as the activation function of the CNN and verified that ReLU outperforms Sigmoid in deeper networks, solving the vanishing-gradient problem that Sigmoid suffers from in deep networks. Although the ReLU activation function had been proposed long before, it was AlexNet that popularized it.
It used Dropout during training to randomly ignore some neurons and thereby avoid overfitting the model. Although Dropout has a dedicated paper of its own, AlexNet put it to practical use, and its effect was proved in practice. In AlexNet, Dropout is mainly applied to the last few fully connected layers.
It used overlapping max pooling in the CNN. Average pooling had been common in CNNs before; AlexNet used max pooling throughout, avoiding the blurring effect of average pooling. AlexNet also set the stride smaller than the pooling kernel size, so the outputs of the pooling layer overlap, enhancing the richness of the features.
It proposed the LRN layer, which creates a competition mechanism among the activities of local neurons: values with larger responses become relatively larger, while neurons with smaller responses are suppressed, which improves the generalization ability of the model. A minimal sketch of this layer ordering is shown below.
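As a rough illustration of the ordering described above (convolution → ReLU → LRN → overlapping max pooling), here is a minimal TensorFlow 1.x sketch; the shapes and filter sizes are illustrative assumptions rather than the full AlexNet configuration:

import tensorflow as tf

# Illustrative input: a batch of 8 RGB images, 224x224, NHWC layout.
x = tf.placeholder(tf.float32, [8, 224, 224, 3])

# Convolution followed by the ReLU activation.
w = tf.get_variable('w', [11, 11, 3, 96])
conv = tf.nn.conv2d(x, w, strides=[1, 4, 4, 1], padding='SAME')
act = tf.nn.relu(conv)

# LRN applied after the activation, with the hyperparameters from the paper
# (k=2, n=5, alpha=1e-4, beta=0.75; depth_radius = n/2).
norm = tf.nn.lrn(act, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75)

# Overlapping max pooling: 3x3 kernel with stride 2 (stride < kernel size).
pool = tf.nn.max_pool(norm, ksize=[1, 3, 3, 1],
                      strides=[1, 2, 2, 1], padding='VALID')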
The explanation of the function below is quoted from the official TensorFlow documentation:
https://www.tensorflow.org/api_docs/python/tf/nn/local_response_normalization
The 4-D input tensor is treated as a 3-D array of 1-D vectors (along the last dimension), and each vector is normalized independently. Within a given vector, each component is divided by the weighted, squared sum of inputs within depth_radius. In detail,
sqr_sum[a, b, c, d] =
    sum(input[a, b, c, d - depth_radius : d + depth_radius + 1] ** 2)
output = input / (bias + alpha * sqr_sum) ** beta
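In mathematical form, this is the LRN formula from the AlexNet paper (Krizhevsky et al., 2012):

$$
b^i_{x,y} = a^i_{x,y} \Bigg/ \left( k + \alpha \sum_{j=\max(0,\, i-n/2)}^{\min(N-1,\, i+n/2)} \left( a^j_{x,y} \right)^2 \right)^{\beta}
$$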

The formula looks complicated, but it is easy to understand. $a^i_{x,y}$ is the output of the $i$-th kernel at position $(x, y)$ after applying the ReLU activation function, $n$ is the number of adjacent kernel maps at the same position included in the sum, and $N$ is the total number of kernels. The parameters $k$, $n$, $\alpha$, $\beta$ are all hyperparameters; typical settings are $k = 2$, $n = 5$, $\alpha = 10^{-4}$, $\beta = 0.75$.


In this formula, $a$ represents the output of a convolutional layer (including the convolution operation and pooling operation). The output is a four-dimensional array [batch, height, width, channel]. Briefly: batch is the batch dimension (one image per index), height is the height of the image, width is the width of the image, and channel is the number of channels, which can be understood as the number of feature maps produced for one image of the batch after the convolution operation (or, equivalently, as the depth of the processed image).
$a^i_{x,y}$ corresponds to one position [a, b, c, d] in this output structure, i.e. a point at a certain height and width under a certain channel of a certain image: the point at height b and width c in the d-th channel of the a-th image. In the paper's formula, $N$ denotes the number of channels (channel).
$a$, $n/2$, $k$, $\alpha$, $\beta$ correspond to the function arguments input, depth_radius, bias, alpha, beta respectively, where n/2, k, α, β are user-defined. Pay special attention to the direction of the summation Σ: it runs along the channel direction, i.e. the sum of squared values is taken along the third axis (channel) of $a$. For a point in a given channel, the window covers the n/2 channels before it (bounded below by channel 0) and the n/2 channels after it (bounded above by channel N-1), up to n+1 points in total including the point itself. The English docstring of the function explains the same idea: the input is treated as a stack of matrices along the channel axis; in plain terms, the number of channels of the input gives the number of 3-dimensional slices, and they are stacked along the channel direction.
import tensorflow as tf
import numpy as np

# A 2x2x2x4 tensor holding the values 1..32 ([batch, height, width, channel]).
# tf.nn.lrn requires a floating-point input, so build the array as float32.
x = np.arange(1, 33, dtype=np.float32).reshape([2, 2, 2, 4])
# depth_radius=2 gives a window of up to 5 channels; with only 4 channels here,
# every window covers the entire channel axis.
y = tf.nn.lrn(input=x, depth_radius=2, bias=0, alpha=1, beta=1)
with tf.Session() as sess:
    print(x)
    print('#############')
    print(y.eval())
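To make the summation direction concrete, here is a small pure-NumPy re-implementation of the same computation, following the pseudocode quoted from the TensorFlow documentation above (an illustrative sketch, not TensorFlow's actual kernel):

import numpy as np

def lrn_numpy(x, depth_radius=2, bias=0.0, alpha=1.0, beta=1.0):
    # LRN over the last (channel) axis, mirroring the tf.nn.lrn pseudocode.
    out = np.empty_like(x, dtype=np.float64)
    channels = x.shape[-1]
    for d in range(channels):
        lo = max(0, d - depth_radius)
        hi = min(channels, d + depth_radius + 1)
        # Sum of squares over the channel window [lo, hi) at every position.
        sqr_sum = np.sum(x[..., lo:hi].astype(np.float64) ** 2, axis=-1)
        out[..., d] = x[..., d] / (bias + alpha * sqr_sum) ** beta
    return out

x = np.arange(1, 33, dtype=np.float32).reshape([2, 2, 2, 4])
print(lrn_numpy(x))  # matches the tf.nn.lrn output above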

Interpretation of the results :
Pay attention here: if you view this array as images, each group of four consecutive values is the set of four channel values at one spatial position of one image; for example, 25, 26, 27, 28 are the channel values at height 1, width 0 of the second image.
Following the description above, take 26 as an example: its output 0.00923952 is computed as 26 / (0 + 1 * (25^2 + 26^2 + 27^2 + 28^2))^1 = 26 / 2814 ≈ 0.00923952.
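This arithmetic is easy to check directly:

# With depth_radius=2, the window around 26 covers all four channel values.
print(26 / (25**2 + 26**2 + 27**2 + 28**2))  # 0.009239516...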