Understanding the BatchNorm2d() function in PyTorch
2022-07-27 10:14:00 【Chen Zhuangshi's programming life】
1. Brief introduction
In machine learning, data are usually normalized before training so that the inputs share a consistent distribution. When training a deep neural network, each training step typically processes one batch rather than the whole dataset. Because each batch has a different distribution, this gives rise to the internal covariate shift problem: during training, the distribution of the data fed to each layer keeps changing, which makes learning harder for the next layer. Batch Normalization forces the data back toward a standardized distribution with mean 0 and variance 1. On the one hand this keeps the data distribution consistent; on the other hand it helps avoid vanishing gradients.
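As a quick check of this claim, the minimal sketch below feeds a batch with a deliberately shifted and scaled distribution through `torch.nn.BatchNorm2d` in training mode and inspects the per-channel statistics of the output. The input shape, the seed, and the variable names are assumptions chosen only for illustration.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for repeatability

# Hypothetical batch: 8 samples, 3 channels, 16x16, mean around 5 and std around 3
x = torch.randn(8, 3, 16, 16) * 3.0 + 5.0

bn = torch.nn.BatchNorm2d(num_features=3)  # a new module starts in training mode
y = bn(x)

# Statistics are taken per channel, i.e. over the batch and spatial dimensions
out_mean = y.mean(dim=(0, 2, 3))
out_var = y.var(dim=(0, 2, 3), unbiased=False)
print("per-channel mean:", out_mean)
print("per-channel variance:", out_var)
```

Each channel of the output should have a mean very close to 0 and a variance very close to 1, regardless of the statistics of the input.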
2. Calculation
Consider input data with shape = [5, 3, h, w], i.e. a batch of 5 samples, each with 3 channels of size h × w. Values at the same channel index in different samples belong to the same channel.
Step1: Compute the mean of each channel over the whole batch: $\mu_c = \frac{1}{N h w} \sum_{n,i,j} X[n][c][i][j]$, where $N = 5$ is the batch size.
Step2: Compute the variance of each channel over the whole batch: $\sigma_c^2 = \frac{1}{N h w} \sum_{n,i,j} (X[n][c][i][j] - \mu_c)^2$.
Step3: Normalize every value in the current channel: $\hat{x} = \frac{x - \mu_c}{\sqrt{\sigma_c^2 + \epsilon}}$.
Here x denotes one specific value, such as x = X[0][0][0][0].
Step4: Apply the scale and shift parameters $\gamma$ and $\beta$. The final output value is $y = \gamma \hat{x} + \beta$.
Here $\epsilon$ is a fixed constant, 1e-5 by default, whose role is to prevent division by zero. The parameters $\gamma$ and $\beta$ usually need no special attention: when affine=True, they are learnable parameters that the module creates and updates during training, so we do not have to supply them.
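The four steps can be sketched for a whole tensor at once. The helper name `batch_norm_2d_manual`, the concrete sizes h = w = 4, and the choice of gamma = 1 and beta = 0 are assumptions made for illustration; this is a sketch of the training-mode computation, not PyTorch's internal implementation.

```python
import torch

def batch_norm_2d_manual(x, gamma, beta, eps=1e-5):
    """Hypothetical helper: training-mode batch norm on an [N, C, H, W] tensor."""
    # Step 1: mean of each channel over the batch and spatial dimensions
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    # Step 2: biased (population) variance of each channel
    var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
    # Step 3: normalize every value with its channel's statistics
    x_hat = (x - mean) / torch.sqrt(var + eps)
    # Step 4: scale and shift, one gamma and one beta per channel
    return gamma.view(1, -1, 1, 1) * x_hat + beta.view(1, -1, 1, 1)

torch.manual_seed(0)
x = torch.randn(5, 3, 4, 4)   # shape [5, 3, h, w] with h = w = 4 (assumed)
gamma = torch.ones(3)          # assumed scale
beta = torch.zeros(3)          # assumed shift

manual = batch_norm_2d_manual(x, gamma, beta)
reference = torch.nn.BatchNorm2d(num_features=3)(x)
print(torch.allclose(manual, reference, atol=1e-5))
```

Reducing over dimensions (0, 2, 3) is what "the same channel across the batch" means in practice: only the channel dimension is kept.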
3. The nn.BatchNorm2d() function in PyTorch
It takes four main parameters:
(1) num_features: The input usually has shape [batch_size, channel, height, width]; num_features is the number of channels.
(2) eps: A value added to the denominator for numerical stability. Default: 1e-5.
(3) momentum: The factor used to update the running estimates of the mean and variance during training. Default: 0.1.
(4) affine: When set to True, the module has learnable per-channel parameters $\gamma$ and $\beta$.
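A small sketch of how `momentum` and `affine` show up on the module. The constant input tensor is an assumption chosen so that the running-mean update is easy to follow by eye.

```python
import torch

x = torch.full((4, 2, 3, 3), 10.0)   # every value is 10, so each channel mean is 10

# momentum: running = (1 - momentum) * running + momentum * batch_statistic
bn = torch.nn.BatchNorm2d(num_features=2, momentum=0.1)
print(bn.running_mean)                # starts at zeros
bn(x)                                 # one forward pass in training mode
print(bn.running_mean)                # moved 10% of the way from 0 toward 10

# affine=True (the default): learnable weight (gamma) and bias (beta), one per channel
print(bn.weight.shape, bn.bias.shape)

# affine=False: no learnable parameters are created
bn_plain = torch.nn.BatchNorm2d(num_features=2, affine=False)
print(bn_plain.weight, bn_plain.bias)
```

The running statistics are the ones used in evaluation mode (after calling `eval()`), while training mode normalizes with the statistics of the current batch.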
4. Code example:
import torch

# Two samples, two channels, 3x4 feature maps; all ones except a single outlier
data = torch.ones(size=(2, 2, 3, 4))
data[0][0][0][0] = 25
print("data = ", data)
print("\n")

print("========================= Using the built-in BatchNorm2d() ================================")
# eps=0 so that the output matches the manual formula below exactly;
# momentum=0 leaves the running statistics unchanged
BN = torch.nn.BatchNorm2d(num_features=2, eps=0, momentum=0)
BN_data = BN(data)
print("BN_data = ", BN_data)
print("\n")

print("========================= Manual calculation ================================")
# 1. Concatenate channel 0 of both samples (treat the channel as a whole)
x = torch.cat((data[0][0], data[1][0]), dim=1)
# 2. Mean over all values of that channel
x_mean = x.mean()
# 3. Biased (population) variance over all values of that channel
x_var = x.var(unbiased=False)
# 4. Normalize the first element, then apply the affine parameters
bn_first = ((data[0][0][0][0] - x_mean) / torch.sqrt(x_var)) * BN.weight[0] + BN.bias[0]
print("bn_first = ", bn_first)
Running results:
(1) The original data: the printed data tensor.
(2) Using BatchNorm2d(): the printed BN_data tensor.
(3) Manual calculation: the printed bn_first value.
The first element of BN_data and the manually computed bn_first are exactly equal. That's a wrap!
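The value of the first element can also be derived by hand from the example data. Channel 0 holds 2 × 3 × 4 = 24 values: 23 ones and a single 25. The worked steps below take $\epsilon = 0$ as in the code, and assume $\gamma = 1$ and $\beta = 0$ (the values a freshly constructed BatchNorm2d holds in recent PyTorch versions; treat this as an assumption for other versions).

```latex
\begin{aligned}
\mu_0 &= \frac{23 \cdot 1 + 25}{24} = 2 \\
\sigma_0^2 &= \frac{23 \cdot (1 - 2)^2 + (25 - 2)^2}{24} = \frac{552}{24} = 23 \\
\hat{x} &= \frac{25 - 2}{\sqrt{23 + 0}} = \sqrt{23} \approx 4.7958 \\
y &= 1 \cdot \hat{x} + 0 \approx 4.7958
\end{aligned}
```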
Note:
This post draws on another article as a reference.