当前位置:网站首页>Understanding of batchnorm2d() function in pytorch
Understanding of batchnorm2d() function in pytorch
2022-07-27 10:14:00 【Chen Zhuangshi's programming life】
List of articles
1. brief introduction
Machine learning , Before model training , The data need to be normalized , Make it uniformly distributed . In the process of deep neural network training , Usually a workout is a batch, Not all data . Every batch Having different distributions produces internal covarivate shift problem —— In the process of training , The data distribution will change , It brings difficulties to the learning of the next layer network .Batch Normalization Force the data back to the mean value of 0, The variance of 1 On the Zhengtai distribution of , On the one hand, it makes the data distribution consistent , On the other hand, avoid the disappearance of the gradient .
2. Calculation
As shown in the figure :
Above is input data , Its shape=[5, 3, h, w]
Step1: Calculation Mean value under the same channel , As shown in the red block , Both represent the same channel 
Step2: Calculation Variance under the same channel , As shown in the red block , Both represent the same channel 
Step3: Normalize each data under the current channel 
Among them x Represents a specific point , Such as x = X[0][0][0][0][0] This data point .
Step4: Add zoom and translation variables γ \gamma γ and β \beta β, The normalized value is 
among , ϵ \epsilon ϵ Is a set constant , The default is 1e^-5, Its function is to prevent the elimination of 0. γ \gamma γ and β \beta β These two parameters generally do not need our attention ( If , Parameters affine=true, We need to give ).
3. Pytorch Medium nn.BatchNorm2d() Function interpretation
It mainly requires input 4 Parameters :
(1)num_features: The input data is shape It's usually [batch_size, channel, height, width], num_features Among them channel;
(2)eps: A value added to the denominator , The purpose is to calculate the stability of , Default :1e-5;
(3)momentum: An estimation parameter for the mean and variance in the operation process , The default value is 0.1.
(4)affine: When set to true when , Given the coefficient matrix that can be learned γ \gamma γ and β \beta β
4. Code example :
import torch
data = torch.ones(size=(2, 2, 3, 4))
data[0][0][0][0] = 25
print("data = ", data)
print("\n")
print("========================= Use encapsulated BatchNorm2d() Calculation ================================")
BN = torch.nn.BatchNorm2d(num_features=2, eps=0, momentum=0)
BN_data = BN(data)
print("BN_data = ", BN_data)
print("\n")
print("========================= Calculate by yourself ================================")
x = torch.cat((data[0][0], data[1][0]), dim=1) # 1. Splice the same channel ( That is, treat the same channel as a whole )
x_mean = torch.Tensor.mean(x) # 2. Calculate the average value of ownership of the same channel ( That is, the mean value after splicing )
x_var = torch.Tensor.var(x, False) # 3. Calculate the variance of ownership of the same channel ( That is, the variance after splicing )
# 4. Use the first number to find BatchNorm After the value of
bn_first = ((data[0][0][0][0] - x_mean) / ( torch.pow(x_var, 0.5))) * BN.weight[0] + BN.bias[0]
print("bn_first = ", bn_first)
Running results :
(1) The original data 
(2) Use BatchNorm() function

(3) Calculate the normalized value of the batch by yourself 
The data of the two boxes marked red in the figure are completely equal , End of the flower !!!
notes :
There is reference The article
边栏推荐
- C # set different text watermarks for each page of word
- Decision tree principle and case application - Titanic survival prediction
- 在Centos 7安装Mysql 5.7.27后无法启动?(语言-bash)
- 二叉树习题总结
- Stylegan paper notes + modify code to try 3D point cloud generation
- PCL的ICP配准示例
- Looking for a job for 4 months, interviewing 15 companies and getting 3 offers
- Shell operator, $((expression)) "or" $[expression], expr method, condition judgment, test condition, [condition], comparison between two integers, judgment according to file permission, judgment accor
- Matlab-绘制叠加阶梯图和线图
- Is Damon partgroupdef a custom object?
猜你喜欢

华为交换机双上行组网Smart-link配置指南

备战金九银十Android面试准备(含面试全流程,面试准备工作面试题和资料等)

Why is redis so fast? Redis threading model and redis multithreading

ACL2021最佳论文出炉,来自字节跳动

open3d库的安装,conda常用指令,导入open3d时报这个错误Solving environment: failed with initial frozen solve. Retrying w
![Text processing tool in shell, cut [option parameter] filename Description: the default separator is the built-in variable of tab, awk [option parameter] '/pattern1/{action1}filename and awk](/img/ed/941276a15d1c4ab67d397fb3286022.png)
Text processing tool in shell, cut [option parameter] filename Description: the default separator is the built-in variable of tab, awk [option parameter] '/pattern1/{action1}filename and awk

Snowflake vs. databricks who is better? The latest war report in 2022

Anaconda installation (very detailed)

What happens if the MySQL disk is full? I really met you!

邮件服务器
随机推荐
LeetCode.1260. 二维网格迁移____原地暴力 / 降维+循环数组直接定位
Shell integrated application cases, archiving files, sending messages
去 OPPO 面试,被问麻了
Visual slam lecture notes (I): Lecture 1 + Lecture 2
Engineering survey simulation volume a
Review summary of engineering surveying examination
备战金九银十Android面试准备(含面试全流程,面试准备工作面试题和资料等)
Anaconda installation (very detailed)
When I went to oppo for an interview, I got numb
Metaaploit-后渗透技知识
Brush the title "sword finger offer" day03
How to create a.Net image with diagnostic tools
Failure of CUDA installation nsight visual studio edition failed
PCL的ICP配准示例
直播倒计时 3 天|SOFAChannel#29 基于 P2P 的文件和镜像加速系统 Dragonfly
Shell中的文本处理工具、cut [选项参数] filename 说明:默认分隔符是制表符、awk [选项参数] ‘/pattern1/{action1}filename 、awk 的内置变量
Case of burr (bulge) notch (depression) detection of circular workpiece
Snowflake vs. Databricks谁更胜一筹?2022年最新战报
使用 LSM-Tree 思想基于.NET 6.0 C# 写个 KV 数据库(案例版)
Gbase 8A MPP cluster capacity expansion practice