Exploring the nn.BatchNorm2d principle by experiment
2022-07-04 11:42:00 [Andy Dennis]
Preface
This morning a classmate asked me about the principle behind batch norm. I had only ever been at the stage of using torch.nn.BatchNorm2d: I knew it normalizes over batches along the channel dimension, but I was not clear on the concrete implementation, so I ran an experiment. First let's look at a torch example, then write a handwritten version of the calculation.
Every element $y_i$ after batch norm can be written simply as

$$y_i = \frac{x_i - \bar{x}}{\sqrt{\sigma^{2} + \epsilon}}$$

where $x_i$ is an input element, $\bar{x}$ is the mean over the channel dimension, $\sigma$ is the standard deviation over the channel dimension, and $\epsilon$ is a small constant factor (a bit like Laplacian smoothing, to prevent the denominator from being 0? but I'm not sure). It defaults to $10^{-5}$, a very small number.
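The formula can be checked with a tiny worked example (my own addition, using an arbitrary set of values for one channel):

```python
import math

# Normalize the values of a single channel with the BN formula.
xs = [1.0, 1.0, 1.0, 2.0]
mean = sum(xs) / len(xs)                          # 1.25
var = sum((v - mean) ** 2 for v in xs) / len(xs)  # biased variance, 0.1875
eps = 1e-5
ys = [(v - mean) / math.sqrt(var + eps) for v in xs]
print(ys)  # roughly [-0.577, -0.577, -0.577, 1.732]
```

Note that the variance here divides by N, not N - 1; this distinction comes up again below.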
The torch way
# encoding:utf-8
import torch
import torch.nn as nn

input = torch.tensor([[[[1, 1],
                        [1, 2]],
                       [[-1, 1],
                        [0, 1]]],
                      [[[0, -1],
                        [2, 2]],
                       [[0, -1],
                        [3, 1]]]]).float()

# num_features: C from an expected input of size batch_size * C * H * W
# eps: value added to the denominator for numerical stability, default 1e-5
# momentum: momentum parameter used for the running_mean and running_var
#           computation, default 0.1
# affine=True means weight and bias are used; there is no backpropagation
# in this example, so it makes no difference whether it is set or not
m = nn.BatchNorm2d(2, affine=False)
output = m(input)
# print('input:\n', input)
print('m.weight:\n', m.weight)
print('m.bias:', m.bias)
print('output:\n', output)
print('output:', output.size())
The point here is that it does not matter whether affine is set to True; in this example the results are the same. Looking at the weights shows why: weight is all ones, so multiplying by it returns the original number, and bias is all zeros, so adding it also returns the original number. Since there is no backpropagation, the weights never change.
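This claim is easy to verify directly (a small check of my own, not from the original post): a freshly constructed BatchNorm2d with affine=True has weight initialized to ones and bias to zeros, so before any training step its output matches the affine=False version.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 2, 2, 2)

m_no_affine = nn.BatchNorm2d(2, affine=False)
m_affine = nn.BatchNorm2d(2, affine=True)

print(m_affine.weight)  # Parameter of all ones
print(m_affine.bias)    # Parameter of all zeros
print(torch.allclose(m_no_affine(x), m_affine(x)))  # True
```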
Handwritten version
We flatten every dimension except the channel dimension, compute the mean and variance per channel, and then use them to normalize the original data.
# encoding:utf-8
import torch
import torch.nn as nn

input = torch.tensor([[[[1, 1],
                        [1, 2]],
                       [[-1, 1],
                        [0, 1]]],
                      [[[0, -1],
                        [2, 2]],
                       [[0, -1],
                        [3, 1]]]]).float()

# [B, C, H, W]
N, c_num, h, w = input.shape
print(input.shape)

x = input.transpose(0, 1).flatten(1)  # [C, B*H*W]
# print(x)
c_mean = x.mean(dim=1)
print('c_mean:', c_mean)
# standard deviation formula: torch divides by N-1, numpy by N
c_std = torch.tensor(x.numpy().std(axis=1))
print('c_std^2:', c_std ** 2)

# expand the dimensions and repeat the elements,
# convenient for the batch operation below
c_mean = c_mean.reshape(1, 2, 1, 1).repeat(N, 1, h, w)
c_std = c_std.reshape(1, 2, 1, 1).repeat(N, 1, h, w)
# print(c_mean)
# print(c_std)

eps = 1e-5
output = (input - c_mean) / (c_std ** 2 + eps) ** 0.5
print(output)
One small caveat: pytorch and numpy compute the standard deviation with different formulas, which is why I switched to numpy for this step. In fact pytorch can match it too: passing unbiased=False to torch.std (or Tensor.std) switches it to the same divide-by-N formula numpy uses.
numpy:

$$std = \sqrt{\frac{1}{N}\sum^{N}_{i=1}(x_i-\bar{x})^2}$$

torch:

$$std = \sqrt{\frac{1}{N-1}\sum^{N}_{i=1}(x_i-\bar{x})^2}$$
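The two conventions are easy to see side by side (my own addition): torch.std defaults to the unbiased estimator (divide by N-1), numpy.std to the biased one (divide by N), and unbiased=False makes torch reproduce numpy's result.

```python
import torch
import numpy as np

a = torch.tensor([1., 2., 3., 4.])

print(a.std())                # unbiased: divide by N-1
print(a.std(unbiased=False))  # biased: divide by N
print(np.std(a.numpy()))      # matches the biased torch result
```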