BatchNorm2d: principle, effect, and explanation of the BatchNorm2d parameters in PyTorch
2022-06-28 16:46:00 【Full stack programmer webmaster】
Hello everyone, nice to see you again. I'm your friend, Quan Jun.
BN principle and effect:
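In brief, BN normalizes each channel with the statistics of the current mini-batch and then applies a learnable affine transform; this keeps the distribution of the activations stable across batches and makes training faster and easier. In the standard formulation, for a mini-batch $B = \{x_1, \dots, x_m\}$ the computation has four steps:

$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i \qquad\qquad \sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_i - \mu_B\right)^2$$

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}} \qquad\qquad y_i = \gamma\,\hat{x}_i + \beta$$

The fourth step, $y_i = \gamma\,\hat{x}_i + \beta$, is the learnable affine transform; it is this step that the affine parameter discussed below switches on and off.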
Function parameters:
BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

1. num_features: the input is generally of shape batch_size × num_features × height × width, i.e. num_features is the number of features, which is the number of channels of the input fed to the BN layer;
2. eps: a value added to the denominator for numerical stability; the default is 1e-5, which avoids a denominator of 0;
3. momentum: the coefficient used to estimate the running mean and variance during training (my understanding is that it is a stability coefficient, similar to the momentum coefficient in SGD); the default is 0.1;
4. affine: when set to True, the layer is given learnable coefficients gamma and beta.

Generally speaking, models in PyTorch inherit from nn.Module and therefore have a `training` attribute that specifies whether they are in training state. The training state determines whether the behaviour of some layers is fixed, such as BN layers or Dropout layers. Usually model.train() puts the model in training state and model.eval() puts it in test state.

Meanwhile, the BN API has several parameters that deserve attention: affine specifies whether the affine transform is applied, and track_running_stats specifies whether the batch statistics are tracked. Three settings are prone to problems: training, affine, and track_running_stats.

affine specifies whether the affine transform is applied, i.e. whether the fourth of the formulas above is needed. If affine=False, then γ=1 and β=0, and they cannot be learned or updated. It is usually set to affine=True.

training and track_running_stats: track_running_stats=True means that the batch statistics are tracked over the whole training process to obtain the running variance and mean, instead of relying only on the statistics of the batch currently being fed in. Conversely, if track_running_stats=False, only the mean and variance of the current input batch are computed. In the inference stage, if track_running_stats=False and batch_size is relatively small, the statistics will deviate considerably from the global statistics, which may lead to bad results. If track_running_stats of BatchNorm2d is set to False, the test-set results differ from run to run after loading a pretrained model; when it is set to True, the result is the same every time.

running_mean and running_var are obtained from the statistics of the input batches; they are not exactly "learned" parameters, but they are very important for the whole computation. The running_mean and running_var of a BN layer are updated during forward(), not in optimizer.step(), so during training, even if you never call step() manually, BN's statistics will still change.
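To make these settings concrete, the following is a minimal sketch (the 256-channel layer and the random 4×256×8×8 input are just example values) of what each flag creates on the module and how a forward pass touches the running statistics:

```python
import torch
import torch.nn as nn

# Default configuration: learnable gamma/beta plus tracked running statistics.
bn = nn.BatchNorm2d(256, eps=1e-5, momentum=0.1, affine=True, track_running_stats=True)
print(bn.weight.shape, bn.bias.shape)        # gamma and beta: one learnable value per channel
print(bn.running_mean.shape)                 # running statistics, initialized to zeros / ones

# affine=False: gamma and beta are simply not created (fixed to 1 and 0, not learnable).
bn_no_affine = nn.BatchNorm2d(256, affine=False)
print(bn_no_affine.weight is None, bn_no_affine.bias is None)             # True True

# track_running_stats=False: no running buffers, only the current batch statistics are used.
bn_no_stats = nn.BatchNorm2d(256, track_running_stats=False)
print(bn_no_stats.running_mean is None, bn_no_stats.running_var is None)  # True True

# In training mode a single forward pass already updates the running statistics:
# running_mean = (1 - momentum) * running_mean + momentum * batch_mean
x = torch.randn(4, 256, 8, 8)                # batch_size x num_features x height x width
bn.train()
y = bn(x)
print(bn.running_mean[:3])                   # no longer zeros after one forward pass
```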
In an ordinary training loop, the statistics are therefore updated by the forward pass alone:

```python
model.train()                      # put the model in training state
for data, label in self.dataloader:
    pred = model(data)             # this forward pass already updates the BN statistics
                                   # running_mean and running_var inside model
    loss = self.loss(pred, label)
    # even without the following three lines, the BN statistics would already have changed
    opt.zero_grad()
    loss.backward()
    opt.step()
```

When moving to the test phase, calling model.eval() fixes running_mean and running_var. Sometimes, after a pretrained model is loaded and the test data is run again, the results differ slightly from before and a little performance is lost; this is usually caused by a wrong setting of training and track_running_stats.

If two models are trained jointly, then to make convergence easier to control, model_A is pretrained first, and model_A also contains several BN layers. Later, model_A is to be used as an inference model and trained jointly with model_B. At this point we want the BN statistics running_mean and running_var of model_A not to keep changing, so model_A must be put into test mode with model_A.eval(). Otherwise, in training mode, even if the parameters of model_A are never updated, its BN statistics will still change, which leads to results different from what is expected.
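A minimal sketch of that joint-training setup, with throw-away stand-ins for model_A, model_B, the data, and the loss (in practice these are the real pretrained and trainable networks):

```python
import torch
import torch.nn as nn

# Placeholder models for illustration: model_A is the pretrained network containing BN
# layers, model_B is the network being trained jointly with it.
model_A = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU())
model_B = nn.Sequential(nn.Conv2d(16, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))

model_A.eval()                         # keeps model_A's running_mean / running_var fixed
model_B.train()
opt = torch.optim.SGD(model_B.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for _ in range(3):                     # stand-in for iterating over a real dataloader
    data = torch.randn(4, 3, 32, 32)
    label = torch.randint(0, 10, (4,))
    with torch.no_grad():
        feat = model_A(data)           # no BN statistics update because of eval()
    pred = model_B(feat)
    loss = loss_fn(pred, label)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because model_A stays in eval() mode, its BN layers use the stored running_mean and running_var instead of the statistics of the current batch, so the features it produces stay reproducible from run to run.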
Publisher: Full Stack Programmer. Please indicate the source when reprinting: https://javaforall.cn/132951.html. Original link: https://javaforall.cn