BatchNorm2d: principle, purpose, and an explanation of the BatchNorm2d function parameters in PyTorch
2022-06-28 16:46:00 【Full stack programmer webmaster】
Hello everyone, nice to see you again. I'm your friend Quan Jun.
BN principle and effect:
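For reference, batch normalization is usually written as four steps applied per channel over a mini-batch of m values (for BatchNorm2d, m = batch_size * height * width). The notation below is the standard one from the batch normalization literature and is assumed here rather than taken from this article:

```latex
\begin{aligned}
\mu_B &= \frac{1}{m}\sum_{i=1}^{m} x_i \\
\sigma_B^2 &= \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2 \\
\hat{x}_i &= \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}} \\
y_i &= \gamma \hat{x}_i + \beta
\end{aligned}
```

The first two steps compute the batch mean and variance, the third normalizes, and the fourth is the learnable scale and shift.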
Function parameters:
BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
1. num_features: the input usually has shape batch_size * num_features * height * width. num_features is the number of features, that is, the number of channels entering the BN layer.
2. eps: a value added to the denominator for numerical stability, so that the denominator is never 0. The default is 1e-5.
3. momentum: the coefficient used to estimate the running mean and variance during the forward pass (my understanding is that it is a smoothing coefficient, loosely similar to the momentum coefficient in SGD). In PyTorch the update is running = (1 - momentum) * running + momentum * batch_statistic, so a larger momentum gives more weight to the current batch.
4. affine: when set to True, the layer gets the learnable coefficients gamma and beta.
In general, models in PyTorch inherit from the nn.Module class, which has a training attribute that specifies whether the module is in training mode. The training state affects whether the parameters of some layers, such as BN or Dropout layers, stay fixed. Usually model.train() puts the model in training mode and model.eval() puts it in test mode. The BN API also has two parameters to watch: affine, which specifies whether the affine transform is applied, and track_running_stats, which specifies whether the statistics of the batches are tracked. These three settings are where problems tend to arise: training, affine, and track_running_stats.
affine specifies whether the affine transform is applied, that is, whether the fourth step of the BN formula (y = γ * x̂ + β) is needed. If affine=False, then γ=1 and β=0, and neither can be learned or updated. It is usually set to affine=True.
As for training and track_running_stats: track_running_stats=True means the layer tracks the batch statistics over the whole training process to obtain the mean and variance, instead of relying only on the statistics of the current input batch.
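The parameters described above can be checked directly on a layer instance. The following is a minimal sketch, assuming a small random input; the tensor shape and the variable names (x, bn, bn_plain, bn_nostats) are illustrative and not from the original article.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Input laid out as (batch_size, num_features, height, width)
x = torch.randn(8, 4, 5, 5) * 3.0 + 2.0

bn = nn.BatchNorm2d(4, eps=1e-5, momentum=0.1, affine=True, track_running_stats=True)

# affine=True registers a learnable gamma (weight) and beta (bias) per channel
print(bn.weight.shape, bn.bias.shape)

# One forward pass in training mode normalizes with the batch statistics
# and updates the running statistics as a side effect
bn.train()
y = bn(x)

# running = (1 - momentum) * running + momentum * batch_statistic,
# starting from running_mean = 0 and running_var = 1
batch_mean = x.mean(dim=(0, 2, 3))
batch_var = x.var(dim=(0, 2, 3), unbiased=True)  # running_var uses the unbiased estimate
expected_mean = 0.9 * torch.zeros(4) + 0.1 * batch_mean
expected_var = 0.9 * torch.ones(4) + 0.1 * batch_var
print(torch.allclose(bn.running_mean, expected_mean, atol=1e-5))
print(torch.allclose(bn.running_var, expected_var, atol=1e-5))

# affine=False: there is no learnable gamma/beta at all
bn_plain = nn.BatchNorm2d(4, affine=False)
print(bn_plain.weight is None, bn_plain.bias is None)

# track_running_stats=False: the running buffers are not created
bn_nostats = nn.BatchNorm2d(4, track_running_stats=False)
print(bn_nostats.running_mean is None, bn_nostats.running_var is None)
```

Comparing the buffers against the hand-computed values makes it easier to see that momentum here weights the new batch, not the history.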
Conversely, if track_running_stats=False, the layer only computes the mean and variance of the current input batch. At inference time, if track_running_stats=False and batch_size is small, the batch statistics can deviate greatly from the global statistics, which may lead to poor results. If track_running_stats is set to False in BatchNorm2d, the results on the test set differ from run to run after loading a pre-trained model, because they depend on how the test data happens to be batched. With track_running_stats=True, the result is the same every time.
running_mean and running_var are computed from the statistics of the input batches. They are not strictly "learned" parameters, but they are very important to the whole computation. The BN layer updates running_mean and running_var during the forward pass, not in optimizer.step(), so in training mode the BN statistics change even if step() is never called.
    # Fragment of a training method: model, opt, self.dataloader and self.loss are defined elsewhere
    model.train()  # training mode
    for data, label in self.dataloader:
        pred = model(data)  # this forward pass updates the BN statistics running_mean and running_var in model
        loss = self.loss(pred, label)
        # even without the following three lines, the BN statistics still change
        opt.zero_grad()
        loss.backward()
        opt.step()
At this point, switching to the test phase with model.eval() fixes running_mean and running_var. Sometimes a pre-trained model is loaded and, when the test data is run again, the results differ and there is a slight loss of performance. This is almost always caused by wrong training or track_running_stats settings.
Suppose two models are trained jointly. To make convergence easier to control, model_A, which contains several BN layers, is pre-trained first. Later, model_A is used as an inference model and trained jointly with model_B. At that point we want the BN statistics running_mean and running_var in model_A to stay fixed, so we need to call model_A.eval() to put it in test mode. Otherwise, in training mode, the BN statistics change even if the model's parameters are not updated, which leads to results different from what was expected.
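The behavior described above can be sketched in a few lines. This is an illustrative example under stated assumptions, not the author's code: model_a and model_b are tiny hypothetical stand-ins for the pre-trained model and the model being trained, and the data is random.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins: model_a plays the pre-trained model with a BN layer,
# model_b the model that is actually being trained
model_a = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
model_b = nn.Sequential(nn.Conv2d(8, 2, 3, padding=1))
bn = model_a[1]
data = torch.randn(4, 3, 16, 16)

# 1) Training mode: a forward pass alone changes running_mean,
#    with no backward pass and no optimizer step
model_a.train()
before = bn.running_mean.clone()
with torch.no_grad():
    model_a(data)
changed_in_train = not torch.equal(before, bn.running_mean)

# 2) Eval mode: the running statistics stay fixed while model_b trains
model_a.eval()
for p in model_a.parameters():
    p.requires_grad_(False)  # also keep model_a's weights out of the update
frozen = bn.running_mean.clone()

model_b.train()
opt = torch.optim.SGD(model_b.parameters(), lr=0.01)
for _ in range(3):
    features = model_a(data)          # BN uses the stored statistics here
    loss = model_b(features).mean()   # placeholder loss for the sketch
    opt.zero_grad()
    loss.backward()
    opt.step()
unchanged_in_eval = torch.equal(frozen, bn.running_mean)

# 3) track_running_stats=False: even in eval mode the layer normalizes with
#    the current batch, so a sample's output depends on what it is batched with
bn_nostats = nn.BatchNorm2d(3, track_running_stats=False).eval()
sample = torch.randn(1, 3, 4, 4)
out_a = bn_nostats(torch.cat([sample, torch.randn(3, 3, 4, 4)]))[0]
out_b = bn_nostats(torch.cat([sample, torch.randn(3, 3, 4, 4) * 5]))[0]
batch_dependent = not torch.allclose(out_a, out_b)

print(changed_in_train, unchanged_in_eval, batch_dependent)
```

One thing to watch in a real joint-training setup: if model_a is a submodule of a larger module, calling train() on the parent switches model_a back to training mode, so eval() has to be called on it again afterwards.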
Publisher: Full stack programmer webmaster. When reprinting, please credit the source: https://javaforall.cn/132951.html Link to the original text: https://javaforall.cn