Why should model.eval() be added to the pytorch test?
2022-08-01 15:00:00 【passion-ma】
Many machine learning tutorials mention that when you use PyTorch for training and testing, you must put the instantiated model into eval mode before testing. So why does PyTorch need model.eval() at test time, and what does model.eval() actually do? This article explains.
When using PyTorch for training and testing, be sure to set the instantiated model to train or eval mode. In eval mode, the framework fixes the behavior of the BatchNorm (BN) and Dropout layers. BN no longer computes the mean and variance of the current batch but uses the running statistics accumulated during training. Otherwise, if the test batch_size is too small, the BN layer can easily cause severe color distortion in the output images.
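The batch-size sensitivity described above can be illustrated with a small sketch. The tensor shapes, the simulated "training" loop, and the variable names here are assumptions chosen for illustration, not part of the original article. The same sample is placed in two different small batches: in train mode its normalized output depends on the other samples in the batch, while in eval mode it does not.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4)

# Simulate a training phase so the running statistics are populated
# (hypothetical data: mean 1, std 3).
bn.train()
for _ in range(100):
    bn(torch.randn(32, 4) * 3 + 1)

# The same sample placed in two different small batches.
sample = torch.randn(1, 4)
batch_a = torch.cat([sample, torch.randn(3, 4)])
batch_b = torch.cat([sample, torch.randn(3, 4) * 10])

# Train mode: normalization uses the statistics of the current batch,
# so the output for `sample` depends on its batch mates.
bn.train()
train_a = bn(batch_a)[0]
train_b = bn(batch_b)[0]

# Eval mode: normalization uses the stored running statistics,
# so the output for `sample` is the same in both batches.
bn.eval()
eval_a = bn(batch_a)[0]
eval_b = bn(batch_b)[0]

print("train mode outputs match:", torch.allclose(train_a, train_b))
print("eval mode outputs match: ", torch.allclose(eval_a, eval_b))
```

With very small test batches the batch statistics are noisy, which is why leaving a model in train mode at test time can produce unstable outputs.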
The difference between model.eval() and with torch.no_grad()
When validating in PyTorch, model.eval() switches the model to test mode.
Its main purpose is to tell the dropout and batchnorm layers to switch between train and val behavior:
In train mode, the dropout layer randomly zeroes activation units according to the parameter p. Note that in PyTorch's nn.Dropout, p is the probability of dropping a unit (retention probability = 1 - p), and the surviving activations are scaled by 1/(1 - p). The batchnorm layer normalizes with the statistics of the current batch and keeps updating its running mean and var.
In val mode, the dropout layer lets all activation units pass through unchanged, while the batchnorm layer stops computing and updating mean and var and directly uses the mean and var values learned during the training phase.
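The two modes described above can be observed directly on standalone layers. This is a minimal sketch; the input sizes and variable names are assumptions made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Dropout: p is the probability of zeroing an element.
drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()
y_train = drop(x)   # entries are either 0 or 1 / (1 - p) = 2.0

drop.eval()
y_eval = drop(x)    # identity: every unit passes through unchanged

# BatchNorm: running statistics are updated only in train mode.
bn = nn.BatchNorm1d(3)
data = torch.randn(8, 3) * 5 + 10

before = bn.running_mean.clone()

bn.train()
bn(data)
after_train = bn.running_mean.clone()   # moved toward the batch mean

bn.eval()
bn(data)
after_eval = bn.running_mean.clone()    # unchanged by the eval forward pass

print("dropout values in train mode:", y_train.unique().tolist())
print("running_mean changed in train:", not torch.equal(before, after_train))
print("running_mean changed in eval: ", not torch.equal(after_train, after_eval))
```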
This mode does not affect the gradient computation of any layer: autograd still tracks operations and stores intermediate results exactly as in training mode. Backpropagation simply is not performed, because evaluation code normally never calls backward().
with torch.no_grad(), on the other hand, is mainly used to turn off the autograd module in order to speed up computation and save GPU memory. Specifically, it stops gradient tracking, which saves GPU compute and memory, but it does not affect the behavior of the dropout and batchnorm layers.
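The independence of the two mechanisms can be checked with a short sketch. The tiny model and its layer sizes are hypothetical and serve only as an illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model containing a dropout layer.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5), nn.Linear(4, 2))
x = torch.randn(3, 4)

# eval() alone: dropout is disabled, but autograd still tracks the graph.
model.eval()
out_eval = model(x)
tracks_grad_in_eval = out_eval.requires_grad

# no_grad() alone: autograd is off, but dropout is still in train mode.
model.train()
with torch.no_grad():
    out_no_grad = model(x)
tracks_grad_in_no_grad = out_no_grad.requires_grad
dropout_still_training = model[1].training

# Typical inference: combine both for deterministic, graph-free outputs.
model.eval()
with torch.no_grad():
    out_a = model(x)
    out_b = model(x)

print("eval only, requires_grad:   ", tracks_grad_in_eval)
print("no_grad only, requires_grad:", tracks_grad_in_no_grad)
print("no_grad only, dropout active:", dropout_still_training)
print("eval + no_grad deterministic:", torch.equal(out_a, out_b))
```

For inference it is common to use both together: model.eval() for correct layer behavior and torch.no_grad() for lower memory use.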
At first I did not understand why the training and testing functions call model.train() and model.eval() differently. After reviewing the topic, I summarized it as follows.
In general, our training process is as follows:
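1. Train after getting the data. During the training process, use
model.train(): Tell the network that this stage is for training, so dropout and batchnorm use their training behavior while the optimizer updates the parameters.
2. Predict after training is completed. During the prediction process, use
model.eval(): Tell the network that this stage is for testing, so dropout and batchnorm use their inference behavior. model.eval() itself does not freeze the parameters; they stay unchanged in this stage because no backward pass or optimizer step is run.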
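The two-stage process above can be sketched as follows. The model architecture, the random data, the hyperparameters, and the helper names train_one_epoch and evaluate are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical model and random data, for illustration only.
model = nn.Sequential(
    nn.Linear(10, 16), nn.BatchNorm1d(16), nn.ReLU(),
    nn.Dropout(p=0.5), nn.Linear(16, 2),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

train_x, train_y = torch.randn(64, 10), torch.randint(0, 2, (64,))
test_x, test_y = torch.randn(16, 10), torch.randint(0, 2, (16,))

def train_one_epoch():
    model.train()                 # dropout active, BN uses batch statistics
    optimizer.zero_grad()
    loss = loss_fn(model(train_x), train_y)
    loss.backward()
    optimizer.step()              # this is what actually updates the parameters
    return loss.item()

def evaluate():
    model.eval()                  # dropout off, BN uses running statistics
    with torch.no_grad():         # no graph is built, saving memory
        logits = model(test_x)
    return (logits.argmax(dim=1) == test_y).float().mean().item()

for _ in range(3):
    train_one_epoch()
accuracy = evaluate()
print("test accuracy:", accuracy)
```

If training continues after an evaluation pass, model.train() has to be called again, which is why it sits at the top of the training function here.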