Why should model.eval() be called when testing in PyTorch?
2022-08-01 15:00:00 【passion-ma】
Many machine learning tutorials mention that when using PyTorch for training and testing, you must put the instantiated model into eval mode before testing. So why do you need to call model.eval() when testing in PyTorch? What does model.eval() actually do? This article explains.
When using PyTorch for training and testing, be sure to set train or eval mode on the instantiated model. After eval() is called, the framework automatically fixes the behavior of the BN (batch normalization) and Dropout layers. BN no longer computes the mean and variance of the current batch but uses the running statistics accumulated during training. Otherwise, if the test batch_size is too small, the BN layer can easily cause severe color distortion in the output images.
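The batch-size effect described above can be checked with a small sketch. It assumes a standalone nn.BatchNorm2d layer and random tensors standing in for images (both are illustrative choices, not taken from the original post). In eval mode the output for one image is the same whether it is processed alone or inside a larger batch; in train mode it is not, because the normalization uses the statistics of whatever batch it is given.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

bn = nn.BatchNorm2d(3)
# Random "images" (hypothetical data): 16 samples, 3 channels, 8x8 pixels.
images = torch.randn(16, 3, 8, 8) * 2.0 + 1.0

# Simulate training so the running statistics move toward the data statistics.
bn.train()
for _ in range(50):
    bn(images)

single = images[:1]  # a "batch" containing only the first image

# eval mode: normalization uses the stored running statistics,
# so the result for one image does not depend on the rest of the batch.
bn.eval()
alone_eval = bn(single)
batch_eval = bn(images)[:1]

# train mode: normalization uses the statistics of the current batch,
# so the same image is normalized differently alone vs. inside a batch.
bn.train()
alone_train = bn(single)
batch_train = bn(images)[:1]

print(torch.allclose(alone_eval, batch_eval, atol=1e-5))
print(torch.allclose(alone_train, batch_train, atol=1e-3))
```

With a batch of one image in train mode, the layer normalizes each channel by that single image's own mean and variance, which is one plausible source of the distortion mentioned above.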
The difference between model.eval() and with torch.no_grad()
When validating in PyTorch, model.eval() is used to switch the model to test mode.
It mainly tells the dropout and batchnorm layers to switch between train and val behavior.
In train mode, the dropout layer zeroes each activation with the configured probability p (so the retention probability is 1 - p) and scales the remaining activations by 1/(1 - p); the batchnorm layer normalizes with the statistics of the current batch and keeps updating its running mean and var.
In val mode, the dropout layer lets all activations pass through unchanged, while the batchnorm layer stops computing and updating mean and var and directly uses the mean and var values accumulated during the training phase.
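The two modes can be observed directly on standalone layers. The following is a minimal sketch, assuming nn.Dropout with p=0.5 and nn.BatchNorm1d with its default momentum of 0.1; the input tensor is made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

drop = nn.Dropout(p=0.5)   # p is the probability of zeroing an element
bn = nn.BatchNorm1d(4)     # running_mean starts at 0, running_var at 1
x = torch.randn(8, 4) + 3.0
ones = torch.ones(8, 4)

# train mode
drop.train()
bn.train()
y_train = drop(ones)       # each element is either 0 or 1 / (1 - p) = 2
bn(x)                      # updates the running statistics
mean_after_train = bn.running_mean.clone()
# With momentum 0.1: running_mean = 0.9 * 0 + 0.1 * batch_mean
expected_mean = 0.1 * x.mean(dim=0)

# eval mode
drop.eval()
bn.eval()
y_eval = drop(ones)        # dropout is the identity function
bn(x)                      # running statistics are left untouched
mean_after_eval = bn.running_mean.clone()

print(y_train.unique())                               # values seen in train mode
print(torch.equal(y_eval, ones))                      # dropout passes input through
print(torch.equal(mean_after_eval, mean_after_train)) # no update in eval mode
```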
This mode does not affect the gradient computation behavior of any layer: autograd still tracks operations and stores what it needs exactly as in train mode. Backpropagation simply is not performed during validation, because backward() is never called.
with torch.no_grad(), on the other hand, is mainly used to turn off the autograd module in order to speed up computation and save GPU memory. Specifically, it stops gradient tracking, which saves GPU compute and memory, but it does not affect the behavior of dropout and batchnorm layers.
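The difference between the two can be seen in a short sketch. The small nn.Sequential model and the random input are hypothetical and only serve as an illustration: eval() changes layer behavior but keeps autograd on, while no_grad() turns autograd off but leaves layers in whatever mode they were in. For inference the two are usually combined.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model containing a dropout layer.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5), nn.Linear(4, 2))
x = torch.randn(3, 4)

# eval() alone: dropout is disabled, but autograd still tracks the output.
model.eval()
out_eval = model(x)
eval_tracks_grad = out_eval.requires_grad            # True

# no_grad() alone: autograd is off, but dropout is still in train mode.
model.train()
with torch.no_grad():
    out_no_grad = model(x)
no_grad_tracks_grad = out_no_grad.requires_grad      # False
dropout_still_training = model[1].training           # True

# Typical inference: combine both.
model.eval()
with torch.no_grad():
    out = model(x)

print(eval_tracks_grad, no_grad_tracks_grad, dropout_still_training)
```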
At first I did not understand why model.train() and model.eval() are used differently in the training and testing functions. After reviewing the topic, I summarized it as follows.
In general, our workflow is as follows:
1. Train after getting the data. During the training process, use
model.train(): Tell the network that this stage is training, so layers such as dropout and batchnorm use their training behavior while the parameters are updated.
2. Predict after training is completed. During the prediction process, use
model.eval(): Tell the network that this stage is testing, so dropout and batchnorm switch to their inference behavior. Note that model.eval() does not freeze the parameters by itself; they stay unchanged in this stage because no backward pass or optimizer step is run.
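The two stages above can be sketched as a pair of functions. This is a minimal illustration under stated assumptions, not the author's code: the model, the SGD optimizer, the MSE loss, the random data, and the names train_one_epoch and evaluate are all hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical model with both a batchnorm and a dropout layer.
model = nn.Sequential(
    nn.Linear(10, 16), nn.BatchNorm1d(16), nn.ReLU(),
    nn.Dropout(p=0.5), nn.Linear(16, 1),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Random stand-in data.
x_train, y_train = torch.randn(32, 10), torch.randn(32, 1)
x_test, y_test = torch.randn(8, 10), torch.randn(8, 1)

def train_one_epoch():
    model.train()                 # dropout active, batchnorm uses batch stats
    optimizer.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()               # gradients are computed here
    optimizer.step()              # parameters are updated here
    return loss.item()

def evaluate():
    model.eval()                  # dropout off, batchnorm uses running stats
    with torch.no_grad():         # no gradient tracking during testing
        return loss_fn(model(x_test), y_test).item()

for epoch in range(3):
    train_loss = train_one_epoch()
    test_loss = evaluate()
    print(epoch, train_loss, test_loss)
```

Calling model.train() at the start of every training step matters because evaluate() leaves the model in eval mode; forgetting to switch back would silently train with dropout disabled and frozen batchnorm statistics.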