Why should model.eval() be added to the pytorch test?
2022-08-01 15:00:00 【passion-ma】
Many machine learning tutorials mention that when training and testing with PyTorch, you must put the instantiated model into eval mode before testing. So why is model.eval() needed at test time, and what does it actually do? This article explains.
When using PyTorch for training and testing, be sure to set the instantiated model to train or eval mode explicitly. After model.eval(), the framework fixes the BN (batch normalization) and Dropout layers: BN no longer computes the mean and variance of the current batch but uses the values accumulated during training. Otherwise, if the test batch_size is too small, the BN layer can easily cause severe color distortion in generated images.
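The small-batch problem can be illustrated with a minimal sketch. The tensors and the layer size here are made up for illustration and are not from the original article. In train mode the output for a given sample depends on whatever else is in the batch; in eval mode it does not.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4)

# Stand-in for training: a few forward passes in train mode
# so that the running mean/var hold something meaningful.
bn.train()
for _ in range(20):
    bn(torch.randn(32, 4) * 2 + 3)

# The same sample placed in two different small batches (hypothetical data).
sample = torch.tensor([[1.0, 2.0, 3.0, 4.0]])
batch_a = torch.cat([sample, torch.zeros(3, 4)])
batch_b = torch.cat([sample, torch.full((3, 4), 10.0)])

# Train mode: normalization uses the statistics of the current batch,
# so the result for `sample` changes with its batch companions.
bn.train()
train_a = bn(batch_a)[0]
train_b = bn(batch_b)[0]
train_outputs_match = torch.allclose(train_a, train_b)

# Eval mode: normalization uses the stored running statistics,
# so the result for `sample` is the same in both batches.
bn.eval()
eval_a = bn(batch_a)[0]
eval_b = bn(batch_b)[0]
eval_outputs_match = torch.allclose(eval_a, eval_b)

print(train_outputs_match, eval_outputs_match)
```

With very small test batches the batch statistics are noisy, which is why leaving BN in train mode at test time can distort the output.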
The difference between model.eval() and with torch.no_grad()
When validating in PyTorch, model.eval() is used to switch the model to test mode. Its main purpose is to tell the dropout and batchnorm layers to switch from train mode to val mode.
In train mode, the dropout layer zeroes each activation with probability p (in nn.Dropout, p is the drop probability, so the retention probability is 1 - p) and scales the remaining activations by 1/(1 - p); the batchnorm layer normalizes with the statistics of the current batch and keeps computing and updating its running mean and var.
In val mode, the dropout layer lets all activations pass through, while the batchnorm layer stops computing and updating mean and var and directly uses the mean and var values learned during the training phase.
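The two modes can be checked directly on standalone layers. This is a small sketch with arbitrary layer sizes and input data chosen only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# --- Dropout ---
drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()
y_train = drop(x)  # each element is either 0 or scaled by 1 / (1 - p) = 2.0

drop.eval()
y_eval = drop(x)   # identity: every activation passes through unchanged

# --- BatchNorm ---
bn = nn.BatchNorm1d(3)
data = torch.randn(8, 3) * 5 + 10

bn.train()
mean_before = bn.running_mean.clone()
bn(data)                                  # train mode updates the running stats
mean_after_train = bn.running_mean.clone()

bn.eval()
bn(data)                                  # eval mode leaves them untouched
mean_after_eval = bn.running_mean.clone()

print(y_train.unique(), torch.equal(y_eval, x))
print(mean_before, mean_after_train, mean_after_eval)
```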
This mode does not affect the gradient computation of any layer: gradients are tracked and intermediate results are stored just as in train mode; backpropagation is simply not performed during validation.
with torch.no_grad(), by contrast, is mainly used to switch off the autograd engine in order to speed things up and save GPU memory. Specifically, it stops gradient tracking, which saves GPU compute and memory, but it does not affect the behavior of the dropout and batchnorm layers.
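Because the two mechanisms are independent, they are commonly combined at test time. The sketch below uses a hypothetical toy model to show what each one does and does not change.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model, used only to inspect the flags.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5), nn.Linear(4, 2))
x = torch.randn(3, 4)

# eval() alone: layer behavior changes, but autograd still tracks the output.
model.eval()
out_eval = model(x)
eval_tracks_grad = out_eval.requires_grad

# no_grad() alone: autograd is off, but dropout is still in train mode.
model.train()
with torch.no_grad():
    out_no_grad = model(x)
no_grad_tracks_grad = out_no_grad.requires_grad
dropout_still_training = model[1].training

# Both together: deterministic layer behavior and no graph is built.
model.eval()
with torch.no_grad():
    out_a = model(x)
    out_b = model(x)

print(eval_tracks_grad, no_grad_tracks_grad, dropout_still_training)
```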
At first I did not understand why the training and testing functions differ in calling model.train() versus model.eval(). After looking into it, I summarize the point as follows.
In general, our training process is as follows:
1. Train after getting the data. During training, call model.train(): this tells the network that this stage is training, so dropout and batchnorm use their training behavior while the parameters are updated.
2. Predict after training is complete. During prediction, call model.eval(): this tells the network that this stage is testing. The parameters of the model are not updated in this stage (no optimizer step is run; model.eval() itself does not freeze them), and dropout and batchnorm switch to their inference behavior.
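The two steps above can be sketched as a minimal loop. The model, the random data, and the hyperparameters are placeholders for illustration, not the author's setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder model and random data, for illustration only.
model = nn.Sequential(
    nn.Linear(10, 16),
    nn.BatchNorm1d(16),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(16, 2),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

x_train = torch.randn(64, 10)
y_train = torch.randint(0, 2, (64,))
x_test = torch.randn(1, 10)  # a single test sample

# 1. Training: train mode, gradients on, optimizer updates the parameters.
model.train()
for _ in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()
    optimizer.step()

# 2. Prediction: eval mode plus no_grad; no optimizer step is taken.
model.eval()
weights_before = model[0].weight.detach().clone()
with torch.no_grad():
    pred = model(x_test).argmax(dim=1)
weights_after = model[0].weight.detach().clone()

print(pred, torch.equal(weights_before, weights_after))
```

Note that a batch of one sample works here only because the model is in eval mode; BatchNorm1d in train mode cannot compute batch statistics from a single value per channel and raises an error.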