PyTorch notes: validation, model.eval() vs. torch.no_grad()
2022-06-30 10:44:00 【UQI-LIUWJ】
1 The general framework of validation
In the code below, the model is model and the optimizer is optimizer.
import numpy as np
import torch

# Assumes model, optimizer, criterion, train_loader, val_loader, epochs and save_path are already defined
min_val_loss = np.inf
for epoch in range(1, epochs + 1):
    ############################ Start of training section #############################
    model.train()
    train_losses = []
    for (batch_x, batch_y) in train_loader:
        output = model(batch_x)
        loss = criterion(output, batch_y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # The usual three PyTorch steps: zero_grad, backward, step
        train_losses.append(loss.item())
    ############################ End of training section #############################
    ############################ Start of validation section #############################
    model.eval()
    val_losses = []  # reset every epoch so the mean covers this epoch only
    for (batch_x, batch_y) in val_loader:
        with torch.no_grad():
            output = model(batch_x)
            loss = criterion(output, batch_y)
            val_losses.append(loss.item())
    val_loss = np.mean(val_losses)
    if val_loss < min_val_loss:
        min_val_loss = val_loss
        torch.save(model.state_dict(), save_path)
        # Save the best model
    ############################ End of validation section #############################
At test time, you can load the parameters of the best model (model.load_state_dict) and run the test with them.
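The test-time step can be sketched as follows. This is a minimal illustration, not the author's code: build_model, the temporary save_path, and the random test batch are all hypothetical stand-ins, and the "trained" model is simply a freshly initialized one.

```python
import os
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)


def build_model():
    # Hypothetical architecture; any nn.Module is handled the same way.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))


# Stand-in for the best model saved during validation.
trained = build_model()
save_path = os.path.join(tempfile.mkdtemp(), "best_model.pt")  # illustrative path
torch.save(trained.state_dict(), save_path)

# At test time: rebuild the same architecture, then load the saved parameters.
test_model = build_model()
test_model.load_state_dict(torch.load(save_path))
test_model.eval()

criterion = nn.MSELoss()
test_x, test_y = torch.randn(16, 4), torch.randn(16, 1)  # dummy test batch
with torch.no_grad():
    test_loss = criterion(test_model(test_x), test_y).item()
    # The reloaded model reproduces the saved model's outputs exactly.
    same_output = torch.equal(test_model(test_x), trained.eval()(test_x))
```

Note that load_state_dict only restores parameters and buffers, so the architecture has to be rebuilt in code before loading.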
2 model.eval() vs. with torch.no_grad()
2.1 What they have in common
Both are used when running validation in PyTorch. model.eval() is the call that switches the model to evaluation (test) mode.
For example, for dropout and batchnorm layers:
- In train mode, a dropout layer zeroes each activation with the configured probability p (so the retention probability is 1 - p) and rescales the surviving activations by 1/(1 - p); a batchnorm layer normalizes with the statistics of the current batch and keeps updating its running mean and var.
- In eval mode, a dropout layer lets all activations pass through unchanged, and a batchnorm layer stops computing and updating mean and var and directly uses the running mean and var learned during training.
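The two modes described above can be checked on standalone layers. The following is a small sketch under assumed settings (p = 0.5, three features, random input); the variable names are illustrative only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

drop = nn.Dropout(p=0.5)
bn = nn.BatchNorm1d(3)
x = torch.ones(1000, 3)

# Train mode: roughly a fraction p of the entries is zeroed,
# and the survivors are scaled by 1 / (1 - p).
drop.train()
y_train = drop(x)
zero_fraction = (y_train == 0).float().mean().item()
kept_value = y_train.max().item()

# Eval mode: dropout is the identity.
drop.eval()
dropout_is_identity = torch.equal(drop(x), x)

# Batchnorm updates its running statistics only in train mode.
data = torch.randn(64, 3) * 2.0 + 5.0
bn.train()
before = bn.running_mean.clone()
bn(data)
updated_in_train = not torch.equal(before, bn.running_mean)

bn.eval()
before = bn.running_mean.clone()
bn(data)
frozen_in_eval = torch.equal(before, bn.running_mean)
```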
2.2 Difference
- model.eval() does not turn autograd off. The forward pass still records the computation graph and stores what is needed to compute gradients, just as in training mode; there is simply no backpropagation because backward() is never called.
- with torch.no_grad() stops the autograd module inside the block, so no gradient information is tracked. This speeds up the forward pass and saves GPU memory, but it does not affect the behavior of dropout and batchnorm layers (on its own, these layers are still in train mode).
Therefore, use the two together during validation: call model.eval() first, then run the forward pass inside with torch.no_grad().
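The difference can be observed directly on a toy model. This is a sketch under assumptions: the two-layer model and the all-ones input are made up for illustration, and requires_grad on the output is used as a simple indicator of whether autograd is tracking the forward pass.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
x = torch.ones(8, 4)

# model.eval() alone: dropout is disabled, but autograd still tracks the output.
model.eval()
eval_tracks_grad = model(x).requires_grad

# torch.no_grad() alone (model left in train mode): no tracking,
# but dropout is still active, so two passes give different outputs.
model.train()
with torch.no_grad():
    out_a = model(x)
    out_b = model(x)
no_grad_tracks_grad = out_a.requires_grad
dropout_still_random = not torch.equal(out_a, out_b)

# Both together: deterministic outputs and no autograd tracking.
model.eval()
with torch.no_grad():
    out_c = model(x)
    out_d = model(x)
both_tracks_grad = out_c.requires_grad
both_deterministic = torch.equal(out_c, out_d)
```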