PyTorch training process was interrupted
2022-07-05 11:17:00 【IMQYT】
I was scared to death: I killed, with my own hands, the process of a model that had been training for 3 days. I almost cried. Was the money for a week of server rental wasted? Was the time wasted? Could it be salvaged? This was the first time it had happened to me. My code also runs very slowly (on an RTX A5000, which should not be slow, but there is a lot of data), and to cut the time lost to log I/O I had turned logging off, so only the model was saved. My hands were shaking.
Enough talk. How can it be remedied?
The code saved the model with torch.save and nothing else. No other parameters were saved, not even the epoch. After reading through a lot of other people's experience, I finally found a remedy.
Reload the model
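import torch
# model must already be built with the same architecture that was saved (GINConvNet here)
path = 'autodl-tmp/GraphDTA-master/model_GINConvNet_kiba.model'
model.load_state_dict(torch.load(path))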
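With this, what the model had learned is back. The saved weights are restored, so the loss picks up from where training stopped instead of starting from scratch. The epoch counter and the optimizer state were never saved, so they are not restored.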
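As a minimal sketch of why reloading works, the round trip below saves only a state_dict and loads it into a freshly built model. The `build_model` helper and the tiny `nn.Linear` network are hypothetical stand-ins for the real GINConvNet, and the temporary file path is made up for the demo.

```python
import os
import tempfile

import torch
import torch.nn as nn


def build_model():
    # Hypothetical stand-in for the real network (GINConvNet in this post).
    return nn.Linear(4, 1)


torch.manual_seed(0)
trained = build_model()

# Save only the weights, which is all the training script kept.
path = os.path.join(tempfile.mkdtemp(), "demo.model")
torch.save(trained.state_dict(), path)

# Later: rebuild the same architecture and load the weights back.
restored = build_model()
restored.load_state_dict(torch.load(path, map_location="cpu"))

# Both models now produce identical outputs for the same input.
x = torch.randn(3, 4)
with torch.no_grad():
    same_outputs = torch.equal(trained(x), restored(x))

# The file holds parameter tensors only: no epoch and no optimizer state.
saved_keys = sorted(torch.load(path, map_location="cpu").keys())
```

In this sketch `same_outputs` is true and `saved_keys` lists only the layer's parameters, which is why the epoch count has to be handled by hand.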
From the training output I could see that the loss really did continue from the result of the 294th epoch, and the predicted values likewise continued from the state after the 294th epoch. Fortunately, I got it back. There was one problem, though: the epoch counter started from 1 again. Would I have to train another 600 epochs? So remember to change the total number of epochs: 600 - 294 = 306. Although the console shows epoch 1 after the interruption, training now ends after 306 more epochs. Done.
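The epoch arithmetic can be sketched as follows. The numbers 600 and 294 come from this post; `train_one_epoch` is a hypothetical placeholder for the real training step, and starting the loop at 295 is an optional variation so that the console numbering lines up with the interrupted run.

```python
TOTAL_EPOCHS = 600      # epochs originally planned (from this post)
COMPLETED_EPOCHS = 294  # epochs finished before the process was killed

# Number of epochs still to run after reloading the weights.
remaining = TOTAL_EPOCHS - COMPLETED_EPOCHS


def train_one_epoch(epoch):
    """Hypothetical placeholder: run one epoch and return its number."""
    return epoch


# Option 1 (what the post does): keep counting from 1, but stop after
# `remaining` epochs.
restarted_numbering = [train_one_epoch(e) for e in range(1, remaining + 1)]

# Option 2: continue the numbering at 295 so logs match the original run.
continued_numbering = [
    train_one_epoch(e) for e in range(COMPLETED_EPOCHS + 1, TOTAL_EPOCHS + 1)
]
```

Both loops run the same number of epochs; only the printed epoch numbers differ.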