PyTorch training process was interrupted

2022-07-05 11:17:00 【IMQYT】

I was scared to death: with my own hands I killed the process of a model that had been training for 3 days. I almost cried. Had the money for a week of server rental been wasted? Had the time been wasted? Could it be salvaged? This was the first time it had happened to me, and my code runs very slowly (on an RTX A5000, which by rights should not be slow, but there is a lot of data). To cut the time lost to log I/O, I wrote no logs; only the model was saved. My hands were already shaking.

Enough talk. How can it be remedied?

The code saves the model only with torch.save. No other parameters are saved, not even the epoch. After reading through a lot of other people's experience, I finally found a remedy.

Reload the model

        # model must already be constructed with the same architecture that was saved
        path = 'autodl-tmp/GraphDTA-master/model_GINConvNet_kiba.model'
        model.load_state_dict(torch.load(path))

With this, what the model had learned is back: the saved weights are restored, so the loss and the other metrics resume from where they left off.
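To make the reload step concrete, here is a minimal sketch of the same save-then-load round trip. It assumes the file holds a state_dict (which is what load_state_dict expects). The tiny network, the make_model helper and the temporary file name are hypothetical stand-ins, not the GINConvNet model from this post.

```python
import os
import tempfile

import torch
import torch.nn as nn


def make_model():
    # Hypothetical stand-in for the real network used in the post.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))


trained = make_model()
x = torch.randn(3, 4)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model_demo.model")  # hypothetical file name
    # Weights-only save: no epoch counter and no optimizer state in the file.
    torch.save(trained.state_dict(), path)

    # After the interruption: rebuild the same architecture, then load the weights.
    restored = make_model()
    state = torch.load(path, map_location="cpu")
    restored.load_state_dict(state)

trained.eval()
restored.eval()
with torch.no_grad():
    # Identical weights give identical predictions on the same input.
    same_output = torch.equal(trained(x), restored(x))

print(sorted(state.keys()))  # parameter names only
print(same_output)
```

The file contains parameter tensors and nothing else, which is why the epoch count has to be recovered some other way. If the optimizer keeps internal state (Adam does), that state is not in this file either and starts fresh on resume.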

From this I could see that the loss really did continue from where epoch 294 had left off, and the predicted values likewise matched the state after epoch 294. Luckily, I got it back. One problem remained: the epoch counter appeared to start again from 1. Would it then have to train another 600 epochs? So remember to change the total number of epochs: 600 - 294 = 306. Although the console shows the count starting from 1, training will end after 306 more epochs. Done.
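The epoch arithmetic above can be sketched as follows. The resume_schedule helper is a hypothetical name introduced for illustration. The second loop, which offsets the counter so the console continues at 295, is an alternative I am assuming would work for a plain for-loop over epochs; it is not what was done in this post.

```python
def resume_schedule(total_epochs, completed_epochs):
    """Return (remaining, first_epoch, last_epoch) for a resumed run."""
    if not 0 <= completed_epochs <= total_epochs:
        raise ValueError("completed_epochs must be between 0 and total_epochs")
    remaining = total_epochs - completed_epochs
    return remaining, completed_epochs + 1, total_epochs


remaining, first_epoch, last_epoch = resume_schedule(600, 294)

# Approach from the post: the loop still starts at 1, but the total is reduced.
post_style_epochs = list(range(1, remaining + 1))

# Alternative (assumption): offset the counter so the log continues at 295.
offset_epochs = list(range(first_epoch, last_epoch + 1))

print(remaining, post_style_epochs[-1], offset_epochs[0], offset_epochs[-1])
# prints: 306 306 295 600
```

Both loops run the same number of iterations; the only difference is the number shown in the console.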

Original site

Copyright notice
This article was written by [IMQYT]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/186/202207051113360731.html
