[torch] Some ideas for debugging a tensor parameter that has a gradient but whose weight never updates

2022-07-02 07:21:00 【lwgkzl】

Problem:

In a torch class, I used nn.Parameter to declare a learnable Tensor parameter. After each backward pass I could see a gradient on the variable, but the parameter's weight never changed. It always stayed at its initial value.

Ideas:

When a parameter's weight never updates, there are several things to check:

1. Check whether the gradient of the variable is 0 or None. For how to print the gradient of an intermediate variable in pytorch, see this blog post: https://www.jianshu.com/p/ad66f2e38f2f

If it is None or 0, the gradient is not reaching the variable. Print the gradients of the variables further along in the code, one by one, until you reach one that does have a gradient, then work out why the gradient disappears at that point.
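A minimal sketch of this check, assuming a toy module with one parameter that takes part in the loss and one that does not. The names `Toy` and `grad_report` are made up for illustration. `retain_grad()` is what keeps the gradient on an intermediate (non-leaf) tensor.

```python
import torch
import torch.nn as nn

class Toy(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.used = nn.Parameter(torch.ones(3))
        self.unused = nn.Parameter(torch.ones(3))  # never used in the loss

def grad_report(module):
    """Classify each parameter's gradient as 'None', 'zero' or 'ok'."""
    report = {}
    for name, p in module.named_parameters():
        if p.grad is None:
            report[name] = "None"
        elif p.grad.abs().sum().item() == 0:
            report[name] = "zero"
        else:
            report[name] = "ok"
    return report

model = Toy()
x = torch.tensor([1.0, 2.0, 3.0])

hidden = x * model.used   # intermediate (non-leaf) tensor
hidden.retain_grad()      # without this, hidden.grad is not kept after backward
loss = hidden.sum()
loss.backward()

report = grad_report(model)
print(report)
print(hidden.grad)
```

Any parameter reported as "None" never entered the graph that produced the loss, so that is where to start looking.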

2. Once you can see the gradient, check whether the gradient multiplied by the learning rate is too small. For example, with a gradient of 5e-2 and a learning rate of 1e-4, each step changes the value by only 5e-6. If the variable is shown or stored with only about five decimal places, an update that small goes unnoticed or is lost, and the learning rate needs to be raised.
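A small sketch of the numbers above, assuming plain SGD and a parameter whose value is 1.0 (both are my assumptions, not from the original problem). Comparing the exact difference before and after `step()` tells you whether the weight really is frozen or whether the change is just too small to show up when printed. The last line casts to float16 to show a case where an update of this size is genuinely rounded away.

```python
import torch

w = torch.nn.Parameter(torch.tensor([1.0]))
opt = torch.optim.SGD([w], lr=1e-4)

before = w.detach().clone()
w.grad = torch.tensor([5e-2])  # gradient set by hand to match the example
opt.step()

# Exact change applied by the optimizer: about -5e-6
delta = (w.detach() - before).item()
print(delta)

# torch prints tensors with 4 decimal places by default, so print the
# difference (or raise the print precision) instead of eyeballing w itself.
print(w)

# In float16 the same value rounds back to exactly 1.0.
as_half = w.detach().to(torch.float16).item()
print(as_half)
```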

3. Check whether the variable is replaced before the optimizer's step function runs. That is, after the backward pass but before step, the parameter has been reassigned.
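A sketch of how this failure can look, assuming a toy module and SGD. `Toy` and `is_tracked` are hypothetical names. The optimizer holds a reference to the tensor object it was given, so assigning a new nn.Parameter to the same attribute leaves the optimizer updating the old object.

```python
import torch
import torch.nn as nn

class Toy(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.zeros(2))

def is_tracked(optimizer, param):
    """True if this exact tensor object is in one of the optimizer's param groups."""
    return any(p is param for g in optimizer.param_groups for p in g["params"])

model = Toy()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
tracked_before = is_tracked(opt, model.w)

loss = (model.w - 1.0).pow(2).sum()
loss.backward()

# Bug: this creates a new object; the optimizer still points at the old one.
model.w = nn.Parameter(torch.zeros(2))
tracked_after = is_tracked(opt, model.w)

opt.step()  # updates the old tensor, not model.w
print(tracked_before, tracked_after, model.w)
```

If the value really has to be overwritten, doing it in place under `torch.no_grad()` (for example with `model.w.copy_(...)`) keeps the same object, so the optimizer still sees it.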

4. Most important of all, check whether the class that holds the parameter has been added to the optimizer's list of parameters to optimize. If it has not, the gradient is still computed, but the optimizer will never touch your parameter.
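One way to check this, sketched under the assumption that the optimizer was built from only part of the model. `Net` and `missing_from_optimizer` are hypothetical names. The helper compares the model's parameters with the tensors in the optimizer's `param_groups`.

```python
import torch
import torch.nn as nn

class Net(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)
        self.scale = nn.Parameter(torch.ones(2))

def missing_from_optimizer(model, optimizer):
    """Names of model parameters that the optimizer does not hold."""
    tracked = {id(p) for g in optimizer.param_groups for p in g["params"]}
    return [name for name, p in model.named_parameters() if id(p) not in tracked]

model = Net()
# Mistake being illustrated: only the linear layer is handed to the optimizer.
opt = torch.optim.SGD(model.linear.parameters(), lr=0.1)
missing = missing_from_optimizer(model, opt)
print(missing)

# One possible fix: register the forgotten parameter as an extra group.
opt.add_param_group({"params": [model.scale]})
missing_after = missing_from_optimizer(model, opt)
print(missing_after)
```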

If the parameters sit in a list of model classes: do not use a plain list, use nn.ModuleList. Suppose a list holds three instances of class A and is stored as an attribute of class B (assigned in the init function). A plain list is not registered as a submodule, so B's parameters() does not return anything inside it, and none of the parameters in that list (the parameters of the A instances) will be optimized. Using nn.ModuleList avoids this.
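The difference is easy to see by counting parameters. This sketch follows the A/B naming used above; the two variants of B and the contents of A are made up for illustration.

```python
import torch.nn as nn

class A(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(2, 2)  # one weight and one bias

class BWithList(nn.Module):
    def __init__(self):
        super().__init__()
        # Plain Python list: the A instances are NOT registered as submodules.
        self.blocks = [A() for _ in range(3)]

class BWithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList registers every A instance as a submodule.
        self.blocks = nn.ModuleList([A() for _ in range(3)])

n_list = len(list(BWithList().parameters()))
n_modulelist = len(list(BWithModuleList().parameters()))
print(n_list, n_modulelist)
```

An optimizer built from `BWithList().parameters()` has nothing to update, while the nn.ModuleList version exposes the weight and bias of all three A instances.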

Those are roughly the ideas worth trying. If I have missed anything, corrections in the comments are welcome.

That's all.

Copyright notice
This article was written by [lwgkzl]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/183/202207020622420019.html
