[Torch] Some ideas for when a tensor parameter has a gradient but its weight is not updated
2022-07-02 07:21:00 【lwgkzl】
Problem:
In a torch module class, a learnable tensor parameter is declared with nn.Parameter. After every backward pass the gradient of the variable is visible, but the parameter's weight never changes and stays at its initial value.
Ideas:
When a parameter's weight never updates, there are several things to check:
1. Check whether the gradient of the variable is 0 or None. For how to print the gradient of an intermediate variable in PyTorch, see this blog post: https://www.jianshu.com/p/ad66f2e38f2f
If it is None or 0, the gradient is not reaching the variable. Print the gradients of the variables further down the code, one by one, until a gradient appears, then work out why it vanished at that point.
2. Once you can see the gradient, check whether the gradient multiplied by the learning rate is too small. For example, with a gradient of 5e-2 and a learning rate of 1e-4, the update is only 5e-6. If the value of the variable keeps (or is printed with) only five decimal places, an update this small does not show up, and the weight looks unchanged. In that case, raise the learning rate.
3. Check whether the variable is replaced before the optimizer's step function runs, that is, whether the parameter is reassigned after the backward pass and before step.
4. Most important of all, check whether the class that owns the parameter was actually added to the optimizer's parameter list :( If it was not, the gradient is still propagated back, but the optimizer never touches your parameter.
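Checks 1 and 4 can be automated. The following is a minimal sketch, not a definitive tool: `diagnose` and `Net` are hypothetical names invented for this illustration, and the optimizer is deliberately built without the `scale` parameter to reproduce the symptom described above.

```python
import torch
import torch.nn as nn


def diagnose(model: nn.Module, optimizer: torch.optim.Optimizer) -> dict:
    """Hypothetical helper: per named parameter, report the gradient state
    and whether the optimizer actually holds that parameter."""
    # Compare by identity; `==` on tensors is elementwise.
    optimized = {id(p) for group in optimizer.param_groups for p in group["params"]}
    report = {}
    for name, p in model.named_parameters():
        if p.grad is None:
            grad_state = "none"
        elif torch.count_nonzero(p.grad) == 0:
            grad_state = "zero"
        else:
            grad_state = "ok"
        report[name] = {"grad": grad_state, "in_optimizer": id(p) in optimized}
    return report


class Net(nn.Module):  # hypothetical example model
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 1)
        self.scale = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return self.linear(x) * self.scale


torch.manual_seed(0)
model = Net()
# Deliberate mistake: only the linear layer is handed to the optimizer.
optimizer = torch.optim.SGD(model.linear.parameters(), lr=0.1)

before = model.scale.detach().clone()
loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()

report = diagnose(model, optimizer)
scale_unchanged = torch.equal(before, model.scale.detach())
print(report)
print("scale unchanged after step:", scale_unchanged)
```

Here `scale` has a gradient yet is missing from the optimizer, so `step` leaves it at its initial value. Building the optimizer from `model.parameters()` instead would include it.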
If you keep a list of model classes, do not use a Python list; use nn.ModuleList. If a list contains three instances of class A and that list is stored as an attribute of class B (assigned in the __init__ function), then B does not register the modules inside the list, so none of their parameters (the parameters of the A instances) show up in B's parameters and none of them get optimized. Using nn.ModuleList avoids this.
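The difference is easy to see by counting the parameters each variant exposes. This is a small sketch; the class names `A`, `BWithList`, and `BWithModuleList` are made up for the illustration.

```python
import torch.nn as nn


class A(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(2, 2)  # 4 weights + 2 biases = 6 values


class BWithList(nn.Module):
    def __init__(self):
        super().__init__()
        # Plain Python list: the A instances are not registered as submodules.
        self.blocks = [A() for _ in range(3)]


class BWithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList registers every A instance as a submodule.
        self.blocks = nn.ModuleList([A() for _ in range(3)])


n_list = sum(p.numel() for p in BWithList().parameters())
n_module_list = sum(p.numel() for p in BWithModuleList().parameters())
print(n_list, n_module_list)  # 0 18
```

An optimizer built from `parameters()` of the plain-list variant never sees the parameters of the three `A` instances, which is exactly the symptom in item 4.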
Those are the ideas I can think of to try. If I have missed anything, corrections in the comment area are welcome.
That's all.