[torch] Troubleshooting a tensor parameter that has gradients but whose weight never updates
2022-07-02 07:21:00 【lwgkzl】
Problem:
In a torch module class, I used nn.Parameter to declare a learnable tensor parameter. After each backward pass the gradient of the variable is visible, but the parameter's weight never changes and always stays at its initial value.
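The symptom can be confirmed by snapshotting the parameter before a training step and comparing it afterwards. The following is a minimal sketch, not the original code: the `Scaler` module, its `scale` parameter, and the SGD setup are hypothetical, and the parameter is deliberately left out of the optimizer just to reproduce the behaviour.

```python
import torch
import torch.nn as nn


class Scaler(nn.Module):
    """Hypothetical toy module with one learnable tensor declared via nn.Parameter."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 1)
        self.scale = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return self.linear(x) * self.scale


torch.manual_seed(0)
model = Scaler()
# Deliberate mistake (assumption for this sketch): only the linear layer
# is handed to the optimizer, so `scale` is never updated.
optimizer = torch.optim.SGD(model.linear.parameters(), lr=0.1)

before = model.scale.detach().clone()  # snapshot of the initial value
loss = model(torch.randn(8, 4)).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

has_grad = model.scale.grad is not None
unchanged = torch.equal(before, model.scale.detach())
print(f"scale has grad: {has_grad}, scale unchanged after step: {unchanged}")
```

Comparing against a cloned snapshot matters here: comparing the parameter with itself, or with a non-cloned reference, would always report "unchanged".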
Ideas:
When a parameter's weight never updates, there are several things to check:
1. Check whether the gradient of the variable is 0 or None. For how to output the gradient of an intermediate variable in PyTorch, see this blog post: https://www.jianshu.com/p/ad66f2e38f2f
If it is None or 0, the gradient is not reaching the variable. Print the gradients of the variables step by step through the code until you reach one where a gradient appears, then work out why the gradient vanished at that point.
2. Once you can see the gradient, check whether the gradient multiplied by the learning rate is too small. For example, with a gradient of 5e-2 and a learning rate of 1e-4, the update is 5e-6; if the variable's value only keeps five decimal places, the update is too small to register and the variable appears unchanged. In that case, raise the learning rate.
3. Check whether the variable is replaced before the optimizer's step function runs, that is, whether the parameter is reassigned after the backward pass but before step.
4. Most important of all, check whether the class that holds the parameter has been added to the optimizer's list of parameters to optimize :( Otherwise, even though the gradient is computed, the optimizer will never touch your parameters.
If you keep a list of model classes, do not use a plain Python list; use nn.ModuleList. If a list contains three instances of class A and is assigned as an attribute of class B (in its __init__ function), then the parameters inside that list (the parameters of the A instances) are not registered in B's parameters() and will not be optimized. Using nn.ModuleList avoids this.
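Checks 1, 2 and 4 can be sketched as one small helper, shown here next to the plain list versus nn.ModuleList difference. This is an illustrative sketch under stated assumptions: the classes `A` and `B`, the `use_module_list` switch, and the `diagnose` function are hypothetical names, and the `lr * |grad|` figure equals the real update size only for plain SGD without momentum or weight decay.

```python
import torch
import torch.nn as nn


class A(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(2, 2)

    def forward(self, x):
        return self.fc(x)


class B(nn.Module):
    """Hypothetical container; `use_module_list` switches between the two styles."""

    def __init__(self, use_module_list):
        super().__init__()
        blocks = [A() for _ in range(3)]
        # A plain Python list hides the sub-modules from parameters();
        # nn.ModuleList registers them.
        self.blocks = nn.ModuleList(blocks) if use_module_list else blocks
        self.head = nn.Linear(2, 1)

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return self.head(x)


def diagnose(param, optimizer):
    """Return a short status string for one tensor (checks 1, 2 and 4)."""
    # Map each tensor the optimizer manages to its group's learning rate.
    lr_by_id = {id(p): g["lr"] for g in optimizer.param_groups for p in g["params"]}
    if param.grad is None:
        return "grad is None"
    if not torch.any(param.grad != 0):
        return "grad is all zeros"
    if id(param) not in lr_by_id:
        return "has grad, but is not in the optimizer"
    step = lr_by_id[id(param)] * param.grad.abs().max().item()
    return f"ok, largest lr * |grad| = {step:.2e}"


torch.manual_seed(0)
x = torch.randn(8, 2)
results = {}
for use_module_list in (False, True):
    model = B(use_module_list)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
    optimizer.zero_grad()
    model(x).pow(2).mean().backward()
    n_registered = len(list(model.parameters()))
    status = diagnose(model.blocks[0].fc.weight, optimizer)
    results[use_module_list] = (n_registered, status)
    print(f"ModuleList={use_module_list}: {n_registered} registered tensors; "
          f"blocks[0].fc.weight -> {status}")
```

With the plain list, only the two tensors of `head` are registered, so the weights inside the list receive gradients but are unknown to the optimizer. With nn.ModuleList, all eight tensors are registered and the helper reports the size of the would-be update instead.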
Those are about all the ideas I can think of to try. If I have missed anything, please point it out in the comments.
That's all.