[torch] torch.nn.utils.clip_grad_norm_
2022-07-06 05:18:00 【rrr2】

Inside clip_grad_norm_, total_norm is the norm of all the gradients taken together, and clip_coef = max_norm / (total_norm + 1e-6) is the factor the gradients are multiplied by (it is capped at 1, so gradients are never scaled up). The larger the gradients, the larger total_norm becomes, which makes clip_coef smaller and so clips the gradients more severely. That is the behavior you would want.
In my experiments the choice of norm_type did not change total_norm very much (the gap between 1 and 2 was somewhat larger than between other values), so the default value of 2 can be used directly.
The larger norm_type is, the smaller total_norm is. I observed this experimentally and have not proved it myself, but it agrees with the standard property that the p-norm of a fixed vector does not increase as p grows.
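The observation about norm_type can be checked with a few lines of plain Python. This is only a sketch of the arithmetic, not the library's implementation: the helper `total_norm` and the sample gradient values are made up for illustration, and the gradients are treated as one flat list of numbers.

```python
import math

def total_norm(grads, norm_type=2.0):
    """p-norm of a flat list of gradient values (hypothetical helper)."""
    if math.isinf(norm_type):
        # The infinity norm is the largest absolute value.
        return max(abs(g) for g in grads)
    return sum(abs(g) ** norm_type for g in grads) ** (1.0 / norm_type)

# Made-up gradient values, chosen so the 2-norm comes out as a round number.
grads = [3.0, -4.0, 12.0]

# Compute the norm for several values of norm_type.
norms = {p: total_norm(grads, p) for p in (1.0, 2.0, 3.0, math.inf)}

for p, value in norms.items():
    print(f"norm_type={p}: total_norm={value:.4f}")
```

For these values the norm falls from 19 (norm_type=1) to 13 (norm_type=2) and then approaches 12, the largest absolute entry, as norm_type grows. For gradients with many entries the gap between norm_type values can be much wider than it is in this three-element example.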
...
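loss = crit(...)
optimizer.zero_grad()   # clear the gradients left over from the previous step
loss.backward()         # compute the new gradients
# rescale the gradients in place so that their total 2-norm is at most 10
torch.nn.utils.clip_grad_norm_(parameters=model.parameters(), max_norm=10, norm_type=2)
optimizer.step()        # update the parameters using the clipped gradients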
...
The smaller clip_coef is, the more severely the gradients are clipped, that is, the more their values are scaled down.
The smaller max_norm is, the smaller clip_coef is. So a larger max_norm handles exploding gradients more gently, and a smaller max_norm handles them more aggressively. max_norm does not have to be an integer; a fractional value such as 0.5 is fine.
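The relationship between max_norm and clip_coef can be sketched in plain Python as well. The helper names `clip_coef` and `clip_gradients` and the sample values are hypothetical, and the code only assumes the formula clip_coef = max_norm / (total_norm + 1e-6), capped at 1; it is not the library's actual code.

```python
def clip_coef(total_norm, max_norm, eps=1e-6):
    """Scaling factor applied to the gradients (hypothetical helper).

    Capped at 1.0 so gradients are only ever scaled down, never up.
    """
    return min(1.0, max_norm / (total_norm + eps))

def clip_gradients(grads, max_norm):
    """Return a clipped copy of a flat list of gradient values, using the 2-norm."""
    total = sum(g * g for g in grads) ** 0.5
    coef = clip_coef(total, max_norm)
    return [g * coef for g in grads]

# Made-up gradient values whose 2-norm is exactly 13.
grads = [3.0, -4.0, 12.0]

# A smaller max_norm gives a smaller coefficient, i.e. harder clipping.
for max_norm in (20.0, 10.0, 1.0, 0.5):
    coef = clip_coef(13.0, max_norm)
    print(f"max_norm={max_norm}: clip_coef={coef:.4f}")

clipped = clip_gradients(grads, max_norm=10.0)
clipped_norm = sum(g * g for g in clipped) ** 0.5
print(f"norm after clipping with max_norm=10: {clipped_norm:.4f}")
```

With max_norm=20 the coefficient stays at 1 and the gradients are left unchanged, because their norm is already below the limit. With smaller limits the coefficient shrinks in proportion, and the clipped gradients end up with a norm just under max_norm.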
ref
https://blog.csdn.net/Mikeyboi/article/details/119522689