【torch】|torch.nn.utils.clip_grad_norm_
2022-07-06 05:15:00 【rrr2】

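The larger the gradients, the larger total_norm (the norm of all gradients taken together). A larger total_norm gives a smaller clip_coef (roughly max_norm / total_norm), so the gradients are scaled down more strongly, which is the intended behaviour.
In the author's experiments, norm_type had only a modest effect on total_norm (the gap between 1 and 2 was the most noticeable), so the default value of 2 is a reasonable choice.
The larger norm_type is, the smaller total_norm becomes. The author observed this experimentally without a proof; it matches the standard property that the p-norm of a fixed vector is non-increasing in p.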
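The relationship between norm_type and total_norm can be checked with a few lines of plain Python. This is only a sketch: `p_norm` is a hypothetical helper written for this illustration (it is not part of PyTorch), and the gradient values are made up.

```python
import math

def p_norm(values, p):
    """Return the p-norm of a flat list of numbers (p may be math.inf)."""
    if p == math.inf:
        return max(abs(v) for v in values)
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

# Hypothetical flattened gradient values, chosen only for illustration.
grads = [0.5, -1.5, 2.0, -0.25, 3.0]

norms = {p: p_norm(grads, p) for p in (1, 2, 3, 4, math.inf)}
for p, value in norms.items():
    print(f"norm_type={p}: total_norm={value:.4f}")
```

For these values the norm shrinks as norm_type grows, and the drop from 1 to 2 is the largest step, which is consistent with the observations above.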
...
loss = crit(...)
optimizer.zero_grad()
loss.backward()
# Clip after backward() (gradients exist) and before step() (gradients are used).
torch.nn.utils.clip_grad_norm_(parameters=model.parameters(), max_norm=10, norm_type=2)
optimizer.step()
...
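A self-contained version of such a training step might look like the sketch below. The toy model, loss, optimizer and data are assumptions introduced only so the snippet runs; they are not from the original post. The inputs are scaled up deliberately to produce large gradients.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy model and data, used only to make the step runnable.
model = nn.Linear(4, 1)
crit = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
inputs = torch.randn(8, 4) * 100.0  # large inputs to provoke large gradients
targets = torch.randn(8, 1)

max_norm = 10.0

loss = crit(model(inputs), targets)
optimizer.zero_grad()
loss.backward()

# clip_grad_norm_ returns the total norm measured before clipping.
norm_before = torch.nn.utils.clip_grad_norm_(
    parameters=model.parameters(), max_norm=max_norm, norm_type=2
)

# Recompute the global 2-norm over all gradients after clipping.
norm_after = torch.norm(
    torch.stack([p.grad.norm(2) for p in model.parameters()]), 2
)

optimizer.step()
print(f"before: {norm_before.item():.2f}, after: {norm_after.item():.2f}")
```

The returned value is handy for logging: comparing it with max_norm shows how often, and how strongly, clipping actually takes effect.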
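The smaller clip_coef is, the more aggressively the gradients are clipped, i.e. the more their values shrink.
The smaller max_norm is, the smaller clip_coef is. A larger max_norm therefore handles exploding gradients more gently, and a smaller max_norm handles them more harshly. max_norm may be a non-integer value.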
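The effect of max_norm on clip_coef can be sketched in plain Python. The helper below assumes the coefficient is max_norm / (total_norm + eps), capped at 1 so that small gradients are never scaled up; the function name, the eps value and the example norm are assumptions for illustration rather than a copy of the library code.

```python
def clip_coef(total_norm, max_norm, eps=1e-6):
    """Scaling factor applied to every gradient; never larger than 1."""
    return min(max_norm / (total_norm + eps), 1.0)

total_norm = 50.0  # hypothetical pre-clipping gradient norm

# Fractional max_norm values are allowed, as noted above.
coefs = {m: clip_coef(total_norm, m) for m in (0.5, 1.0, 10.0, 100.0)}
for max_norm, coef in coefs.items():
    print(f"max_norm={max_norm}: clip_coef={coef:.4f}")
```

Once max_norm exceeds total_norm the coefficient stays at 1, so the gradients pass through unchanged.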
ref
https://blog.csdn.net/Mikeyboi/article/details/119522689