【torch】|torch.nn.utils.clip_grad_norm_
2022-07-06 05:18:00 【rrr2】
The larger the gradients, the larger total_norm becomes. Since clip_coef is roughly max_norm / total_norm, a larger total_norm gives a smaller clip_coef, so the gradients are clipped more severely. That is the behavior you would want.
Whatever value norm_type takes, its effect on total_norm was not large in my experiments (the gap between 1 and 2 is somewhat bigger), so you can simply use the default value of 2.
The larger norm_type is, the smaller total_norm is. (I observed this experimentally and cannot prove it myself, so treat it with caution; it does agree with the standard result that the p-norm of a fixed vector does not increase as p grows.)
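The effect of norm_type on total_norm can be checked with a small pure-Python sketch. It treats all gradient values as one flattened list; the helper name `total_norm` and the gradient values are made up for illustration and are not part of the PyTorch API.

```python
import math

def total_norm(grads, norm_type=2.0):
    """p-norm of all gradient values, treated as one flat vector (illustrative helper)."""
    if math.isinf(norm_type):
        # The infinity norm is the largest absolute value.
        return max(abs(g) for g in grads)
    return sum(abs(g) ** norm_type for g in grads) ** (1.0 / norm_type)

# Made-up gradient values, for illustration only.
grads = [3.0, -4.0, 12.0]

norms = {p: total_norm(grads, p) for p in (1.0, 2.0, 4.0, math.inf)}
for p, value in norms.items():
    print(f"norm_type={p}: total_norm={value:.4f}")
```

For this vector the norm shrinks as norm_type grows, which matches the observation above. How large the gap is depends on how many gradient values there are and how evenly they are spread.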
...
loss = crit(...)
optimizer.zero_grad()
loss.backward()
# Clip after backward() and before step(), so the optimizer uses the rescaled gradients
torch.nn.utils.clip_grad_norm_(parameters=model.parameters(), max_norm=10, norm_type=2)
optimizer.step()
...
The smaller clip_coef is, the more severely the gradients are clipped, that is, the more their values are scaled down.
The smaller max_norm is, the smaller clip_coef is. So a larger max_norm handles gradient explosion more gently, and a smaller max_norm handles it more aggressively. max_norm does not have to be an integer; a fractional value such as 0.5 also works.
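The relationship between max_norm, total_norm and clip_coef can be sketched in plain Python. This is a simplified illustration, not the library's actual code: the function name `clip_grad_norm_sketch` is hypothetical, and the small `eps` term and the cap of clip_coef at 1 are assumptions based on my understanding of how PyTorch computes the coefficient.

```python
def clip_grad_norm_sketch(grads, max_norm, norm_type=2.0, eps=1e-6):
    """Rescale a flat list of gradient values so their norm is at most max_norm.

    Hypothetical helper for illustration; assumes a finite norm_type.
    """
    total = sum(abs(g) ** norm_type for g in grads) ** (1.0 / norm_type)
    # Assumed behavior: the coefficient never exceeds 1, so small gradients stay unchanged.
    clip_coef = min(max_norm / (total + eps), 1.0)
    return [g * clip_coef for g in grads], total, clip_coef

# Made-up gradient values; their 2-norm is 13.
grads = [3.0, -4.0, 12.0]

results = {}
for max_norm in (20.0, 10.0, 0.5):
    clipped, total, coef = clip_grad_norm_sketch(grads, max_norm)
    results[max_norm] = (clipped, total, coef)
    print(f"max_norm={max_norm}: total_norm={total:.2f}, clip_coef={coef:.4f}")
```

With max_norm=20 the gradients are left alone because total_norm is already below the threshold. With max_norm=10 and max_norm=0.5 the coefficient drops below 1 and every gradient value is scaled by the same factor, so the direction of the gradient is kept and only its length changes.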
ref
https://blog.csdn.net/Mikeyboi/article/details/119522689