Gradient clip in dqn
2022-06-30 05:15:00 【hanjialeOK】
First, see this Stack Overflow question: https://stackoverflow.com/questions/36462962/loss-clipping-in-tensor-flow-on-deepminds-dqn
The clipping mentioned in the DQN paper is not gradient clipping at all.
Let's first look at huber_loss in TensorFlow 1, with d = 1.
0.5 * x^2 if |x| <= d
0.5 * d^2 + d * (|x| - d) if |x| > d
The derivative is
f'(x) = x if x in [-1, 1]
f'(x) = +1 if x > +1
f'(x) = -1 if x < -1
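The piecewise definition and its derivative can be checked with a short sketch. This is plain Python written for illustration, not TensorFlow's own implementation; the function names `huber` and `huber_grad` are hypothetical.

```python
def huber(x, d=1.0):
    """Huber loss with threshold d, following the piecewise definition above."""
    if abs(x) <= d:
        return 0.5 * x * x
    return 0.5 * d * d + d * (abs(x) - d)


def huber_grad(x, d=1.0):
    """Derivative of the Huber loss: x inside [-d, d], otherwise -d or +d."""
    return max(-d, min(d, x))


if __name__ == "__main__":
    for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
        print(x, huber(x), huber_grad(x))
```

With d = 1 the derivative is just the error clamped to [-1, 1], which is why the Huber loss implements the "error clipping" discussed here.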
$$
\begin{aligned}
loss &= f(y_i-\hat{y_i}) \\
loss' &= f'(y_i-\hat{y_i})\cdot y_i' = f'(z)\cdot y_i'
\end{aligned}
$$
The DQN paper says:
We also found it helpful to clip the error term from the update to be between −1 and 1. Because the absolute value loss function |x| has a derivative of −1 for all negative values of x and a derivative of 1 for all positive values of x, clipping the squared error to be between −1 and 1 corresponds to using an absolute value loss function for errors outside of the (−1,1) interval. This form of error clipping further improved the stability of the algorithm.
But this only means that $f'(z)$ lies between -1 and 1. Gradient clipping, by contrast, constrains the whole product $f'(z)\cdot y_i'$ to lie between -1 and 1.
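The difference shows up as soon as $y_i'$ is larger than 1 in magnitude. The sketch below uses hypothetical scalar values for the error `z` and the derivative `dy` (standing in for $y_i'$). It assumes the gradient-clipping case starts from a plain squared-error loss, whose unclipped gradient is `z * dy`, and clips by value rather than by norm.

```python
def clip(v, lo=-1.0, hi=1.0):
    """Clamp v to the interval [lo, hi]."""
    return max(lo, min(hi, v))


def error_clipped_grad(z, dy):
    """Error clipping (Huber with d = 1): only f'(z) is clamped, then scaled by dy."""
    return clip(z) * dy


def gradient_clipped_grad(z, dy):
    """Gradient clipping by value: the whole product z * dy is clamped."""
    return clip(z * dy)


if __name__ == "__main__":
    z, dy = 10.0, 5.0  # hypothetical error and derivative of the prediction
    print(error_clipped_grad(z, dy))     # 5.0
    print(gradient_clipped_grad(z, dy))  # 1.0
```

With error clipping the resulting gradient can still exceed 1 in magnitude; only the error term is bounded.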