5. Discrete Control and Continuous Control
2022-07-07 23:21:00 【C--G】
Discrete vs. Continuous Control
Discrete
Continuous
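DQN has one output dimension per action, so it cannot be used directly for continuous control, where the number of possible actions is infinite.
A discrete-action Policy Network likewise has one output dimension per action, so it cannot be used for continuous control either.
If you insist on using DQN for continuous control, you have to discretize the continuous action space first.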
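As a rough illustration of discretization, the sketch below lays a uniform grid over a box-shaped action space. The function name `discretize_action_space`, the two-joint example, and the angle ranges are assumptions made for illustration, not taken from the original post. Note that the number of grid actions grows as `points_per_dim ** d` with the action dimension `d`, which is why this approach only suits low-dimensional action spaces.

```python
import itertools


def discretize_action_space(lows, highs, points_per_dim):
    """Return a list of grid actions covering the box [lows, highs].

    lows, highs: per-dimension bounds of the continuous action space.
    points_per_dim: number of grid points per dimension (must be >= 2).
    """
    if points_per_dim < 2:
        raise ValueError("points_per_dim must be at least 2")
    axes = []
    for low, high in zip(lows, highs):
        step = (high - low) / (points_per_dim - 1)
        axes.append([low + i * step for i in range(points_per_dim)])
    # The Cartesian product of the per-dimension grids is the discrete action set.
    return list(itertools.product(*axes))


# Hypothetical example: two joint angles, 5 grid points each.
actions = discretize_action_space([0.0, 0.0], [180.0, 360.0], 5)
num_actions = len(actions)  # 5 ** 2 grid actions; a DQN would need this many outputs
```

With 6 dimensions and the same 5 points per dimension, the grid would already contain 5 ** 6 = 15625 actions, so the output layer becomes impractically large very quickly.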
Better Approaches to Continuous Control
Deterministic policy network
Updating Value Network by TD
Updating Policy Network by DPG
Improvement: Using Target Networks
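The post only names the technique here, so the following is a minimal sketch of one common way target networks are used, not necessarily the exact scheme the author had in mind. The TD target is computed from the target networks' estimate of the next-state value, and the target parameters slowly track the online parameters through a weighted average. The names `td_target`, `soft_update`, and `tau`, and the use of flat parameter lists, are assumptions for illustration.

```python
def td_target(reward, next_q_from_target, gamma, done):
    """TD target y = r + gamma * q_target(s', a'), with no bootstrap at episode end.

    next_q_from_target is assumed to come from the target value network,
    evaluated at the action chosen by the target policy network.
    """
    return reward + (0.0 if done else gamma * next_q_from_target)


def soft_update(target_params, online_params, tau):
    """Move each target parameter a fraction tau toward its online counterpart."""
    return [
        tau * w_online + (1.0 - tau) * w_target
        for w_online, w_target in zip(online_params, target_params)
    ]


# Hypothetical numbers, chosen only to show the arithmetic.
y = td_target(reward=1.0, next_q_from_target=2.0, gamma=0.5, done=False)
new_target = soft_update(target_params=[0.0, 1.0], online_params=[1.0, 1.0], tau=0.5)
```

Because the TD target no longer depends on the network currently being updated, the bootstrapping error that comes from a network chasing its own estimates is reduced.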
Other Improvements
Stochastic Policy for Continuous Control
Policy Network
Univariate Normal Distribution
Multivariate Normal Distribution
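To make the stochastic policy concrete, here is a small sketch of a Gaussian policy over a multi-dimensional action. It assumes the action dimensions are independent (a diagonal covariance), so the multivariate density is the product of univariate normal densities. In practice the mean `mu` and standard deviation `sigma` would be outputs of a neural network given the state; here they are passed in directly, and the function names are hypothetical.

```python
import math
import random


def sample_action(mu, sigma, rng=random):
    """Draw one action, sampling each dimension from N(mu_i, sigma_i^2)."""
    return [rng.gauss(m, s) for m, s in zip(mu, sigma)]


def log_prob(action, mu, sigma):
    """Log-density of the action under independent normals.

    Sums, over dimensions, -0.5*ln(2*pi) - ln(sigma_i) - (a_i - mu_i)^2 / (2*sigma_i^2).
    """
    total = 0.0
    for a, m, s in zip(action, mu, sigma):
        total += -0.5 * math.log(2.0 * math.pi) - math.log(s) - (a - m) ** 2 / (2.0 * s ** 2)
    return total


# Hypothetical 2-dimensional action with fixed mean and standard deviation.
rng = random.Random(0)
action = sample_action([0.0, 1.0], [1.0, 0.5], rng)
lp = log_prob(action, [0.0, 1.0], [1.0, 0.5])
```

The log-density is the quantity whose gradient with respect to the network parameters appears in policy gradient methods, which is why it is convenient to compute it in closed form.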
Function Approximation
Training Policy Network
Auxiliary Network
Policy Gradient Methods