5. Discrete Control and Continuous Control
2022-07-07 23:21:00 【C--G】
Discrete VS Continuous Control
Discrete
Continuous
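DQN has one output dimension per action, so it cannot be used for continuous control.
A discrete-action policy network also has one output dimension per action, so it cannot be used for continuous control either.
To use DQN for continuous control anyway, the continuous action space must first be discretized.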
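Discretization can be sketched as laying a uniform grid over the action box. This is a minimal illustration, not the post's own code: the function name `discretize_action_space`, the two-joint bounds, and the bin counts are all assumptions made for the example.

```python
from itertools import product


def discretize_action_space(lows, highs, bins_per_dim):
    """Return every point of a uniform grid over a box-shaped action space.

    lows, highs  : per-dimension bounds of the continuous action
    bins_per_dim : number of grid points to place along each dimension
    """
    axes = []
    for low, high, n in zip(lows, highs, bins_per_dim):
        if n < 2:
            raise ValueError("need at least 2 grid points per dimension")
        step = (high - low) / (n - 1)
        axes.append([low + i * step for i in range(n)])
    # The Cartesian product enumerates all combinations of per-dimension values.
    return list(product(*axes))


# Hypothetical 2-D action: two joint angles in degrees.
grid = discretize_action_space([0.0, 0.0], [180.0, 360.0], [3, 5])
print(len(grid))  # number of discrete actions a DQN would need as outputs
```

The grid size is the product of the per-dimension bin counts, so it grows exponentially with the number of action dimensions. That is why discretization is only practical for low-dimensional action spaces.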
Better Approaches to Continuous Control
Deterministic policy network
Updating Value Network by TD
Updating Policy Network by DPG
Improvement: Using Target Networks
Improvement methods
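The target-network idea can be sketched on plain lists of parameters. This assumes the commonly used soft (weighted-average) update and a scalar TD target. The names `soft_update`, `td_target`, `tau`, and `gamma` are illustrative and do not come from the post.

```python
def soft_update(target_params, online_params, tau):
    """Move target parameters toward the online ones: tau*online + (1-tau)*target."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]


def td_target(reward, gamma, q_next_target, done=False):
    """TD target y = r + gamma * q(s', a'), where q(s', a') is computed by the
    target value network on the action chosen by the target policy network."""
    return reward if done else reward + gamma * q_next_target


# Hypothetical numbers, only to show the arithmetic.
new_target = soft_update([0.0, 2.0], [1.0, 4.0], tau=0.5)
y = td_target(reward=1.0, gamma=0.5, q_next_target=4.0)
```

Computing the TD target from slowly moving copies of the networks reduces the bootstrapping effect, where the value network is trained toward its own current estimate.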
Stochastic Policy for Continuous Control
Policy Network
Univariate Normal Distribution
Multivariate Normal Distribution
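For reference, the two densities can be written out as follows. The multivariate form assumes the $d$ action dimensions are independent (a diagonal covariance), which is the usual simplification for a stochastic continuous-control policy. The symbols $\mu_i$, $\sigma_i$, and $d$ are notation chosen here.

```latex
% Univariate normal density for a 1-D action a
\pi(a \mid s) = \frac{1}{\sqrt{2\pi}\,\sigma}
                \exp\!\left( -\frac{(a-\mu)^2}{2\sigma^2} \right)

% Multivariate case with independent dimensions, a = (a_1, \dots, a_d)
\pi(a \mid s) = \prod_{i=1}^{d} \frac{1}{\sqrt{2\pi}\,\sigma_i}
                \exp\!\left( -\frac{(a_i-\mu_i)^2}{2\sigma_i^2} \right)
```

Here $\mu$ and $\sigma$ are functions of the state $s$, which is what the function-approximation step replaces with neural networks.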
Function Approximation
Training Policy Network
Auxiliary Network
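Training a Gaussian policy by policy gradient needs the log-probability of the sampled action. A minimal sketch follows, assuming a diagonal Gaussian whose network outputs are the mean `mu` and the log-variance `rho[i] = ln(sigma_i^2)`. This parameterization and both function names are assumptions, since the post only names the auxiliary network without giving its form.

```python
import math
import random


def gaussian_log_prob(action, mu, rho):
    """Log-density of a diagonal Gaussian, with rho[i] = ln(sigma_i ** 2)."""
    total = 0.0
    for a, m, r in zip(action, mu, rho):
        total += (-0.5 * math.log(2.0 * math.pi)      # normalizing constant
                  - 0.5 * r                            # -ln(sigma_i)
                  - (a - m) ** 2 / (2.0 * math.exp(r)))
    return total


def sample_action(mu, rho, rng):
    """Draw each action dimension independently from N(mu_i, sigma_i^2)."""
    return [rng.gauss(m, math.exp(0.5 * r)) for m, r in zip(mu, rho)]


rng = random.Random(0)
mu, rho = [0.0, 1.0], [0.0, -2.0]     # hypothetical network outputs
action = sample_action(mu, rho, rng)
log_pi = gaussian_log_prob(action, mu, rho)
```

Predicting the log-variance rather than the standard deviation keeps the network output unconstrained, because any real value of `rho` maps to a positive variance.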
Policy Gradient Methods