5. Discrete control and continuous control
2022-07-08 01:20:00 【C--G】
Discrete vs. Continuous Control
Discrete
Continuous
DQN: the output has one dimension per action, so it cannot be used directly for continuous control
Policy Network: the output likewise has one dimension per action, so it cannot be used directly for continuous control
To use DQN for continuous control, the continuous action space must first be discretized
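A minimal sketch of what discretizing a continuous action space can look like, assuming a box-shaped action space described by per-dimension lower and upper bounds. The function name `discretize_action_space`, the bounds, and the grid size are illustrative choices, not part of the original post.

```python
import itertools

def discretize_action_space(lows, highs, bins_per_dim):
    """Return every grid point of a box-shaped action space.

    lows, highs: per-dimension bounds (hypothetical values in the usage below).
    bins_per_dim: number of evenly spaced points per dimension (must be >= 2).
    """
    axes = []
    for low, high in zip(lows, highs):
        step = (high - low) / (bins_per_dim - 1)
        axes.append([low + i * step for i in range(bins_per_dim)])
    # The discrete action set is the Cartesian product of the per-dimension grids.
    return list(itertools.product(*axes))

# Two-dimensional action (e.g. two joint angles), 5 grid points per dimension.
actions = discretize_action_space([0.0, 0.0], [180.0, 360.0], bins_per_dim=5)
print(len(actions))  # 5 ** 2 = 25 discrete actions
```

The number of discrete actions is `bins_per_dim ** d` for a `d`-dimensional action, so the grid grows exponentially with the number of dimensions. This is why discretization is only practical for low-dimensional action spaces.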
Better Approaches to Continuous Control
Deterministic policy network
Updating Value Network by TD
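As a sketch of the TD step this heading names, assume a deterministic policy network $a = \pi(s; \theta)$ and a value network $q(s, a; w)$. These symbols, the discount factor $\gamma$, and the learning rate $\alpha$ are notation chosen here for illustration. For one transition $(s_t, a_t, r_t, s_{t+1})$:

```latex
\begin{aligned}
a'_{t+1} &= \pi(s_{t+1}; \theta) \\
\delta_t &= q(s_t, a_t; w) - \bigl(r_t + \gamma\, q(s_{t+1}, a'_{t+1}; w)\bigr) \\
w &\leftarrow w - \alpha\, \delta_t\, \frac{\partial q(s_t, a_t; w)}{\partial w}
\end{aligned}
```

The action $a'_{t+1}$ is used only to evaluate the TD target; the agent does not execute it.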
Updating Policy Network by DPG
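A sketch of the deterministic policy gradient (DPG) step, assuming a policy network $a = \pi(s; \theta)$, a value network $q(s, a; w)$, and a learning rate $\beta$ (all notation chosen here for illustration). The value network scores the action the policy proposes, and $\theta$ is moved in the direction that raises that score, with $w$ held fixed:

```latex
\begin{aligned}
g &= \frac{\partial q\bigl(s, \pi(s; \theta); w\bigr)}{\partial \theta}
   = \frac{\partial \pi(s; \theta)}{\partial \theta} \cdot
     \left.\frac{\partial q(s, a; w)}{\partial a}\right|_{a = \pi(s; \theta)} \\
\theta &\leftarrow \theta + \beta\, g
\end{aligned}
```

This is gradient ascent on the value of the policy's own action, obtained with the chain rule through $a = \pi(s; \theta)$.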
Improvement: Using Target Networks
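With target networks, the TD target is computed from slowly moving copies of the policy and value networks instead of the networks being trained. One common way to keep those copies up to date is a weighted average of the old target parameters and the current parameters. The sketch below assumes parameters stored as a list of NumPy arrays; the helper name `soft_update` and the mixing weight `tau` are illustrative choices, not taken from the post.

```python
import numpy as np

def soft_update(target_params, online_params, tau=0.01):
    """Move each target parameter a fraction tau toward its online counterpart."""
    return [tau * w + (1.0 - tau) * w_target
            for w, w_target in zip(online_params, target_params)]

# Toy parameters: online network at all ones, target network at all zeros.
online = [np.ones((2, 2)), np.ones(2)]
target = [np.zeros((2, 2)), np.zeros(2)]

# After one update with tau = 0.1, every target entry is 0.1.
target = soft_update(target, online, tau=0.1)
```

A small `tau` makes the target networks change slowly, which keeps the TD target from chasing the value network's own latest update.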
How to improve
Stochastic Policy for Continuous Control
Policy Network
Univariate Normal Distribution
Multivariate Normal Distribution
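For reference, a sketch of the two densities, assuming the dimensions of the action are independent (a diagonal covariance). The symbols $\mu$, $\sigma$, and the dimension $d$ are notation chosen here for illustration. The univariate case is

```latex
\pi(a \mid s) = \frac{1}{\sqrt{2\pi}\,\sigma}
  \exp\!\left(-\frac{(a - \mu)^2}{2\sigma^2}\right)
```

and under the independence assumption the $d$-dimensional case is the product of the per-dimension densities:

```latex
\pi(a \mid s) = \prod_{i=1}^{d} \frac{1}{\sqrt{2\pi}\,\sigma_i}
  \exp\!\left(-\frac{(a_i - \mu_i)^2}{2\sigma_i^2}\right)
```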
Function Approximation
Training Policy Network
Auxiliary Network
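A minimal sketch of an auxiliary function for a Gaussian policy, assuming the policy network outputs a mean vector `mu` and a log-variance vector `rho` (so that `rho[i] = ln(sigma[i] ** 2)`) and that action dimensions are independent. The names `sample_action`, `auxiliary_f`, and `log_density` are illustrative, and plain Python lists stand in for network outputs.

```python
import math
import random

def sample_action(mu, rho, rng):
    """Draw one action; the standard deviation is exp(rho / 2)."""
    return [rng.gauss(m, math.exp(0.5 * r)) for m, r in zip(mu, rho)]

def auxiliary_f(a, mu, rho):
    """Sum over dimensions of -rho/2 - (a - mu)^2 / (2 * exp(rho))."""
    return sum(-0.5 * r - (x - m) ** 2 / (2.0 * math.exp(r))
               for x, m, r in zip(a, mu, rho))

def log_density(a, mu, rho):
    """Log of the Gaussian density: auxiliary_f minus a constant in mu and rho."""
    return auxiliary_f(a, mu, rho) - 0.5 * len(a) * math.log(2.0 * math.pi)

rng = random.Random(0)
mu, rho = [0.0, 1.0], [0.0, -1.0]  # hypothetical network outputs for one state
a = sample_action(mu, rho, rng)
f_value = auxiliary_f(a, mu, rho)
```

Because `auxiliary_f` differs from the log-density only by a constant that does not depend on `mu` or `rho`, its gradient with respect to the network parameters equals the gradient of the log-density.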
Policy Gradient Methods
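As a sketch of how the stochastic policy gradient can be written for a Gaussian policy, let $f(s, a; \theta)$ denote a function that equals $\ln \pi(a \mid s; \theta)$ up to an additive constant, and let $Q_\pi(s, a)$ be the action value (notation chosen here for illustration). The stochastic gradient for a sampled action $a$ is then

```latex
g(a) = \frac{\partial \ln \pi(a \mid s; \theta)}{\partial \theta}\, Q_\pi(s, a)
     = \frac{\partial f(s, a; \theta)}{\partial \theta}\, Q_\pi(s, a)
```

Since $Q_\pi$ is unknown, it has to be approximated, for example by the observed return (REINFORCE) or by a value network (actor-critic).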