5. Discrete control and continuous control
2022-07-08 01:20:00 【C--G】
Discrete vs. Continuous Control
Discrete
Continuous
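DQN: the network has one output dimension per discrete action (one Q-value each), so it cannot be applied directly to continuous control.
Policy network: the output likewise has one dimension per discrete action (one probability each), so it cannot be applied directly to continuous control.
To use DQN for continuous control anyway, the continuous action space must first be discretized.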
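As a rough illustration of the discretization step, the sketch below lays a uniform grid over a box-shaped action space. The function name `discretize_action_space`, the two-dimensional example, and the ranges are assumptions made for this example and do not come from the original post.

```python
import itertools


def discretize_action_space(low, high, bins_per_dim):
    """Return every grid point of a uniform grid over the box [low, high].

    low, high: per-dimension bounds of the continuous action space.
    bins_per_dim: number of grid points along each dimension (must be >= 2).
    """
    axes = []
    for lo, hi in zip(low, high):
        step = (hi - lo) / (bins_per_dim - 1)
        axes.append([lo + i * step for i in range(bins_per_dim)])
    # The Cartesian product of the per-dimension grids is the discrete action set.
    return list(itertools.product(*axes))


# Hypothetical 2-dimensional action space, e.g. two joint angles in degrees.
actions = discretize_action_space(low=[0.0, 0.0], high=[180.0, 360.0], bins_per_dim=5)
print(len(actions))  # 5 ** 2 grid points
```

The number of discrete actions is `bins_per_dim ** d` for a `d`-dimensional action space, so the grid grows exponentially with the number of dimensions. This is why discretization is usually practical only for low-dimensional action spaces.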
Better Approaches to Continuous Control
Deterministic policy network
Updating Value Network by TD
Updating Policy Network by DPG
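The two updates named above can be written out as follows. This is the standard textbook form of the deterministic actor-critic update, not a transcription of the original slides. The symbols are assumptions: $\pi(s;\boldsymbol{\theta})$ is the deterministic policy network, $q(s,a;\mathbf{w})$ is the value network, $\alpha$ and $\beta$ are learning rates, and $\gamma$ is the discount factor.

```latex
\[
\begin{aligned}
% TD error of the value network; the next action is chosen by the policy network
\delta_t &= q(s_t, a_t; \mathbf{w})
  - \Bigl( r_t + \gamma \, q\bigl(s_{t+1}, \pi(s_{t+1}; \boldsymbol{\theta}); \mathbf{w}\bigr) \Bigr), \\
% Gradient descent on the squared TD error
\mathbf{w} &\leftarrow \mathbf{w}
  - \alpha \, \delta_t \, \frac{\partial q(s_t, a_t; \mathbf{w})}{\partial \mathbf{w}}, \\
% Deterministic policy gradient via the chain rule, evaluated at a = \pi(s_t; \theta)
\mathbf{g} &= \frac{\partial \pi(s_t; \boldsymbol{\theta})}{\partial \boldsymbol{\theta}}
  \cdot \left. \frac{\partial q(s_t, a; \mathbf{w})}{\partial a} \right|_{a = \pi(s_t; \boldsymbol{\theta})}, \\
% Gradient ascent on the value of the policy's own action
\boldsymbol{\theta} &\leftarrow \boldsymbol{\theta} + \beta \, \mathbf{g}.
\end{aligned}
\]
```

The value network is updated by gradient descent because it minimizes the TD error, while the policy network is updated by gradient ascent because it tries to increase the value that the critic assigns to its own action.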
Improvement: Using Target Networks
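A minimal sketch of the target-network idea, assuming parameters are stored as flat lists of floats. The helper names `td_target` and `soft_update` and the mixing coefficient `tau` are illustrative assumptions and are not taken from the original post. The TD target is computed from the target networks' estimate, and the target parameters then move slowly towards the online parameters by a weighted average.

```python
def td_target(reward, gamma, next_q_from_target, done=False):
    """TD target y = r + gamma * q_target(s', pi_target(s')).

    next_q_from_target is assumed to be the target value network's estimate
    at the action chosen by the target policy network.
    """
    if done:
        return reward
    return reward + gamma * next_q_from_target


def soft_update(target_params, online_params, tau):
    """Weighted average: new_target = tau * online + (1 - tau) * old_target."""
    return [tau * w + (1.0 - tau) * w_t
            for w_t, w in zip(target_params, online_params)]


# Hypothetical numbers, for illustration only.
y = td_target(reward=1.0, gamma=0.9, next_q_from_target=2.0)
new_target = soft_update(target_params=[0.0, 1.0], online_params=[1.0, 1.0], tau=0.1)
```

Because the target parameters lag behind the online parameters, the TD target depends less on the network being updated, which reduces the bias caused by bootstrapping.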
How to improve
Stochastic Policy for Continuous Control
Policy Network
Univariate Normal Distribution
Multivariate Normal Distribution
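For a stochastic policy over continuous actions, a common choice is a normal distribution with independent dimensions (diagonal covariance). The sketch below assumes the policy network outputs a mean `mu` and a log-variance `log_var` per action dimension; these names and the log-variance parameterization are assumptions made for this example. It shows how an action is sampled and how its log-density is computed.

```python
import math
import random


def sample_action(mu, log_var, rng=random):
    """Draw each action dimension independently from N(mu_i, sigma_i^2)."""
    return [rng.gauss(m, math.exp(0.5 * lv)) for m, lv in zip(mu, log_var)]


def log_density(action, mu, log_var):
    """Log-density of a diagonal multivariate normal distribution.

    Per dimension: -0.5*ln(2*pi) - 0.5*log_var - (a - mu)^2 / (2*exp(log_var)).
    """
    total = 0.0
    for a, m, lv in zip(action, mu, log_var):
        total += (-0.5 * math.log(2.0 * math.pi)
                  - 0.5 * lv
                  - (a - m) ** 2 / (2.0 * math.exp(lv)))
    return total


# Hypothetical 2-dimensional action with unit variance in each dimension.
mu, log_var = [0.0, 1.0], [0.0, 0.0]
action = sample_action(mu, log_var, rng=random.Random(0))
print(log_density(action, mu, log_var))
```

Predicting the log-variance instead of the variance keeps the network output unconstrained, since `exp(log_var)` is always positive.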
Function Approximation
Training Policy Network
Auxiliary Network
Policy Gradient Methods