5. Discrete control and continuous control
2022-07-08 01:20:00 【C--G】
Discrete VS Continuous Control
Discrete
Continuous
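DQN: the network outputs one value per action, so its output dimension equals the number of actions. It cannot be applied directly to continuous control, where there are infinitely many actions.
Policy network: likewise outputs one probability per action, so it cannot be applied directly to continuous control either.
To use DQN for continuous control, the continuous action space must first be discretized.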
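As a rough illustration of discretization, the sketch below lays a uniform grid over a box-shaped action space. The function name `discretize_action_space`, the two-joint example, and the bin counts are assumptions made for this example only. The number of grid points grows as `bins_per_dim ** dim`, which is why discretization only suits low-dimensional action spaces.

```python
import itertools
import numpy as np

def discretize_action_space(low, high, bins_per_dim):
    """Return every grid point of a box-shaped action space as a (num_actions, dim) array."""
    axes = [np.linspace(l, h, bins_per_dim) for l, h in zip(low, high)]
    return np.array(list(itertools.product(*axes)))

# Hypothetical 2-dimensional action (e.g. two joint angles), 5 grid values per dimension.
actions = discretize_action_space(low=[0.0, 0.0], high=[360.0, 360.0], bins_per_dim=5)
print(actions.shape)  # (25, 2): 5 ** 2 discrete actions

# The same resolution in 6 dimensions would already need 5 ** 6 = 15625 actions.
print(5 ** 6)
```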

Better Approaches to Continuous Control
Deterministic policy network
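A deterministic policy network maps a state directly to one action vector, a = π(s; θ), with no sampling involved. The following is a minimal NumPy sketch under assumed choices: one hidden layer, tanh activations, and a symmetric action bound. The layer sizes and the names `init_policy` and `policy` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_policy(state_dim, action_dim, hidden=32):
    """Randomly initialise the parameters theta of a one-hidden-layer policy."""
    return {
        "W1": rng.normal(0.0, 0.1, (state_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, action_dim)),
        "b2": np.zeros(action_dim),
    }

def policy(params, state, action_bound=1.0):
    """Deterministic policy a = pi(s; theta); tanh keeps each action in [-bound, bound]."""
    h = np.tanh(state @ params["W1"] + params["b1"])
    return action_bound * np.tanh(h @ params["W2"] + params["b2"])

params = init_policy(state_dim=4, action_dim=2)
state = np.array([0.1, -0.2, 0.3, 0.0])
a1 = policy(params, state)
a2 = policy(params, state)  # same state, same action: the mapping is deterministic
```

The output dimension equals the dimension of the action, not the number of possible actions, which is what makes this form usable for continuous control.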




Updating Value Network by TD
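For reference, one common way to write the TD update of the value network q(s, a; w) for a transition (s_t, a_t, r_t, s_{t+1}) is given below. The symbols γ (discount factor) and α (learning rate) are notation assumed here, and a'_{t+1} is only used to evaluate the target; it is not executed in the environment.

```latex
\begin{align*}
a'_{t+1} &= \pi(s_{t+1};\theta) \\
\delta_t &= q(s_t,a_t;w) - \bigl(r_t + \gamma\, q(s_{t+1},a'_{t+1};w)\bigr) \\
w &\leftarrow w - \alpha\,\delta_t\,\frac{\partial q(s_t,a_t;w)}{\partial w}
\end{align*}
```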

Updating Policy Network by DPG
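The deterministic policy gradient (DPG) improves the policy by ascending the value network's score of the action the policy proposes. A sketch of the update via the chain rule, with β as an assumed learning-rate symbol:

```latex
\begin{align*}
g &= \frac{\partial q\bigl(s,\pi(s;\theta);w\bigr)}{\partial\theta}
   = \frac{\partial a}{\partial\theta}\cdot\frac{\partial q(s,a;w)}{\partial a},
   \qquad a=\pi(s;\theta) \\
\theta &\leftarrow \theta + \beta\, g
\end{align*}
```

The value network's parameters w are held fixed during this step; only θ changes.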



Improvement: Using Target Networks
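With target networks, the TD target is computed from slowly changing copies of the policy and value networks, and those copies are moved toward the online networks by a weighted average after each update. The sketch below shows only that averaging step; the dictionary layout, the name `soft_update`, and the value of `tau` are assumptions for illustration.

```python
import numpy as np

def soft_update(target, online, tau=0.01):
    """Move each target parameter a fraction tau toward its online counterpart."""
    return {k: tau * online[k] + (1.0 - tau) * target[k] for k in online}

# Toy parameters standing in for network weights.
online = {"w": np.array([1.0, 2.0])}
target = {"w": np.array([0.0, 0.0])}

target = soft_update(target, online, tau=0.1)
print(target["w"])  # 10% of the way from the old target to the online weights
```

A small `tau` keeps the TD target nearly fixed between updates, which reduces the feedback loop caused by bootstrapping from the network being trained.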





How to improve 

Stochastic Policy for Continuous Control



Policy Network
Univariate Normal Distribution
Multivariate Normal Distribution
Function Approximation
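One way to approximate a Gaussian policy is to have the network output, for each action dimension, a mean μ_i and a log-variance ρ_i = ln σ_i², then sample a_i from N(μ_i, exp(ρ_i)). The sketch below uses two linear heads on the raw state as a stand-in for a real network; the weight shapes and the function names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_policy_heads(state, W_mu, W_rho):
    """Return the mean and log-variance of a diagonal Gaussian over actions."""
    mu = state @ W_mu
    rho = state @ W_rho  # rho_i = ln(sigma_i ** 2), unconstrained in sign
    return mu, rho

def sample_action(mu, rho, rng):
    """Draw a ~ N(mu, diag(exp(rho))), independently for each dimension."""
    sigma = np.exp(0.5 * rho)
    return mu + sigma * rng.standard_normal(mu.shape)

state = np.array([0.5, -1.0, 2.0])
W_mu = rng.normal(0.0, 0.1, (3, 2))   # hypothetical mean head
W_rho = rng.normal(0.0, 0.1, (3, 2))  # hypothetical log-variance head
mu, rho = gaussian_policy_heads(state, W_mu, W_rho)
action = sample_action(mu, rho, rng)
```

Predicting the log-variance instead of the variance avoids having to constrain the network output to be positive.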


Training Policy Network

Auxiliary Network
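An auxiliary function of the form f(s, a; θ) = -½ Σ_i [ρ_i + (a_i - μ_i)² / exp(ρ_i)] equals the log-density of a diagonal Gaussian policy up to an additive constant, so its gradient with respect to θ matches the gradient of ln π(a | s; θ). The check below assumes this particular form and uses made-up values for μ, ρ, and a.

```python
import numpy as np

def auxiliary_f(action, mu, rho):
    """f = -1/2 * sum_i [rho_i + (a_i - mu_i)^2 / exp(rho_i)]."""
    return -0.5 * np.sum(rho + (action - mu) ** 2 / np.exp(rho))

def gaussian_log_prob(action, mu, rho):
    """Exact log-density of N(mu, diag(exp(rho))) evaluated at action."""
    var = np.exp(rho)
    return np.sum(-0.5 * np.log(2.0 * np.pi * var) - (action - mu) ** 2 / (2.0 * var))

mu = np.array([0.2, -0.5])
rho = np.array([-1.0, 0.3])
action = np.array([0.0, 0.1])

# The two quantities differ only by -d/2 * ln(2*pi), which does not depend on theta.
const = -0.5 * mu.size * np.log(2.0 * np.pi)
gap = gaussian_log_prob(action, mu, rho) - auxiliary_f(action, mu, rho)
```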









Policy Gradient Methods
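As a summary sketch, the stochastic policy gradient can be written with an auxiliary function f whose gradient equals that of ln π, and the two usual methods differ only in how Q_π is approximated. The notation u_t (discounted return) and q(s, a; w) (value network) is assumed here.

```latex
\begin{align*}
f(s,a;\theta) &= \ln\pi(a\mid s;\theta) + \text{const} \\
g(a) &= \frac{\partial f(s,a;\theta)}{\partial\theta}\cdot Q_\pi(s,a) \\
\text{REINFORCE:}\quad & Q_\pi(s_t,a_t) \approx u_t = \sum_{k\ge t}\gamma^{\,k-t}\, r_k \\
\text{Actor-critic:}\quad & Q_\pi(s,a) \approx q(s,a;w)
\end{align*}
```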



