5. Discrete Control and Continuous Control
2022-07-08 01:20:00 【C - - G】
Discrete VS Continuous Control
Discrete
Continuous
DQN: the output has one dimension per action, so it cannot be applied directly to continuous control.
Policy Network: the output also has one dimension per action, so it cannot be applied directly to continuous control either.
To use DQN for continuous control, the continuous action space must first be discretized.
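A minimal sketch of the discretization idea, assuming a box-shaped action space sampled on a uniform grid. The function name `discretize_action_space`, the bounds, and the bin count are hypothetical choices made for this illustration; they do not come from the original notes.

```python
from itertools import product


def discretize_action_space(lows, highs, bins_per_dim):
    """Return every grid point of a box-shaped continuous action space.

    lows/highs are per-dimension bounds; bins_per_dim is the number of
    evenly spaced values kept in each dimension (must be >= 2).
    """
    axes = []
    for low, high in zip(lows, highs):
        step = (high - low) / (bins_per_dim - 1)
        axes.append([low + i * step for i in range(bins_per_dim)])
    # Cartesian product: one discrete action per combination of axis values.
    return list(product(*axes))


# Hypothetical example: two joint angles, each in [0, 360] degrees.
actions = discretize_action_space([0.0, 0.0], [360.0, 360.0], bins_per_dim=5)
print(len(actions))  # number of discrete actions: bins_per_dim ** 2
```

The number of grid points grows as `bins_per_dim ** d` for a `d`-dimensional action, so this approach is only practical when the action space has few dimensions.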

Better Approaches to Continuous Control
Deterministic policy network




Updating Value Network by TD
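The figures for this step did not survive extraction. As a hedged sketch of a standard TD update for the value network in an actor-critic setting, the code below uses a toy linear critic and a toy linear deterministic policy with scalar states and actions. These functional forms, the names `q_value`, `policy`, `td_update`, and the hyperparameter values are assumptions made for illustration only.

```python
def q_value(w, s, a):
    """Toy linear critic q(s, a; w) = w[0]*s + w[1]*a (assumed form)."""
    return w[0] * s + w[1] * a


def policy(theta, s):
    """Toy deterministic policy a = pi(s; theta) = theta * s (assumed form)."""
    return theta * s


def td_update(w, theta, s, a, r, s_next, gamma=0.9, alpha=0.1):
    """One TD step on the critic parameters w; returns (new_w, td_error)."""
    q_t = q_value(w, s, a)
    a_next = policy(theta, s_next)              # next action chosen by the policy
    y = r + gamma * q_value(w, s_next, a_next)  # TD target
    delta = q_t - y                             # TD error
    grad = [s, a]                               # dq/dw for the linear critic
    new_w = [wi - alpha * delta * gi for wi, gi in zip(w, grad)]
    return new_w, delta


new_w, delta = td_update([1.0, 1.0], 0.5, s=1.0, a=1.0, r=1.0, s_next=2.0)
print(new_w, delta)
```

With a neural-network critic, `grad` would be obtained by backpropagation instead of being written out by hand.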

Updating Policy Network by DPG
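A minimal sketch of one deterministic policy gradient (DPG) step, assuming a toy policy a = theta * s and a toy linear critic q(s, a; w) = w[0]*s + w[1]*a. These forms, the name `dpg_update`, and the step size `beta` are hypothetical. The gradient follows the chain rule: dq/dtheta = (da/dtheta) * (dq/da), and the policy parameter moves by gradient ascent.

```python
def dpg_update(theta, w, s, beta=0.1):
    """One deterministic-policy-gradient ascent step on theta.

    Assumes the toy forms a = theta * s and q(s, a; w) = w[0]*s + w[1]*a.
    """
    da_dtheta = s          # d pi(s; theta) / d theta
    dq_da = w[1]           # d q(s, a; w) / d a
    g = da_dtheta * dq_da  # chain rule gives dq / d theta
    return theta + beta * g  # ascent: increase the critic's score


theta_new = dpg_update(0.5, [1.0, 2.0], s=1.0)
print(theta_new)
```

The critic parameters `w` are held fixed during this step; only the policy parameter is changed so that the critic rates its action more highly.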



Improvement: Using Target Networks
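Target networks are commonly kept in sync with the online networks through a weighted average of their parameters. The sketch below shows that soft update on plain Python lists; the name `soft_update` and the value of `tau` are assumptions for illustration, not taken from the original notes.

```python
def soft_update(target_params, online_params, tau=0.01):
    """Move each target parameter a fraction tau toward the online parameter."""
    return [tau * p + (1.0 - tau) * tp
            for tp, p in zip(target_params, online_params)]


target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5)
print(target)
```

A small `tau` makes the target network change slowly, which keeps the TD target from chasing the parameters that are currently being updated.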





Improvement methods

Stochastic Policy for Continuous Control



Policy Network
Univariate Normal Distribution
Multivariate Normal Distribution
Function Approximation
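A hedged sketch of a Gaussian policy for continuous actions, assuming the policy outputs a mean `mu` and a log-variance `rho` for each action dimension, with independent dimensions (a diagonal covariance). The names `sample_action` and `log_prob` and this parameterization are assumptions for illustration.

```python
import math
import random


def sample_action(mu, rho, rng=random):
    """Sample each dimension from N(mu_i, sigma_i^2) with sigma_i^2 = exp(rho_i)."""
    return [rng.gauss(m, math.exp(0.5 * r)) for m, r in zip(mu, rho)]


def log_prob(a, mu, rho):
    """Log density of a diagonal Gaussian with means mu and log-variances rho."""
    total = 0.0
    for ai, m, r in zip(a, mu, rho):
        total += (-0.5 * math.log(2.0 * math.pi)
                  - 0.5 * r
                  - (ai - m) ** 2 / (2.0 * math.exp(r)))
    return total


action = sample_action([0.0, 1.0], [0.0, 0.0])
print(action, log_prob(action, [0.0, 1.0], [0.0, 0.0]))
```

Predicting the log-variance rather than the variance itself lets the outputs range over all real numbers while the implied variance stays positive.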


Training Policy Network

Auxiliary Network









Policy Gradient Methods



