5. Discrete Control and Continuous Control
2022-07-08 01:19:00 【C--G】
Discrete VS Continuous Control
Discrete
Continuous
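DQN's output has one dimension per action, so it cannot be used directly for continuous control, where the number of possible actions is infinite.
A discrete-action policy network likewise has one output dimension per action, so it cannot be used directly for continuous control either.
To use DQN for continuous control anyway, the continuous action space must first be discretized.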
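Discretization can be sketched as laying a grid over the action space. The example below is only an illustration: the function name `discretize`, the two-joint arm, and the grid size are assumptions, not something the original notes specify. It shows why this approach scales poorly: the number of grid actions is `points_per_dim ** d`, which grows exponentially with the action dimension `d`.

```python
import itertools

def discretize(low, high, points_per_dim):
    """Return every grid action for a box-shaped continuous action space."""
    axes = []
    for lo, hi in zip(low, high):
        step = (hi - lo) / (points_per_dim - 1)
        axes.append([lo + i * step for i in range(points_per_dim)])
    # Cartesian product of the per-dimension grids
    return list(itertools.product(*axes))

# Hypothetical 2-joint arm, each joint angle in [0, 180] degrees.
actions = discretize(low=[0.0, 0.0], high=[180.0, 180.0], points_per_dim=5)
print(len(actions))  # 5 ** 2 grid actions

# The same grid resolution in 6 dimensions already gives 5 ** 6 actions.
many = discretize(low=[0.0] * 6, high=[1.0] * 6, points_per_dim=5)
print(len(many))
```

A DQN over such a grid needs one output per grid action, which is why discretization is only practical for low-dimensional action spaces.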
Better Approaches to Continuous Control
Deterministic policy network
Updating Value Network by TD
Updating Policy Network by DPG
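The page keeps only the headings for these two updates, so the following is a minimal sketch under stated assumptions rather than the notes' own formulas. It assumes the usual deterministic actor-critic formulation: the value network is trained on the TD error, and the policy network is moved along the deterministic policy gradient, (da/dθ) · (dq/da). The scalar linear models `a = theta * s` and `q = w0 * s + w1 * a`, the function name, and all hyperparameters are hypothetical.

```python
def dpg_actor_critic_step(theta, w, transition, gamma=0.9, alpha=0.1, beta=0.1):
    """One TD update of a linear critic and one DPG update of a linear actor."""
    s, a, r, s_next = transition
    w0, w1 = w

    # Critic prediction for the observed (s, a)
    q_t = w0 * s + w1 * a
    # The next action comes from the deterministic policy, not from sampling
    a_next = theta * s_next
    q_next = w0 * s_next + w1 * a_next

    # TD error: prediction minus TD target
    delta = q_t - (r + gamma * q_next)
    # Gradient descent on the critic; dq/dw = (s, a) for this linear model
    new_w = (w0 - alpha * delta * s, w1 - alpha * delta * a)

    # DPG: dq/dtheta = da/dtheta * dq/da = s * w1 (uses the critic before its update)
    new_theta = theta + beta * s * w1
    return new_theta, new_w, delta

new_theta, new_w, delta = dpg_actor_critic_step(
    theta=0.5, w=(1.0, 2.0), transition=(1.0, 0.5, 1.0, 2.0)
)
print(new_theta, new_w, delta)
```

Note the opposite signs: the critic descends on the squared TD error, while the actor ascends on the critic's estimate of the action value.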
Improvement: Using Target Networks
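Target networks are commonly kept close to the online networks through a weighted average of parameters. The sketch below assumes that form; the name `soft_update` and the value of `tau` are illustrative choices, and the original notes may use a different schedule.

```python
def soft_update(target, online, tau=0.01):
    """Move each target parameter a fraction tau toward the online parameter."""
    return [tau * o + (1.0 - tau) * t for t, o in zip(target, online)]

# Hypothetical flat parameter vectors
target_params = [0.0, 1.0]
online_params = [1.0, 1.0]
updated = soft_update(target_params, online_params, tau=0.1)
print(updated)
```

A small `tau` makes the TD target change slowly, which is the point of using a separate target network when the TD target would otherwise be computed by the network being trained.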
Improvement Methods
Stochastic Policy for Continuous Control
Policy Network
Univariate Normal Distribution
Multivariate Normal Distribution
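For a stochastic policy over continuous actions, a common choice is a normal distribution whose dimensions are independent, so the multivariate density is a product of univariate ones. The sketch below assumes that diagonal form; the function names and the numeric values for the mean and standard deviation are hypothetical stand-ins for what a policy network would output.

```python
import math
import random

def sample_action(mu, sigma, rng):
    """Draw each action dimension independently from N(mu_i, sigma_i^2)."""
    return [rng.gauss(m, s) for m, s in zip(mu, sigma)]

def log_prob(action, mu, sigma):
    """Log-density of a normal distribution with independent dimensions."""
    total = 0.0
    for a, m, s in zip(action, mu, sigma):
        total += -0.5 * math.log(2 * math.pi) - math.log(s) - (a - m) ** 2 / (2 * s * s)
    return total

rng = random.Random(0)       # fixed seed for reproducibility
mu = [0.0, -1.0]             # hypothetical policy mean for some state
sigma = [0.5, 2.0]           # hypothetical per-dimension standard deviations
action = sample_action(mu, sigma, rng)
print(action, log_prob(action, mu, sigma))
```

Because the dimensions are independent, the log-density of the whole action is the sum of the per-dimension log-densities.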
Function Approximation
Training Policy Network
Auxiliary Network
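The page does not preserve the definition of the auxiliary network, so this is a sketch under an assumption: that the network outputs the mean `mu` and the log-variance `rho = ln(sigma^2)`, and that the auxiliary function is f = -1/2 · Σ (rho_i + (a_i - mu_i)² / exp(rho_i)). With that definition, f equals the log-probability of the action up to a constant that does not depend on the parameters, so both have the same gradient. The check below verifies this numerically for hypothetical values.

```python
import math

def auxiliary_f(action, mu, rho):
    """f = -1/2 * sum(rho_i + (a_i - mu_i)^2 / exp(rho_i)), with rho = ln(sigma^2)."""
    return -0.5 * sum(
        r + (a - m) ** 2 / math.exp(r) for a, m, r in zip(action, mu, rho)
    )

def gaussian_log_prob(action, mu, sigma):
    """Log-density of a normal distribution with independent dimensions."""
    return sum(
        -0.5 * math.log(2 * math.pi) - math.log(s) - (a - m) ** 2 / (2 * s * s)
        for a, m, s in zip(action, mu, sigma)
    )

action = [0.3, -1.2]                          # hypothetical action
mu = [0.0, -1.0]                              # hypothetical mean output
sigma = [0.5, 2.0]                            # hypothetical standard deviations
rho = [2.0 * math.log(s) for s in sigma]      # log-variance

# The difference is the constant -(d / 2) * ln(2 * pi), here with d = 2.
gap = gaussian_log_prob(action, mu, sigma) - auxiliary_f(action, mu, rho)
print(gap, -math.log(2 * math.pi))
```

Working with the log-variance rather than the variance lets the network output any real number without a positivity constraint.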
Policy Gradient Methods