3. Multi-Agent Reinforcement Learning
2022-07-08 01:26:00 【C--G】
Basic concepts
Settings

- Fully Cooperative Setting


- Fully Competitive Setting


- Mixed Cooperative & Competitive


- Self-Interested Setting


Basic terminology
State, Action, State Transition

Rewards

Returns
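As a minimal sketch of the return each agent accumulates, assuming the usual discounted form U_t = R_t + γ·R_{t+1} + γ²·R_{t+2} + …, the snippet below computes per-agent returns for one episode. The agent names, reward values, and discount factor are made up for illustration and do not come from the original post.

```python
from typing import Dict, List


def discounted_returns(rewards: List[float], gamma: float) -> List[float]:
    """Compute U_t = sum_k gamma^k * R_{t+k} for every step t of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical per-step rewards for two agents in the same episode.
episode_rewards: Dict[str, List[float]] = {
    "agent_1": [1.0, 0.0, 2.0],
    "agent_2": [0.0, 1.0, 1.0],
}

# Each agent has its own reward stream, hence its own return.
per_agent_returns = {
    name: discounted_returns(rewards, gamma=0.5)
    for name, rewards in episode_rewards.items()
}
```

In the multi-agent case every agent's rewards depend on the actions of all agents, so each return is random with respect to everyone's future actions, not only the agent's own.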

Policy Network

Uncertainty in the Return

State-Value Function


Convergence
- Single-Agent Policy Learning

- Multi-Agent Policy Learning

- Difficulty of MARL
Single-Agent Policy Gradient for MARL
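Each agent's expected return depends on the policies of all other agents, so one agent improving its own policy changes the objective of everyone else. A common convergence criterion in this situation is a Nash equilibrium: no agent can do better by changing its policy while the others keep theirs fixed. The sketch below checks this condition in a two-agent matrix game with deterministic (pure) policies. This is a deliberate simplification, and the payoff tables are hypothetical.

```python
from typing import List, Tuple

Payoff = List[List[float]]  # payoff[a1][a2] for joint action (a1, a2)


def is_pure_nash(payoff_1: Payoff, payoff_2: Payoff, a1: int, a2: int) -> bool:
    """True if neither agent gains by unilaterally deviating from (a1, a2)."""
    # Best agent 1 can do while agent 2 stays at a2.
    best_1 = max(payoff_1[alt][a2] for alt in range(len(payoff_1)))
    # Best agent 2 can do while agent 1 stays at a1.
    best_2 = max(payoff_2[a1][alt] for alt in range(len(payoff_2[a1])))
    return payoff_1[a1][a2] >= best_1 and payoff_2[a1][a2] >= best_2


def pure_nash_equilibria(payoff_1: Payoff, payoff_2: Payoff) -> List[Tuple[int, int]]:
    """Enumerate every joint action that satisfies the Nash condition."""
    return [
        (a1, a2)
        for a1 in range(len(payoff_1))
        for a2 in range(len(payoff_1[0]))
        if is_pure_nash(payoff_1, payoff_2, a1, a2)
    ]


# Hypothetical fully cooperative game: both agents receive the same payoff.
shared_payoff = [[2.0, 0.0], [0.0, 1.0]]
equilibria = pure_nash_equilibria(shared_payoff, shared_payoff)
```

In this toy game both (0, 0) and (1, 1) satisfy the condition, even though (1, 1) pays less, which illustrates that convergence does not by itself imply the best joint outcome.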



Architectures


Fully Decentralized
- Execution

- Actor-Critic Method

Fully Centralized

- Method


- Shortcoming: Slow during Execution

Centralized Training with Decentralized Execution
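The three architectures differ mainly in which information the actor (policy) and the critic (value function) are allowed to see. The sketch below makes that explicit under the usual actor-critic formulation: it returns only the input lists and trains nothing. The function name, architecture labels, and example values are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def actor_critic_inputs(
    architecture: str,
    agent: int,
    observations: Sequence[Sequence[float]],
    actions: Sequence[int],
) -> Dict[str, List[float]]:
    """Return the inputs seen by one agent's actor and critic."""
    local_obs = list(observations[agent])
    all_obs = [x for obs in observations for x in obs]
    local_action = [actions[agent]]
    all_actions = list(actions)

    if architecture == "fully_decentralized":
        # Actor and critic both use only the agent's own information.
        return {"actor": local_obs, "critic": local_obs + local_action}
    if architecture == "fully_centralized":
        # A central controller sees everything, also when acting.
        return {"actor": all_obs, "critic": all_obs + all_actions}
    if architecture == "ctde":
        # Actor stays local (fast execution); critic is central (training only).
        return {"actor": local_obs, "critic": all_obs + all_actions}
    raise ValueError(f"unknown architecture: {architecture}")


# Hypothetical observations and actions for two agents.
observations = [[0.1, 0.2], [0.3, 0.4]]
actions = [1, 0]
ctde_inputs = actor_critic_inputs("ctde", 0, observations, actions)
```

Because the actor in the "ctde" case needs only the local observation, the central critic can be discarded after training and each agent can act without communication.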





Parameter Sharing
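A minimal sketch of parameter sharing, assuming a toy linear policy in place of a real neural network: with sharing enabled, every agent holds a reference to the same policy object, so the number of trainable parameters does not grow with the number of agents. The class and function names are hypothetical.

```python
import random
from typing import List


class LinearPolicy:
    """Toy linear scorer with one weight row per action (not a real network)."""

    def __init__(self, obs_dim: int, n_actions: int, seed: int) -> None:
        rng = random.Random(seed)
        self.weights = [
            [rng.uniform(-1.0, 1.0) for _ in range(obs_dim)]
            for _ in range(n_actions)
        ]

    def act(self, obs: List[float]) -> int:
        """Pick the action with the highest linear score."""
        scores = [sum(w * x for w, x in zip(row, obs)) for row in self.weights]
        return scores.index(max(scores))

    def num_parameters(self) -> int:
        return sum(len(row) for row in self.weights)


def build_policies(n_agents: int, obs_dim: int, n_actions: int, share: bool) -> List[LinearPolicy]:
    """Create one policy per agent, or one shared policy reused by all agents."""
    if share:
        shared = LinearPolicy(obs_dim, n_actions, seed=0)
        return [shared] * n_agents
    return [LinearPolicy(obs_dim, n_actions, seed=i) for i in range(n_agents)]


def total_parameters(policies: List[LinearPolicy]) -> int:
    """Count parameters once per distinct policy object."""
    unique = {id(p): p for p in policies}
    return sum(p.num_parameters() for p in unique.values())


shared_count = total_parameters(build_policies(3, obs_dim=4, n_actions=2, share=True))
separate_count = total_parameters(build_policies(3, obs_dim=4, n_actions=2, share=False))
```

Sharing tends to suit agents that are interchangeable; agents with different roles would normally keep separate parameters.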



