3. Multi-agent reinforcement learning
2022-07-08 01:26:00 【C--G】
Basic concepts
Settings
- Fully Cooperative Setting
- Fully Competitive Setting
- Mixed Cooperative & Competitive
- Self-Interested Setting
Basic terminology
State, Action, State Transition
Rewards
Returns
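As a minimal sketch of the return, assuming the usual discounted definition (each agent's return at step t is its own reward at t plus the discounted sum of its later rewards), the per-agent returns of one finished episode can be computed backwards in time. The function name `discounted_returns`, the `rewards[i][t]` layout, and the sample numbers are illustrative assumptions, not taken from the original post.

```python
from typing import List


def discounted_returns(rewards: List[List[float]], gamma: float) -> List[List[float]]:
    """Compute U[i][t] = rewards[i][t] + gamma * U[i][t + 1] for every agent i.

    rewards[i][t] is the reward agent i received at step t (assumed layout).
    """
    all_returns = []
    for agent_rewards in rewards:
        returns = [0.0] * len(agent_rewards)
        running = 0.0
        # Walk backwards so each step reuses the return of the next step.
        for t in reversed(range(len(agent_rewards))):
            running = agent_rewards[t] + gamma * running
            returns[t] = running
        all_returns.append(returns)
    return all_returns


# Two agents with opposite rewards (a fully competitive toy episode).
rewards = [[1.0, 0.0, 2.0], [-1.0, 0.0, -2.0]]
returns = discounted_returns(rewards, gamma=0.5)
print(returns)
```

Because every agent has its own reward sequence, every agent also has its own return, and in general each return depends on the actions of all agents.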
Policy Network
Uncertainty in the Return
State-Value Function
Convergence
- Single-Agent Policy Learning
- Multi-Agent Policy Learning
- Difficulty of MARL
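One way to make multi-agent convergence concrete, assuming it is meant in the Nash-equilibrium sense commonly used in MARL (no agent can improve its own payoff by changing only its own policy), is to check a small two-player matrix game. The sketch below only handles pure strategies; the function name and the prisoner's-dilemma payoffs are illustrative assumptions, not part of the original post.

```python
from typing import List, Tuple


def pure_nash_equilibria(
    payoff_a: List[List[float]], payoff_b: List[List[float]]
) -> List[Tuple[int, int]]:
    """Return all (row, col) action pairs where neither player gains by deviating alone.

    payoff_a[i][j] / payoff_b[i][j] are the payoffs when A plays i and B plays j.
    """
    n_rows, n_cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for i in range(n_rows):
        for j in range(n_cols):
            # A cannot do better by switching rows while B keeps column j.
            a_is_best = all(payoff_a[i][j] >= payoff_a[k][j] for k in range(n_rows))
            # B cannot do better by switching columns while A keeps row i.
            b_is_best = all(payoff_b[i][j] >= payoff_b[i][l] for l in range(n_cols))
            if a_is_best and b_is_best:
                equilibria.append((i, j))
    return equilibria


# Prisoner's dilemma: action 0 = cooperate, action 1 = defect.
payoff_a = [[-1, -3], [0, -2]]
payoff_b = [[-1, 0], [-3, -2]]
equilibria = pure_nash_equilibria(payoff_a, payoff_b)
print(equilibria)
```

The example also hints at why MARL is harder than the single-agent case: the best action for one agent depends on what the other agent does, so an agent's learning target shifts whenever the other agent's policy changes.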
Single-Agent Policy Gradient for MARL
Architectures
Fully Decentralized
- Execution
- Actor-Critic Method
Fully Centralized
- Method
- Shortcoming: Slow during Execution
Centralized Training with Decentralized Execution
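The split between training and execution can be sketched as follows, under a common reading of this architecture: each agent keeps its own actor that sees only its local observation, while a central critic used only during training sees the observations and actions of all agents. The classes `Actor` and `CentralCritic`, the linear models, and all dimensions are hypothetical choices for illustration; the original post does not specify an implementation.

```python
import math
import random
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class Actor:
    """Decentralized policy: maps one agent's own observation to action probabilities."""

    def __init__(self, obs_dim: int, n_actions: int, rng: random.Random):
        self.weights = [[rng.gauss(0.0, 0.1) for _ in range(obs_dim)]
                        for _ in range(n_actions)]

    def action_probs(self, obs: List[float]) -> List[float]:
        logits = [sum(w * o for w, o in zip(row, obs)) for row in self.weights]
        return softmax(logits)

    def act(self, obs: List[float], rng: random.Random) -> int:
        probs = self.action_probs(obs)
        return rng.choices(range(len(probs)), weights=probs)[0]


class CentralCritic:
    """Training-time critic: scores the joint observations and joint actions."""

    def __init__(self, n_agents: int, obs_dim: int, n_actions: int, rng: random.Random):
        self.n_actions = n_actions
        in_dim = n_agents * (obs_dim + n_actions)
        self.weights = [rng.gauss(0.0, 0.1) for _ in range(in_dim)]

    def value(self, all_obs: List[List[float]], all_actions: List[int]) -> float:
        features: List[float] = []
        for obs, action in zip(all_obs, all_actions):
            features.extend(obs)
            # One-hot encode each agent's discrete action.
            features.extend(1.0 if k == action else 0.0 for k in range(self.n_actions))
        return sum(w * f for w, f in zip(self.weights, features))


rng = random.Random(0)
n_agents, obs_dim, n_actions = 2, 3, 4
actors = [Actor(obs_dim, n_actions, rng) for _ in range(n_agents)]
critic = CentralCritic(n_agents, obs_dim, n_actions, rng)

observations = [[rng.random() for _ in range(obs_dim)] for _ in range(n_agents)]
# Execution: each agent acts from its own observation, with no communication.
actions = [actor.act(obs, rng) for actor, obs in zip(actors, observations)]
# Training only: the critic is evaluated on everyone's observations and actions.
joint_value = critic.value(observations, actions)
```

Only the actors are needed once training is done, which avoids the execution-time delay of routing every decision through a central controller.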
Parameter Sharing