Hunan University | Robust Multi-Agent Reinforcement Learning in Noisy Environments
2022-07-04 01:46:00 【Zhiyuan community】
Despite recent progress in reinforcement learning (RL), agents trained with RL are usually sensitive to their environment, especially in multi-agent scenarios. Existing multi-agent reinforcement learning methods work well only under the assumption of a perfect environment. Real-world environments, however, are usually noisy. Inaccurate information obtained from a noisy environment hinders agent learning and can even cause training to fail. This paper addresses the problem of training multiple robust agents in noisy environments and proposes a new algorithm, multi-agent fault-tolerant reinforcement learning (MAFTRL). The main idea is to give each agent its own error detection mechanism and to design a communication medium for exchanging information between agents. The error detection mechanism is based on an autoencoder: it computes the reliability of each agent's observation and thereby reduces the effect of environmental noise. The communication medium, based on an attention mechanism, significantly improves the agents' ability to extract useful information. Experimental results show that the method accurately detects erroneous agent observations and achieves good performance and strong robustness in both conventional reliable environments and noisy environments. In noisy environments, MAFTRL clearly outperforms traditional methods.
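The abstract does not give implementation details, so the following is only a rough sketch of how the two ideas could fit together, not the MAFTRL algorithm itself. It assumes that reliability is derived from autoencoder reconstruction error and that the attention logits are shifted by log-reliability. Every name here (`toy_reconstruct`, `reliability`, `attend`, `temperature`) is hypothetical, and the "autoencoder" is a toy stand-in that encodes an observation to its mean.

```python
import math

def toy_reconstruct(obs):
    """Toy stand-in for a trained autoencoder (assumption, not from the paper):
    encode to the mean of the vector, decode by broadcasting that mean."""
    code = sum(obs) / len(obs)
    return [code] * len(obs)

def reliability(obs, reconstruct=toy_reconstruct, temperature=1.0):
    """Map reconstruction error to a score in (0, 1]; 1.0 means no error."""
    recon = reconstruct(obs)
    mse = sum((o - r) ** 2 for o, r in zip(obs, recon)) / len(obs)
    return math.exp(-mse / temperature)

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, messages, reliabilities):
    """Scaled dot-product attention whose logits are shifted by log-reliability,
    so unreliable senders receive less weight (an assumed design choice)."""
    scale = math.sqrt(len(query))
    logits = [
        sum(q * k for q, k in zip(query, key)) / scale
        + math.log(max(rel, 1e-12))  # floor avoids log(0) for extreme errors
        for key, rel in zip(keys, reliabilities)
    ]
    weights = softmax(logits)
    dim = len(messages[0])
    mixed = [sum(w * m[d] for w, m in zip(weights, messages)) for d in range(dim)]
    return weights, mixed

# Two agents send observations; the second one is corrupted by noise.
clean = [1.0, 1.0, 1.0, 1.0]
noisy = [1.0, 5.0, -3.0, 1.0]
rels = [reliability(clean), reliability(noisy)]
weights, mixed = attend([1.0, 0.0], [[1.0, 0.0], [1.0, 0.0]], [clean, noisy], rels)
```

In this toy setup the two senders have identical keys, so the attention weights differ only because of the reliability term: the corrupted observation is reconstructed poorly and contributes almost nothing to the aggregated message.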