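Central South University | Understanding via Exploration: Discovery of Interpretable Features With Deep Reinforcement Learning
2022-07-03 16:25:00 【BAAI Community (智源社区)】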
【Title】Understanding via Exploration: Discovery of Interpretable Features With Deep Reinforcement Learning
【Authors】Jiawen Wei, Zhifeng Qiu, Fangyuan Wang, Wenwei Lin, Ning Gui, Weihua Gui
【Publication date】2022.6.28
【Paper link】https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9810174
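【Why we recommend it】Understanding an environment through interaction has become one of the most important intellectual activities by which humans master unknown systems. Deep reinforcement learning (DRL) is known to achieve effective control in many applications through human-like exploration and exploitation. However, the opaque nature of deep neural networks (DNNs) often hides critical control-related information that is essential for understanding the target system. This paper first proposes a new online feature selection framework, dual-world-based attentive feature selection (D-AFS), to identify how much each input contributes over the whole control process. Unlike the single world used in most DRL, D-AFS has both a real world and a virtual world with distorted features. A newly introduced attention-based evaluation (AR) module provides a dynamic mapping from the real world to the virtual world. Existing DRL algorithms need only slight modification to learn in the dual world. By analyzing the DRL agent's responses in the two worlds, D-AFS can quantitatively identify the importance of each feature for control.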
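The dual-world idea can be illustrated with a toy sketch. This is not the authors' D-AFS implementation: the paper learns the real-to-virtual mapping with an attention module during DRL training, whereas the sketch below swaps in a much simpler perturbation test with a fixed 0/1 mask. All names (`toy_policy`, `to_virtual`, `feature_importance`) and the linear policy weights are hypothetical, chosen only to show how comparing an agent's responses in a real and a distorted world can yield a per-feature importance score.

```python
import random


def toy_policy(obs):
    """Hypothetical fixed linear policy standing in for a trained DRL agent."""
    weights = [2.0, 0.0, -1.0]  # assumption: feature 1 is ignored by design
    return sum(w * x for w, x in zip(weights, obs))


def to_virtual(obs, attention):
    """Map a real observation to a distorted 'virtual' one by per-feature scaling."""
    return [a * x for a, x in zip(attention, obs)]


def feature_importance(policy, observations):
    """Score each feature by how much the policy output changes when it is masked."""
    n_features = len(observations[0])
    scores = []
    for i in range(n_features):
        attention = [1.0] * n_features
        attention[i] = 0.0  # distort only feature i in the virtual world
        gap = sum(
            abs(policy(o) - policy(to_virtual(o, attention)))
            for o in observations
        )
        scores.append(gap / len(observations))
    total = sum(scores)
    # Normalize so the scores sum to 1 (left unchanged if every gap is zero).
    return [s / total for s in scores] if total > 0 else scores


random.seed(0)
observations = [[random.gauss(0, 1) for _ in range(3)] for _ in range(1000)]
importance = feature_importance(toy_policy, observations)
print([round(v, 2) for v in importance])
```

In this toy setting the ignored feature receives a score of zero and the feature with the largest weight receives the highest score. A learned attention mapping, as described in the paper, would replace the hand-set mask used here.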