Central South University | Understanding via Exploration: Discovery of Interpretable Features With Deep Reinforcement Learning
2022-07-03 16:25:00 【BAAI Community (智源社区)】
【Title】Understanding via Exploration: Discovery of Interpretable Features With Deep Reinforcement Learning
【Authors】Jiawen Wei, Zhifeng Qiu, Fangyuan Wang, Wenwei Lin, Ning Gui, Weihua Gui
【Published】2022.6.28
【Paper link】https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9810174
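【Recommendation】Understanding an environment through interaction has become one of the most important intellectual activities by which humans master unknown systems. Deep reinforcement learning (DRL) is known to achieve effective control in many applications through human-like exploration and exploitation. However, the opaque nature of deep neural networks (DNNs) often hides key control-related information that is essential for understanding the target system. This paper proposes a new online feature selection framework, dual-world-based attentive feature selection (D-AFS), to identify the contribution of each input over the whole control process. Unlike the single world used in most DRL, D-AFS has both a real world and a virtual world with twisted features. A newly introduced attention-based evaluation (AR) module realizes the dynamic mapping from the real world to the virtual world. Existing DRL algorithms need only slight modification to learn in the dual world. By analyzing the DRL agent's responses in the two worlds, D-AFS can quantitatively identify the importance of each feature to control.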
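The idea of comparing a policy's response in a real world and in a virtual world with twisted features can be illustrated with a small toy sketch. This is not the authors' D-AFS implementation: the summary above gives no details of the attention-based module, so the sketch swaps in a plain permutation-style perturbation (shuffling one feature column at a time) as a stand-in, and uses a fixed linear function in place of a trained DRL policy. The names `twist_feature`, `feature_importance`, and the toy `policy` are hypothetical and chosen only for illustration.

```python
import numpy as np


def twist_feature(states, j, rng):
    """Build a toy 'virtual world' batch: a copy of states with column j shuffled."""
    virtual = states.copy()
    virtual[:, j] = rng.permutation(virtual[:, j])
    return virtual


def feature_importance(policy, states, rng):
    """Score each feature by the mean absolute change in the policy output
    between the real batch and the batch with that feature twisted.
    Scores are normalized to sum to 1 when any feature matters."""
    real_actions = policy(states)
    scores = np.empty(states.shape[1])
    for j in range(states.shape[1]):
        virtual_actions = policy(twist_feature(states, j, rng))
        scores[j] = np.mean(np.abs(real_actions - virtual_actions))
    total = scores.sum()
    return scores / total if total > 0 else scores


rng = np.random.default_rng(0)
states = rng.normal(size=(1000, 4))

# Assumption: a fixed linear map stands in for a trained policy.
# It ignores features 1 and 3 entirely.
weights = np.array([2.0, 0.0, -1.0, 0.0])


def policy(s):
    return s @ weights


importance = feature_importance(policy, states, rng)
print(importance)
```

In this toy setting the ignored features receive a score of zero, and the feature with the larger weight receives the larger score. A real dual-world setup would, per the summary, learn the mapping between worlds with an attention module during training rather than perturb a fixed policy after the fact.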