当前位置:网站首页>《强化学习周刊》第52期:Depth-CUPRL、DistSPECTRL & Double Deep Q-Network
《强化学习周刊》第52期:Depth-CUPRL、DistSPECTRL & Double Deep Q-Network
2022-07-06 00:33:00 【智源社区】
告诉大家一个好消息,《强化学习周刊》开启“订阅功能”,以后我们会向您自动推送最新版的《强化学习周刊》。订阅方法:
1,注册智源社区账号
2,点击周刊界面左上角的作者栏部分“强化学习周刊”(如下图),进入“强化学习周刊”主页。

3,点击“关注TA”(如下图)
4,您已经完成《强化学习周刊》订阅啦,以后智源社区会自动向您推送最新版的《强化学习周刊》!
论文推荐
。
标题:Deep Reinforcement Learning with Swin Transformer(奥斯陆大学:Li Meng | 基于Swin-Transformer的深度强化学习)
简介:
https://arxiv.org/pdf/2206.15269.pdf
标题:Depth-CUPRL: Depth-Imaged Contrastive Unsupervised Prioritized Representations in Reinforcement Learning for Mapless Navigation of Unmanned Aerial Vehicles(FURG : Junior C. de Jesus | Depth-CUPRL:无人机Mapless导航强化学习中的深度图像对比无监督优先表示)
简介:
https://arxiv.org/pdf/2206.15211.pdf
标题:Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement Learning(University of Toronto:Anthony Coache | 深度强化学习的条件可诱导动态风险度量)
简介:
https://arxiv.org/pdf/2206.14666.pdf
标题:Traffic Management of Autonomous Vehicles using Policy Based Deep Reinforcement Learning and Intelligent Routing(巴基斯坦工程与应用科学学院 (PIEAS):Anum Mushtaq | 基于策略的深度强化学习和智能路由的自动驾驶汽车交通管理)
简介:
https://arxiv.org/pdf/2206.14608.pdf
标题:DistSPECTRL: Distributing Specifications in Multi-Agent Reinforcement Learning Systems(普渡大学:Joe Eappen | DistSPECTRL:多智能体强化学习系统中的分发规范)
简介:
https://arxiv.org/pdf/2206.13754.pdf
标题:Applications of Reinforcement Learning in Finance -- Trading with a Double Deep Q-Network(ZHAW:Frensi Zejnullahu | 强化学习在金融交易中的应用——双深度Q网络交易)
简介:
https://arxiv.org/pdf/2206.14267.pdf
标题:An optimization planning framework for allocating multiple distributed energy resources and electric vehicle charging stations in distribution networks(金山大学: Kayode E. Adetunji|配电网多分布式能源和电动汽车充电站优化配置规划框架)
简介:
https://www.sciencedirect.com/sdfe/reader/pii/S0306261922008339/pdf
标题:Deep Reinforcement Learning for Personalized Driving Recommendations to Mitigate Aggressiveness and Riskiness: Modeling and Impact Assessment(雅典国家技术大学: Eleni G. Mantouka |用于减轻攻击性和风险的个性化驾驶建议的深度强化学习:建模和影响评估)
简介:
https://www.sciencedirect.com/sdfe/reader/pii/S0968090X22002029/pdf
标题:Understanding via Exploration: Discovery of Interpretable Features With Deep Reinforcement Learning(中南大学: Jiawen Wei |通过探索理解: 发现具有深度强化学习的可解释特征)
简介:
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9810174
标题:The flying sidekick traveling salesman problem with stochastic travel time: A reinforcement learning approach(田纳西大学: Zeyu Liu |随机旅行时间的无人机与卡车联合运输问题:一种强化学习方法)
简介:
https://www.sciencedirect.com/sdfe/reader/pii/S1366554522002034/pdf
标题:Data efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics(香港理工大学: C. Chen|数据高效的强化学习和网络流量动态的自适应最优周界控制)
简介:现有的数据驱动和反馈流量控制策略没有考虑实时数据测量的异构性。传统的交通控制强化学习(RL)方法缺乏数据效率,收敛缓慢
容易受到内生不确定性的影响。本文提出了基于整体强化学习 (IRL) 的方法来学习宏观交通动态,以实现自适应最优周界控制。主要贡献:(a)开发了具有离散增益更新的连续时间控制,以适应离散时间传感器数据。(b) 为了降低采样复杂度并更有效地使用可用数据,将经验重放 (ER) 技术引入 IRL 算法。(c) 所提出的方法以“无模型”的方式放宽了对模型校准的要求 。(d) 基于 IRL 的算法的收敛性和受控交通动态的稳定性 理论证明。最优控制律被参数化,然后通过神经网络 (NN) 进行逼近,从而降低了计算复杂度。论文链接:https://www.sciencedirect.com/sdfe/reader/pii/S0968090X22001929/pdf
标题:Clustering Experience Replay for the Effective Exploitation in Reinforcement Learning(电子科技大学: Min Li|强化学习中有效利用的聚类经验回放)
简介:
https://www.sciencedirect.com/science/article/pii/S0031320322003569
标题:Target localization using Multi-Agent Deep Reinforcement Learning with Proximal Policy Optimization(康考迪亚大学: Ahmed Alagha|使用具有近端策略优化的多智能体深度强化学习进行目标定位)
简介:
https://www.sciencedirect.com/science/article/pii/S0167739X22002266
标题:Utility Theory for Sequential Decision Making(麦吉尔大学: Ahmed Alagha| ICML 2022: 顺序决策的效用理论)
简介:
https://arxiv.org/pdf/2206.13637.pdf
标题:Short-Term Plasticity Neurons Learning to Learn and Forget(华为&伦敦大学学院: Hector Garcia Rodriguez| ICML 2022: 短时可塑性神经元学习和遗忘)
简介:
https://arxiv.org/pdf/2206.14048.pdf
边栏推荐
- Priority queue (heap)
- Uniapp development, packaged as H5 and deployed to the server
- Shardingsphere source code analysis
- Extracting profile data from profile measurement
- Global and Chinese markets for hinged watertight doors 2022-2028: Research Report on technology, participants, trends, market size and share
- Gd32f4xx UIP protocol stack migration record
- 如何制作自己的机器人
- Determinant learning notes (I)
- 认识提取与显示梅尔谱图的小实验(观察不同y_axis和x_axis的区别)
- MySQL global lock and table lock
猜你喜欢
[designmode] Decorator Pattern
Knowledge about the memory size occupied by the structure
Idea远程提交spark任务到yarn集群
Huawei equipment is configured with OSPF and BFD linkage
Teach you to run uni app with simulator on hbuilderx, conscience teaching!!!
Browser local storage
Opencv classic 100 questions
Common API classes and exception systems
Leetcode:20220213 week race (less bugs, top 10% 555)
从底层结构开始学习FPGA----FIFO IP核及其关键参数介绍
随机推荐
NLP basic task word segmentation third party Library: ICTCLAS [the third party library with the highest accuracy of Chinese word segmentation] [Chinese Academy of Sciences] [charge]
NSSA area where OSPF is configured for Huawei equipment
Room cannot create an SQLite connection to verify the queries
MySQL之函数
Power query data format conversion, Split Merge extraction, delete duplicates, delete errors, transpose and reverse, perspective and reverse perspective
[Chongqing Guangdong education] reference materials for Zhengzhou Vocational College of finance, taxation and finance to play around the E-era
wx. Getlocation (object object) application method, latest version
How to use the flutter framework to develop and run small programs
STM32按键消抖——入门状态机思维
[designmode] Decorator Pattern
Go learning --- structure to map[string]interface{}
电机的简介
2022.7.5-----leetcode. seven hundred and twenty-nine
notepad++正则表达式替换字符串
MySQL存储引擎
Permission problem: source bash_ profile permission denied
PHP determines whether an array contains the value of another array
Idea远程提交spark任务到yarn集群
LeetCode 1598. Folder operation log collector
Comment faire votre propre robot