当前位置:网站首页>《强化学习周刊》第52期:Depth-CUPRL、DistSPECTRL & Double Deep Q-Network
《强化学习周刊》第52期:Depth-CUPRL、DistSPECTRL & Double Deep Q-Network
2022-07-06 00:33:00 【智源社区】
告诉大家一个好消息,《强化学习周刊》开启“订阅功能”,以后我们会向您自动推送最新版的《强化学习周刊》。订阅方法:
1,注册智源社区账号
2,点击周刊界面左上角的作者栏部分“强化学习周刊”(如下图),进入“强化学习周刊”主页。

3,点击“关注TA”(如下图)

4,您已经完成《强化学习周刊》订阅啦,以后智源社区会自动向您推送最新版的《强化学习周刊》!
论文推荐
。
标题:Deep Reinforcement Learning with Swin Transformer(奥斯陆大学:Li Meng | 基于Swin-Transformer的深度强化学习)
简介:
https://arxiv.org/pdf/2206.15269.pdf
标题:Depth-CUPRL: Depth-Imaged Contrastive Unsupervised Prioritized Representations in Reinforcement Learning for Mapless Navigation of Unmanned Aerial Vehicles(FURG : Junior C. de Jesus | Depth-CUPRL:无人机Mapless导航强化学习中的深度图像对比无监督优先表示)
简介:
https://arxiv.org/pdf/2206.15211.pdf
标题:Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement Learning(University of Toronto:Anthony Coache | 深度强化学习的条件可诱导动态风险度量)
简介:
https://arxiv.org/pdf/2206.14666.pdf
标题:Traffic Management of Autonomous Vehicles using Policy Based Deep Reinforcement Learning and Intelligent Routing(巴基斯坦工程与应用科学学院 (PIEAS):Anum Mushtaq | 基于策略的深度强化学习和智能路由的自动驾驶汽车交通管理)
简介:
https://arxiv.org/pdf/2206.14608.pdf
标题:DistSPECTRL: Distributing Specifications in Multi-Agent Reinforcement Learning Systems(普渡大学:Joe Eappen | DistSPECTRL:多智能体强化学习系统中的分发规范)
简介:
https://arxiv.org/pdf/2206.13754.pdf
标题:Applications of Reinforcement Learning in Finance -- Trading with a Double Deep Q-Network(ZHAW:Frensi Zejnullahu | 强化学习在金融交易中的应用——双深度Q网络交易)
简介:
https://arxiv.org/pdf/2206.14267.pdf
标题:An optimization planning framework for allocating multiple distributed energy resources and electric vehicle charging stations in distribution networks(金山大学: Kayode E. Adetunji|配电网多分布式能源和电动汽车充电站优化配置规划框架)
简介:
https://www.sciencedirect.com/sdfe/reader/pii/S0306261922008339/pdf
标题:Deep Reinforcement Learning for Personalized Driving Recommendations to Mitigate Aggressiveness and Riskiness: Modeling and Impact Assessment(雅典国家技术大学: Eleni G. Mantouka |用于减轻攻击性和风险的个性化驾驶建议的深度强化学习:建模和影响评估)
简介:
https://www.sciencedirect.com/sdfe/reader/pii/S0968090X22002029/pdf
标题:Understanding via Exploration: Discovery of Interpretable Features With Deep Reinforcement Learning(中南大学: Jiawen Wei |通过探索理解: 发现具有深度强化学习的可解释特征)
简介:
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9810174
标题:The flying sidekick traveling salesman problem with stochastic travel time: A reinforcement learning approach(田纳西大学: Zeyu Liu |随机旅行时间的无人机与卡车联合运输问题:一种强化学习方法)
简介:
https://www.sciencedirect.com/sdfe/reader/pii/S1366554522002034/pdf
标题:Data efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics(香港理工大学: C. Chen|数据高效的强化学习和网络流量动态的自适应最优周界控制)
简介:现有的数据驱动和反馈流量控制策略没有考虑实时数据测量的异构性。传统的交通控制强化学习(RL)方法缺乏数据效率,收敛缓慢容易受到内生不确定性的影响。本文提出了基于整体强化学习 (IRL) 的方法来学习宏观交通动态,以实现自适应最优周界控制。主要贡献:(a)开发了具有离散增益更新的连续时间控制,以适应离散时间传感器数据。(b) 为了降低采样复杂度并更有效地使用可用数据,将经验重放 (ER) 技术引入 IRL 算法。(c) 所提出的方法以“无模型”的方式放宽了对模型校准的要求。(d) 基于 IRL 的算法的收敛性和受控交通动态的稳定性理论证明。最优控制律被参数化,然后通过神经网络 (NN) 进行逼近,从而降低了计算复杂度。
论文链接:https://www.sciencedirect.com/sdfe/reader/pii/S0968090X22001929/pdf
标题:Clustering Experience Replay for the Effective Exploitation in Reinforcement Learning(电子科技大学: Min Li|强化学习中有效利用的聚类经验回放)
简介:
https://www.sciencedirect.com/science/article/pii/S0031320322003569
标题:Target localization using Multi-Agent Deep Reinforcement Learning with Proximal Policy Optimization(康考迪亚大学: Ahmed Alagha|使用具有近端策略优化的多智能体深度强化学习进行目标定位)
简介:
https://www.sciencedirect.com/science/article/pii/S0167739X22002266
标题:Utility Theory for Sequential Decision Making(麦吉尔大学: Ahmed Alagha| ICML 2022: 顺序决策的效用理论)
简介:
https://arxiv.org/pdf/2206.13637.pdf
标题:Short-Term Plasticity Neurons Learning to Learn and Forget(华为&伦敦大学学院: Hector Garcia Rodriguez| ICML 2022: 短时可塑性神经元学习和遗忘)
简介:
https://arxiv.org/pdf/2206.14048.pdf

边栏推荐
- 建立时间和保持时间的模型分析
- NLP generation model 2017: Why are those in transformer
- Wechat applet -- wxml template syntax (with notes)
- Codeforces gr19 D (think more about why the first-hand value range is 100, JLS yyds)
- Spark DF增加一列
- [QT] QT uses qjson to generate JSON files and save them
- 从底层结构开始学习FPGA----FIFO IP核及其关键参数介绍
- notepad++正则表达式替换字符串
- [Chongqing Guangdong education] reference materials for Zhengzhou Vocational College of finance, taxation and finance to play around the E-era
- Global and Chinese markets of universal milling machines 2022-2028: Research Report on technology, participants, trends, market size and share
猜你喜欢

Data analysis thinking analysis methods and business knowledge - analysis methods (III)

anconda下载+添加清华+tensorflow 安装+No module named ‘tensorflow‘+KernelRestarter: restart failed,内核重启失败

MySQL storage engine

Go learning - dependency injection

MySql——CRUD

数据分析思维分析方法和业务知识——分析方法(三)

FFmpeg学习——核心模块

MDK debug时设置数据实时更新

Uniapp development, packaged as H5 and deployed to the server

Introduction of motor
随机推荐
FFMPEG关键结构体——AVFrame
从底层结构开始学习FPGA----FIFO IP核及其关键参数介绍
LeetCode 斐波那契序列
Codeforces gr19 D (think more about why the first-hand value range is 100, JLS yyds)
N1 # if you work on a metauniverse product [metauniverse · interdisciplinary] Season 2 S2
Yolov5, pychar, Anaconda environment installation
如何解决ecology9.0执行导入流程流程产生的问题
LeetCode 6005. The minimum operand to make an array an alternating array
Notepad + + regular expression replace String
Spark DF adds a column
LeetCode 6004. Get operands of 0
7.5 simulation summary
Location based mobile terminal network video exploration app system documents + foreign language translation and original text + guidance records (8 weeks) + PPT + review + project source code
QT -- thread
小程序技术优势与产业互联网相结合的分析
Global and Chinese markets of POM plastic gears 2022-2028: Research Report on technology, participants, trends, market size and share
LeetCode 8. String conversion integer (ATOI)
LeetCode 6006. Take out the least number of magic beans
Huawei equipment is configured with OSPF and BFD linkage
Leetcode 450 deleting nodes in a binary search tree
