Reinforcement Learning Weekly, Issue 52: Depth-CUPRL, DistSPECTRL & Double Deep Q-Network
2022-07-06 00:33:00 [BAAI Community (智源社区)]
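Good news: Reinforcement Learning Weekly now supports subscriptions, so new issues will be pushed to you automatically. To subscribe:
1. Register a BAAI Community (智源社区) account.
2. In the author bar at the top left of the weekly's page, click "强化学习周刊" (Reinforcement Learning Weekly) to open its home page.
3. Click "关注TA" (Follow).
4. You are now subscribed to Reinforcement Learning Weekly, and the BAAI Community will automatically push each new issue to you.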
Recommended papers
Title: Deep Reinforcement Learning with Swin Transformer (University of Oslo: Li Meng)
Summary:
Paper link: https://arxiv.org/pdf/2206.15269.pdf
Title: Depth-CUPRL: Depth-Imaged Contrastive Unsupervised Prioritized Representations in Reinforcement Learning for Mapless Navigation of Unmanned Aerial Vehicles (FURG: Junior C. de Jesus)
Summary:
Paper link: https://arxiv.org/pdf/2206.15211.pdf
Title: Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement Learning (University of Toronto: Anthony Coache)
Summary:
Paper link: https://arxiv.org/pdf/2206.14666.pdf
Title: Traffic Management of Autonomous Vehicles using Policy Based Deep Reinforcement Learning and Intelligent Routing (Pakistan Institute of Engineering and Applied Sciences (PIEAS): Anum Mushtaq)
Summary:
Paper link: https://arxiv.org/pdf/2206.14608.pdf
Title: DistSPECTRL: Distributing Specifications in Multi-Agent Reinforcement Learning Systems (Purdue University: Joe Eappen)
Summary:
Paper link: https://arxiv.org/pdf/2206.13754.pdf
Title: Applications of Reinforcement Learning in Finance -- Trading with a Double Deep Q-Network (ZHAW: Frensi Zejnullahu)
Summary:
Paper link: https://arxiv.org/pdf/2206.14267.pdf
Title: An optimization planning framework for allocating multiple distributed energy resources and electric vehicle charging stations in distribution networks (University of the Witwatersrand: Kayode E. Adetunji)
Summary:
Paper link: https://www.sciencedirect.com/sdfe/reader/pii/S0306261922008339/pdf
Title: Deep Reinforcement Learning for Personalized Driving Recommendations to Mitigate Aggressiveness and Riskiness: Modeling and Impact Assessment (National Technical University of Athens: Eleni G. Mantouka)
Summary:
Paper link: https://www.sciencedirect.com/sdfe/reader/pii/S0968090X22002029/pdf
Title: Understanding via Exploration: Discovery of Interpretable Features With Deep Reinforcement Learning (Central South University: Jiawen Wei)
Summary:
Paper link: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9810174
Title: The flying sidekick traveling salesman problem with stochastic travel time: A reinforcement learning approach (University of Tennessee: Zeyu Liu)
Summary:
Paper link: https://www.sciencedirect.com/sdfe/reader/pii/S1366554522002034/pdf
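Title: Data efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics (The Hong Kong Polytechnic University: C. Chen)
Summary: Existing data-driven and feedback traffic control strategies do not account for the heterogeneity of real-time data measurements. Traditional reinforcement learning (RL) methods for traffic control lack data efficiency, converge slowly, and are vulnerable to endogenous uncertainties. This paper proposes an integral reinforcement learning (IRL) based approach that learns macroscopic traffic dynamics for adaptive optimal perimeter control. Main contributions: (a) a continuous-time control with discrete gain updates is developed to accommodate discrete-time sensor data; (b) to reduce sample complexity and use the available data more efficiently, the experience replay (ER) technique is introduced into the IRL algorithm; (c) the proposed method relaxes the requirement for model calibration in a "model-free" manner; (d) the convergence of the IRL-based algorithm and the stability of the controlled traffic dynamics are proven theoretically. The optimal control law is parameterized and then approximated by neural networks (NN), which reduces the computational complexity.
Paper link: https://www.sciencedirect.com/sdfe/reader/pii/S0968090X22001929/pdf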
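For readers unfamiliar with experience replay (ER), the idea mentioned in contribution (b) can be illustrated with a minimal, generic sketch. This is not the paper's IRL algorithm: the `ReplayBuffer` class, its method names, and the toy transitions below are all hypothetical and only show how stored samples are kept in a fixed-size buffer and reused through random mini-batches.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of past transitions (generic illustration only)."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently discards the oldest item when full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state):
        # Store one transition as a tuple.
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Draw a mini-batch without replacement; never request more
        # items than are currently stored.
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


# Toy usage: five transitions pushed into a buffer of capacity three,
# so only the three most recent ones survive.
buffer = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    buffer.add(state=t, action=0, reward=-1.0, next_state=t + 1)
batch = buffer.sample(batch_size=2)
```

Because each stored transition can appear in many mini-batches, a learner can extract more updates from the same measurements, which is the data-efficiency argument the summary refers to.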
Title: Clustering Experience Replay for the Effective Exploitation in Reinforcement Learning (University of Electronic Science and Technology of China: Min Li)
Summary:
Paper link: https://www.sciencedirect.com/science/article/pii/S0031320322003569
Title: Target localization using Multi-Agent Deep Reinforcement Learning with Proximal Policy Optimization (Concordia University: Ahmed Alagha)
Summary:
Paper link: https://www.sciencedirect.com/science/article/pii/S0167739X22002266
Title: Utility Theory for Sequential Decision Making (McGill University: Mehran Shakerinava | ICML 2022)
Summary:
Paper link: https://arxiv.org/pdf/2206.13637.pdf
Title: Short-Term Plasticity Neurons Learning to Learn and Forget (Huawei & University College London: Hector Garcia Rodriguez | ICML 2022)
Summary:
Paper link: https://arxiv.org/pdf/2206.14048.pdf