ICML 2022: UFRGS | Optimistic Linear Support and Successor Features as a Basis for Optimal Policy Transfer
2022-06-27 23:31:00 【Zhiyuan community】
【Title】Optimistic Linear Support and Successor Features as a Basis for Optimal Policy Transfer
【Authors】Lucas N. Alegre, Ana L. C. Bazzan, Bruno C. da Silva
【Publication date】2022.6.22
【Paper link】https://arxiv.org/pdf/2206.11326.pdf
【Why we recommend it】In many real-world applications, a reinforcement learning (RL) agent may have to solve multiple tasks, each typically modeled by a reward function. If the reward functions are linear and the agent has already learned a set of policies for different tasks, it can use successor features (SFs) to combine those policies and identify reasonable solutions to new problems. However, the solutions identified this way are not guaranteed to be optimal. This paper introduces a new algorithm that addresses this limitation. It allows RL agents to combine existing policies and directly identify the optimal policy for any new problem, without any further interaction with the environment. The paper first shows that the transfer learning problem tackled by SFs is equivalent to the problem of learning to optimize multiple objectives in RL. It then introduces an SF-based extension of the Optimistic Linear Support algorithm to learn a set of policies whose SFs form a convex coverage set. Experiments show that the method outperforms state-of-the-art competing algorithms in both discrete and continuous domains.
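
To make the idea of combining existing policies through SFs more concrete, here is a minimal sketch of generalized policy improvement (GPI), one common way SFs are used for transfer. It is not the paper's new algorithm. It assumes a linear reward with weight vector w, so that the value of policy i for action a in the current state is the dot product of that policy's SF vector with w. The function name `gpi_action`, the array `psi`, and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def gpi_action(psi, w):
    """Pick an action by generalized policy improvement (GPI).

    psi: array of shape (n_policies, n_actions, d) holding, for the current
         state, the successor features of each known policy and action.
    w:   array of shape (d,), the reward weights of the new task.
    """
    # Linear reward assumption: Q_i(s, a) = psi_i(s, a) . w
    q = psi @ w  # shape (n_policies, n_actions)
    # For each action take the best value over the known policies,
    # then act greedily with respect to that maximum.
    return int(np.argmax(q.max(axis=0)))


# Toy SFs (made-up numbers) for two known policies, two actions, d = 2.
psi = np.array([
    [[1.0, 0.0], [0.2, 0.1]],  # policy 0
    [[0.1, 0.2], [0.0, 1.0]],  # policy 1
])

# Two hypothetical new tasks that weight the two reward features differently.
action_task_a = gpi_action(psi, np.array([0.9, 0.1]))
action_task_b = gpi_action(psi, np.array([0.2, 0.8]))
```

No new environment interaction is needed to evaluate a new w here, but as the summary notes, a policy obtained by combining an arbitrary set of policies in this way is not guaranteed to be optimal.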









