Robot Reinforcement Learning: Transferring End-to-End Visuomotor Control from Simulation to Real World (CoRL 2017)
2022-06-29 04:44:00 【千羽QY】
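6.1 Introduction
Target setting: end-to-end visuomotor control (image -> torque) for multi-stage tasks.
Task: locate the cube, reach for it, grasp it, locate the basket, and drop the cube into the basket.
The method has two main steps: (1) compute and collect trajectories in simulation, i.e. the control velocities at each step; (2) train a CNN that learns the mapping from images to velocities, using domain randomization for augmentation.
Demonstration data: computed by solving inverse kinematics in Cartesian space; the images are first-person.
Network: takes images and joint angles as input and outputs motor velocities (a PID controller drives each joint to the commanded velocity). Adding the cube position and the gripper position as auxiliary outputs improves network performance.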
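One way to picture the auxiliary outputs is as extra terms in the training loss. The sketch below is only an illustration under assumptions: the function names, the use of mean squared error for every term, and the `aux_weight` parameter are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def total_loss(
    vel_pred: Sequence[float],
    vel_target: Sequence[float],
    cube_pred: Sequence[float],
    cube_target: Sequence[float],
    gripper_pred: Sequence[float],
    gripper_target: Sequence[float],
    aux_weight: float = 1.0,  # hypothetical weighting, not from the paper
) -> float:
    """Velocity loss plus weighted auxiliary position losses."""
    main = mse(vel_pred, vel_target)
    # Auxiliary terms: predicted cube and gripper positions.
    aux = mse(cube_pred, cube_target) + mse(gripper_pred, gripper_target)
    return main + aux_weight * aux
```

Setting `aux_weight=0.0` recovers a loss on the velocities alone, which makes it easy to compare training with and without the auxiliary outputs.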
The experiments demonstrate the policy under dynamic lighting, with distractor objects, and with moving objects.
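The domain randomization used in step (2) can be thought of as sampling a fresh scene configuration for every simulated episode. The following is a minimal sketch, assuming a hypothetical `SceneConfig` structure; the choice of randomized fields and all numeric ranges are illustrative assumptions, not the paper's settings.

```python
import random
from dataclasses import dataclass
from typing import Tuple

RGB = Tuple[float, float, float]


@dataclass
class SceneConfig:
    """Hypothetical per-episode scene parameters (illustrative only)."""
    cube_rgb: RGB
    basket_rgb: RGB
    table_rgb: RGB
    light_intensity: float
    camera_offset: Tuple[float, float, float]  # metres, assumed range
    num_distractors: int


def sample_scene(rng: random.Random) -> SceneConfig:
    """Draw one randomized scene; all ranges are assumptions."""
    def rgb() -> RGB:
        return (rng.random(), rng.random(), rng.random())

    return SceneConfig(
        cube_rgb=rgb(),
        basket_rgb=rgb(),
        table_rgb=rgb(),
        light_intensity=rng.uniform(0.3, 1.5),
        camera_offset=(
            rng.uniform(-0.02, 0.02),
            rng.uniform(-0.02, 0.02),
            rng.uniform(-0.02, 0.02),
        ),
        num_distractors=rng.randint(0, 3),
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducible sampling
    for _ in range(3):
        print(sample_scene(rng))
```

Each sampled configuration would then be applied to the simulator before rendering the demonstration images, so the network never sees the same colors, lighting, or camera pose twice.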
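6.2 Method
1. How is the multi-stage task handled? Does the network know the current task progress?
The paper does not state this explicitly. My guess is that the LSTM network takes four consecutive frames as input and learns the current task progress from how the images change.
For example:
(1) When the gripper is empty, the network's current goal is to move the gripper above the cube.
(2) When the gripper has been moving over the four consecutive frames and is above the cube in the current frame, the network's current goal is to close the gripper.
(3) When the cube is in the gripper, the network's current goal is to move the gripper above the basket.
(4) When the gripper has reached a position above the basket, the network's current goal is to open the gripper.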
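The four cases above can be written out as an explicit rule table to make the conjecture concrete. This is only an illustration: under the guess above, the network would learn this mapping implicitly from images, and it has no explicit stage variable. The `Stage` enum and the boolean predicates are hypothetical names, and the sketch simplifies case (2) by dropping the "has been moving" cue.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical sub-goals matching cases (1) to (4)."""
    REACH_CUBE = 1      # move the gripper above the cube
    CLOSE_GRIPPER = 2   # grasp the cube
    MOVE_TO_BASKET = 3  # carry the cube above the basket
    OPEN_GRIPPER = 4    # release the cube


def infer_stage(holding_cube: bool, above_cube: bool, above_basket: bool) -> Stage:
    """Pick the current sub-goal from three assumed scene predicates."""
    if holding_cube:
        # Cases (3) and (4): the cube is in the gripper.
        return Stage.OPEN_GRIPPER if above_basket else Stage.MOVE_TO_BASKET
    # Cases (1) and (2): the gripper is empty.
    return Stage.CLOSE_GRIPPER if above_cube else Stage.REACH_CUBE


if __name__ == "__main__":
    print(infer_stage(holding_cube=False, above_cube=False, above_basket=False))
    print(infer_stage(holding_cube=True, above_cube=False, above_basket=True))
```

Writing the rules out this way also shows why temporal context helps: predicates such as "holding the cube" are hard to read from a single frame but easier to infer from a short history of frames.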
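6.3 Thoughts
6.3.1 Problems with this paper
1. Behavior cloning cannot handle unseen scenes, which is why 1 million images are needed. If the policy were first trained with behavior cloning and then with reinforcement learning, might fewer samples be required?
2. The model is too rigid: grasping a round object or placing it into a different basket would require retraining the network. One option is to add the task goal to the network input.
6.3.2 Ideas
1. Behavior cloning + reinforcement learning.
2. Add the task goal (e.g. an image of the object, a knowledge-graph node) to the network input so that the method can handle different objects.
3. The network input could include the gripper state, similar to QT-Opt.
4. Worth borrowing: joint velocities as the network output, the domain randomization scheme, and using an LSTM to learn the task state and progress.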
6.4 Notes on the original paper
PDF download: https://download.csdn.net/download/qq_40081208/85788235






