Robot reinforcement learning - Transferring End-to-End Visuomotor Control from Simulation to Real World (CoRL 2017)
2022-06-29 04:48:00 【Qianyu QY】
6.1 Brief introduction
Target setting: end-to-end visuomotor control (images -> torque), multi-stage tasks.
Task: locate the cube, reach, grasp, locate the basket, put the cube in the basket.
The method has two main steps: (1) compute and collect trajectories in a simulation environment, i.e. the control velocities; (2) train a CNN to learn the mapping from images to velocities, using domain randomization for augmentation.
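As a rough illustration of the domain randomization in step (2), the sketch below samples a fresh set of scene parameters for every simulated episode. The parameter names (`cube_rgb`, `light_intensity`, `camera_offset_cm`, `num_distractors`) and all ranges are assumptions made for this example, not values taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneParams:
    """One randomized scene configuration (all fields are hypothetical)."""
    cube_rgb: tuple
    basket_rgb: tuple
    table_rgb: tuple
    light_intensity: float
    camera_offset_cm: tuple
    num_distractors: int


def _random_rgb(rng):
    # Each colour channel is drawn uniformly from [0, 1).
    return tuple(rng.random() for _ in range(3))


def sample_scene(rng):
    """Draw one randomized scene for a simulated episode."""
    return SceneParams(
        cube_rgb=_random_rgb(rng),
        basket_rgb=_random_rgb(rng),
        table_rgb=_random_rgb(rng),
        light_intensity=rng.uniform(0.3, 1.5),   # assumed range
        camera_offset_cm=tuple(rng.uniform(-2.0, 2.0) for _ in range(3)),
        num_distractors=rng.randint(0, 3),       # inclusive on both ends
    )


rng = random.Random(0)  # fixed seed so the episodes are reproducible
scenes = [sample_scene(rng) for _ in range(5)]
for scene in scenes:
    print(scene)
```

Each sampled `SceneParams` would be applied to the simulator before rendering the training images, so that the CNN sees a different appearance in every episode.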
Demonstration data: generated by solving inverse kinematics in Cartesian coordinates; the images are first-person.
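To make the idea of generating demonstrations through inverse kinematics concrete, here is a minimal sketch for a planar two-link arm. The link lengths, the start and goal positions, and the helper names (`inverse_kinematics`, `demo_trajectory`) are assumptions for illustration; the robot in the paper has more joints and the paper's own planner is not reproduced here.

```python
import math

L1_LEN, L2_LEN = 0.30, 0.25  # hypothetical link lengths in metres


def inverse_kinematics(x, y):
    """Closed-form IK for a planar two-link arm (solution with q2 >= 0)."""
    cos_q2 = (x * x + y * y - L1_LEN**2 - L2_LEN**2) / (2 * L1_LEN * L2_LEN)
    if not -1.0 <= cos_q2 <= 1.0:
        raise ValueError("target is out of reach")
    q2 = math.acos(cos_q2)
    q1 = math.atan2(y, x) - math.atan2(L2_LEN * math.sin(q2),
                                       L1_LEN + L2_LEN * math.cos(q2))
    return q1, q2


def forward_kinematics(q1, q2):
    """End-effector position for the given joint angles."""
    x = L1_LEN * math.cos(q1) + L2_LEN * math.cos(q1 + q2)
    y = L1_LEN * math.sin(q1) + L2_LEN * math.sin(q1 + q2)
    return x, y


def demo_trajectory(start, goal, steps=20, dt=0.05):
    """Straight Cartesian line -> joint angles -> joint velocity labels."""
    waypoints = [
        (start[0] + (goal[0] - start[0]) * i / steps,
         start[1] + (goal[1] - start[1]) * i / steps)
        for i in range(steps + 1)
    ]
    joints = [inverse_kinematics(x, y) for x, y in waypoints]
    # Finite differences give the velocity that moves one waypoint to the next.
    velocities = [
        ((b[0] - a[0]) / dt, (b[1] - a[1]) / dt)
        for a, b in zip(joints, joints[1:])
    ]
    return joints, velocities


joints, velocities = demo_trajectory(start=(0.4, 0.1), goal=(0.2, 0.3))
print(len(joints), "joint configurations,", len(velocities), "velocity labels")
```

In a data-collection loop of this kind, each rendered image would be paired with the corresponding entry of `velocities` as the supervised training target.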
Network: the input is the image and the joint angles; the output is the motor velocities (a PID controller drives each joint to the commanded velocity). Adding auxiliary outputs for the positions of the cube and the gripper improves network performance.
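The low-level loop that turns a commanded velocity into joint motion can be sketched with a textbook PID controller. The gains, the time step, and the first-order toy joint model below are assumptions chosen for this example; they are not the controller settings used in the paper.

```python
class PID:
    """Textbook PID controller acting on a velocity error."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, target, measured, dt):
        error = target - measured
        self.integral += error * dt
        # No derivative term on the first call, when there is no history yet.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_joint(target_velocity, steps=600, dt=0.01):
    """Drive a toy joint (hypothetical first-order dynamics) to the target."""
    pid = PID(kp=2.0, ki=5.0, kd=0.01)  # assumed gains
    velocity = 0.0
    for _ in range(steps):
        command = pid.step(target_velocity, velocity, dt)
        # Toy model: acceleration = command minus viscous friction.
        velocity += (command - 0.5 * velocity) * dt
    return velocity


final_velocity = simulate_joint(1.0)
print(f"velocity after 6 s: {final_velocity:.4f} rad/s")
```

In this arrangement the network only has to predict `target_velocity` for each joint at every step, while the PID loop absorbs the motor dynamics.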
Experiments: results are demonstrated under dynamic lighting, with distractors present, and with moving objects.
6.2 Method
1. How are multi-stage tasks implemented? Does the network know the current task progress?
The paper does not state this explicitly. Presumably the LSTM network takes four consecutive image frames as input and infers the current task progress from the changes between them.
For example:
(1) When the gripper is empty, the network's current goal is to move the gripper above the cube.
(2) When the gripper has been moving over the four consecutive frames and is above the cube in the current frame, the network's current goal is to close the gripper.
(3) When the cube is in the gripper, the network's current goal is to move the gripper above the basket.
(4) When the gripper has moved above the basket, the network's current goal is to open the gripper.
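The four cases above can be written out as an explicit rule table, which may help clarify what the LSTM would have to infer implicitly from the last four frames. This is only an illustration of the guess made in these notes: the paper does not describe such a state machine, and the `Observation` fields are hypothetical stand-ins for what the network would have to read from the images.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical per-frame facts the network would have to infer."""
    gripper_above_cube: bool
    cube_in_gripper: bool
    gripper_above_basket: bool


def current_goal(obs):
    """Map the latest observation to the sub-goal of cases (1) to (4)."""
    if obs.cube_in_gripper:
        if obs.gripper_above_basket:
            return "open gripper"                  # case (4)
        return "move gripper above basket"         # case (3)
    if obs.gripper_above_cube:
        return "close gripper"                     # case (2)
    return "move gripper above cube"               # case (1)


# Keep only the four most recent frames, mirroring the assumed LSTM input.
history = deque(maxlen=4)
episode = [
    Observation(False, False, False),
    Observation(True, False, False),
    Observation(False, True, False),
    Observation(False, True, False),
    Observation(False, True, True),
]
goals = []
for obs in episode:
    history.append(obs)
    goals.append(current_goal(history[-1]))
print(goals)
```

A learned policy would not have access to these boolean flags; the point of the sketch is only that the sub-goal is a function of what is visible in recent frames, which is why a short frame history might suffice.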
6.3 Thoughts
6.3.1 Issues with the paper
1. Behavioral cloning cannot handle scenes it has never seen, so about 1 million images are needed. If the policy were first trained with behavioral cloning and then refined with reinforcement learning, might fewer samples be required?
2. The model is too rigid: to grasp a round object or put it in a different basket, the network has to be retrained. One could try adding the task target to the network input.
6.3.2 Ideas
1. Behavioral cloning + reinforcement learning.
2. Add the task target (e.g. an object image, a map node, etc.) to the network input, so that the method can handle different objects.
3. The network input can include the state of the gripper, similar to QT-Opt.
4. Worth borrowing: joint velocities as the network output, domain randomization, and an LSTM that learns the task state and progress.
6.4 Notes on the original paper
PDF download: https://download.csdn.net/download/qq_40081208/85788235






