Dex-Net 2.0 for grasp detection
2022-07-03 05:16:00 【Qianyu QY】
Preface
Papers on planar grasp detection now appear regularly in top journals. They claim to handle multi-object stacked scenes, but the actual results are not satisfactory. I think the main reasons are as follows:
(1) There is a lack of grasp datasets for multi-object stacked scenes. The most commonly used datasets today, the Cornell Grasp Dataset and the Jacquard dataset, contain only single-object scenes. (Resources such as the Dex-Net dataset and Google's robot farm are things ordinary researchers can only look at from afar.)
(2) Researchers currently put too much emphasis on end-to-end detection.
(3) There is no suitable grasp representation.
How to learn, from a small amount of data, either an end-to-end network or a sub-network within a larger method, so that planar grasp detection works in stacked scenes, is a problem that urgently needs solving. The stacked scene meant here is the one in the left figure below, not the one in the right figure that most papers refer to:

Among the many lines of robot grasping research, Dex-Net has attracted many researchers because its algorithm is simple and its results are strong. The best algorithm in the series so far is Dex-Net 4.0. This post and the next few will briefly introduce the series.
First, a brief overview of the Dex-Net series. Dex-Net comes in four versions, 1.0 to 4.0. Version 1.0 is a traditional analytic method and is not covered here. Version 2.0 evaluates the quality of parallel-jaw grasp configurations with deep learning, 3.0 is designed for suction cups, and 4.0 combines the 2.0 and 3.0 algorithms. I have uploaded the demo videos to Bilibili:
Dex-Net 2.0: https://www.bilibili.com/video/BV1WT4y1M7Ec
Dex-Net 4.0: https://www.bilibili.com/video/BV1bi4y157P5
1. Approach
The algorithm takes a depth map as input and outputs a planar grasp representation, that is, a coordinate point and a grasp angle. The parallel-jaw gripper is then opened to its maximum width and the grasp is executed vertically. The algorithm has two main parts: sampling grasp candidates and evaluating grasp quality.
Sampling grasp candidates: sample many candidate grasp configurations from the given depth map.
Grasp quality evaluation: score each candidate from the previous step with a quality in [0,1], then output the highest-quality grasp configuration, as shown in the figure below.
The first step uses a traditional method and the second uses deep learning. To train the network, the authors went to the extreme of building a dataset of 6.7 million samples. To connect the two stages, the network input cannot be the raw depth map; it is a carefully designed and cropped depth patch. The best part of the whole algorithm is that it breaks away from the conventional idea of end-to-end grasp detection, which predicts the optimal grasp configuration directly.

Dex-Net 2.0 represents a grasp as (x,y,theta), where (x,y) is the coordinate of the grasp point in the depth map and theta is the grasp direction. To execute the grasp, the gripper is opened to its maximum width and then closes on the object from vertically above, as shown in the figure below:
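As a small illustration of the (x,y,theta) representation, the sketch below computes where the two jaws would sit in the image for a given opening width. The class name `PlanarGrasp`, the method `jaw_endpoints`, and the parameter `width_px` are hypothetical names introduced here, and the convention that theta is measured in radians from the image x-axis is an assumption, not something stated in the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanarGrasp:
    """A planar grasp (x, y, theta) in depth-image pixel coordinates."""
    x: float      # column of the grasp point, in pixels
    y: float      # row of the grasp point, in pixels
    theta: float  # grasp direction in radians, from the image x-axis (assumed)

    def jaw_endpoints(self, width_px: float):
        """Return the two jaw positions for a gripper opened to width_px pixels."""
        dx = 0.5 * width_px * math.cos(self.theta)
        dy = 0.5 * width_px * math.sin(self.theta)
        return (self.x - dx, self.y - dy), (self.x + dx, self.y + dy)


grasp = PlanarGrasp(x=100.0, y=80.0, theta=0.0)
p1, p2 = grasp.jaw_endpoints(width_px=40.0)
print(p1, p2)  # (80.0, 80.0) (120.0, 80.0)
```

Because the gripper is always opened to its maximum width, the width is not part of the grasp representation itself; it only enters here as a fixed parameter.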

The first step, sampling grasp candidates, uses the cross entropy method[URL], which is not covered in detail here. The rest of this post focuses on the second stage: grasp quality evaluation.
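To make the two-stage idea concrete, here is a minimal sketch of a generic cross-entropy search over (x,y,theta). It is not the paper's exact sampling procedure: it fits a single diagonal Gaussian to the best candidates in each round, and `toy_quality` is a made-up stand-in for the quality network so that the example runs on its own. All function names and parameter values are assumptions for illustration.

```python
import numpy as np


def toy_quality(grasps):
    """Hypothetical stand-in for the quality network: scores in [0, 1].

    Peaks at (x, y, theta) = (60, 40, 0.5). A real system would instead
    score the depth patch extracted for each grasp.
    """
    target = np.array([60.0, 40.0, 0.5])
    scale = np.array([20.0, 20.0, 0.5])
    d2 = np.sum(((grasps - target) / scale) ** 2, axis=1)
    return np.exp(-0.5 * d2)


def cem_best_grasp(quality_fn, mean, std, n_samples=200, n_elite=20,
                   n_iters=5, seed=0):
    """Cross-entropy search over (x, y, theta) with a diagonal Gaussian."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    best_grasp, best_q = None, -np.inf
    for _ in range(n_iters):
        # Stage 1: sample candidate grasps from the current distribution.
        candidates = rng.normal(mean, std, size=(n_samples, 3))
        # Stage 2: score every candidate.
        q = quality_fn(candidates)
        # Keep the top-scoring candidates and refit the distribution to them.
        elite = candidates[np.argsort(q)[-n_elite:]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-6  # avoid collapsing to zero width
        i = int(np.argmax(q))
        if q[i] > best_q:
            best_grasp, best_q = candidates[i], float(q[i])
    return best_grasp, best_q


grasp, q = cem_best_grasp(toy_quality, mean=[50.0, 50.0, 0.0],
                          std=[30.0, 30.0, 1.0])
print(grasp, q)
```

The point of the structure is that the sampler never needs gradients from the scorer: it only needs a score per candidate, which is why a separately trained quality network can be plugged in as `quality_fn`.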
2. Grasp quality evaluation
To connect the two stages, the data fed into the neural network cannot be the raw depth map. Instead it is a depth image patch centered on the grasp point, with the grasp direction horizontal. Details follow:
2.1 Network input
The first stage produces many (x,y,theta) candidates. Since the goal is to evaluate the quality of each (x,y,theta), both the grasp and the depth image have to be given to the network. How? The authors' scheme is: rotate the depth image by theta about the grasp point (x,y) so that the grasp direction is parallel to the horizontal axis of the image, then crop a 32*32 depth patch centered on (x,y) and feed this patch to the network. In addition, the height z of the grasp point relative to the table is given as a second input (I suspect z is not needed, but I have not verified this). This is shown in the figure below:
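The rotate-then-crop step can be sketched with NumPy alone. This is a simplified illustration under stated assumptions: it samples the rotated patch directly with nearest-neighbour lookup instead of rotating the whole image, theta is taken in radians from the image x-axis, and out-of-image pixels are clamped to the border. The function name `grasp_patch` is hypothetical, and the paper's own preprocessing may differ in these details.

```python
import numpy as np


def grasp_patch(depth, x, y, theta, size=32):
    """Crop a size x size patch centred on (x, y) with the grasp axis horizontal.

    depth: 2-D array indexed as depth[row, col]; x is the column, y the row.
    theta: grasp direction in radians, measured from the image x-axis.
    Uses nearest-neighbour sampling; coordinates outside the image are
    clamped to the nearest border pixel.
    """
    offsets = np.arange(size) - size // 2
    du, dv = np.meshgrid(offsets, offsets)  # du varies along patch columns
    c, s = np.cos(theta), np.sin(theta)
    # The patch x-axis maps to the grasp direction, the patch y-axis to its normal.
    cols = x + du * c - dv * s
    rows = y + du * s + dv * c
    cols = np.clip(np.rint(cols).astype(int), 0, depth.shape[1] - 1)
    rows = np.clip(np.rint(rows).astype(int), 0, depth.shape[0] - 1)
    return depth[rows, cols]


depth = np.tile(np.arange(100.0), (100, 1))  # synthetic ramp: depth[r, c] = c
patch = grasp_patch(depth, x=50, y=50, theta=0.0)
print(patch.shape)  # (32, 32)
```

Aligning every patch to the grasp axis means the network never has to learn rotation invariance: theta is absorbed into the crop, and only the patch and the height z remain as inputs.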

2.2 Network output
The output is the probability, in [0,1], that actually executing this grasp configuration will succeed.
2.3 Network structure
See the figure above. Please refer to the paper for the detailed structure.
2.4 Building the dataset
1500 3D object models are selected from Dex-Net 1.0 (Dex-Net 1.0 is a dataset of 3D object models and grasps). The same procedure is applied to each object: place the object randomly on a virtual table and obtain the vertical grasp representations (x,y,theta) for its current pose (these can be generated directly from Dex-Net 1.0). Some of these grasps would succeed in practice and some would not. A virtual depth camera then renders a depth map, and each grasp representation is processed in the same way as in 2.1 Network input. This yields 6.7 million positive and negative samples, as shown in the figure below:

3. Actual grasping procedure
See the screenshots below (I was too lazy to write this up...):


4. Experimental results
Experimental results on trained objects :

Here, GQ-L-Adv is one of the network variants given in the paper; Random, IGQ, and the other entries listed before it are comparison methods.
Success Rate is the overall success rate from input depth map to executed grasp.
Precision is the grasp success rate counted only over trials in which the network considers its best grasp feasible, meaning a predicted quality greater than 0.5 (the network may judge that none of the sampled grasps is good).
Robust Grasp Rate is the fraction of trials in which the network considers its highest-quality grasp feasible (for example, if 100 images are input and the highest-quality grasp has a predicted quality greater than 0.5 in only 50 of them, this value is 50%).
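The three metrics can be computed from a list of trials as sketched below, following the definitions given here rather than any code from the paper. Each trial holds the predicted quality of the best grasp for one image and whether the executed grasp succeeded; the function name `grasp_metrics` and the trial data are made up for illustration.

```python
def grasp_metrics(trials, threshold=0.5):
    """trials: list of (predicted_quality, succeeded) for the top grasp per image."""
    n = len(trials)
    # Outcomes of the trials whose best grasp the network considers feasible.
    robust = [ok for q, ok in trials if q > threshold]
    return {
        "success_rate": sum(ok for _, ok in trials) / n,
        "precision": sum(robust) / len(robust) if robust else float("nan"),
        "robust_grasp_rate": len(robust) / n,
    }


# Made-up trial data, for illustration only.
trials = [(0.9, True), (0.8, True), (0.7, False), (0.4, True), (0.3, False)]
print(grasp_metrics(trials))
```

With this data, three of the five trials succeed, three of the five have a predicted quality above 0.5, and two of those three succeed, so Success Rate and Robust Grasp Rate are both 3/5 while Precision is 2/3.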
Experimental results on new objects :

Precision on trained objects is lower than on untrained objects because the training objects used in the test are the ones with complex shapes that are hard to grasp.
5. Summary
1. The main contributions of the paper are:
(1) The large Dex-Net 2.0 dataset, containing 6.7 million samples.
(2) A two-stage grasp detection algorithm: first sample candidate grasps, then evaluate grasp quality.
2. All experiments in the paper use single-object scenes or scenes where objects do not overlap, which suggests that performance in multi-object stacked scenes is poor.
3. The Robust Grasp Rate on new objects is low, only 58%.
4. The dataset is too large to be practical for ordinary researchers.
5. There is no closed-loop grasping, so the method only suits high-precision (expensive) robot arms. The ABB YuMi used in the paper is priced at over one million. With the Kinova arm I use now (100,000 to 200,000), the grasp success rate is not high (89%) even when the detected grasp is appropriate.
6. The grasp width is fixed, so the method cannot be used in narrow spaces (I consider this the biggest flaw of the Dex-Net series).