
Training the grasp angle in grasp detection

2022-07-03 05:17:00 【Qianyu QY】

Recently I have been reading material on grasp detection and noticed a detail: whatever grasp representation is used, it includes the grasp rotation angle φ, that is, the rotation about the z axis when the gripper approaches the target vertically, with range [-π/2, π/2]. To train the mapping from the input image to this rotation angle, the angle is usually encoded as the coordinates of a point on the unit circle, (sin2φ, cos2φ), in order to eliminate the discontinuity of the raw angle during training. At first I could not see the point of this. Later I traced the method back, through the references of a paper, to its source: Kota Hara, Raviteja Vemulapalli, and Rama Chellappa. Designing Deep Convolutional Neural Networks for Continuous Object Orientation Estimation. arXiv preprint arXiv:1702.01499, 2017. My summary follows.

1、 Why not predict the φ value directly

The model commonly used today is a convolutional neural network. The input passes through a series of multiplications by weights and additions of biases to produce an output value; the loss between this output and the labeled value is computed, its gradient is backpropagated to update the weights and biases, and this repeats until the loss is minimized. Angle estimation differs from ordinary image classification, though. The final output of a classifier is a set of numbers (as many as there are labels), and the index of the largest one is the predicted label. The final output of angle estimation is a single value, which is not derived from the relationship among several values (such as the index of the maximum or minimum). Our angle labels generally lie in [-π/2, π/2], whereas the predicted value is generally unbounded, and there is no uniform scale that maps it into the label range for computing the loss. Predicting the angle in radians directly is therefore not workable.
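Besides the range problem, the discontinuity of the raw angle can be seen with a small numeric check. The sketch below is my own illustration, not code from the paper; the helper names `squared_error` and `encode` and the value of `eps` are assumptions chosen for the example. It takes two gripper angles just inside the two ends of [-π/2, π/2], which describe almost the same physical orientation of a parallel gripper, and compares the squared error on the raw angles with the squared error on the (sin2φ, cos2φ) encoding.

```python
import math


def squared_error(a, b):
    """Squared difference between two scalars."""
    return (a - b) ** 2


def encode(phi):
    """Hypothetical helper: map an angle to (sin 2phi, cos 2phi)."""
    return math.sin(2.0 * phi), math.cos(2.0 * phi)


eps = 0.01  # assumed small offset from the interval ends, in radians
phi_a = math.pi / 2 - eps   # just below +pi/2
phi_b = -math.pi / 2 + eps  # just above -pi/2, nearly the same gripper pose

# Loss on the raw angles: large, although the poses almost coincide.
raw_loss = squared_error(phi_a, phi_b)

# Loss on the encoded points: small, because the two points are neighbours
# on the unit circle.
sin_a, cos_a = encode(phi_a)
sin_b, cos_b = encode(phi_b)
encoded_loss = squared_error(sin_a, sin_b) + squared_error(cos_a, cos_b)

print(raw_loss, encoded_loss)
```

The raw-angle loss is close to π², while the encoded loss is tiny, which is the discontinuity the encoding is meant to remove.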

2、 Paper's method 1: encode the angle as the coordinates of a point on the unit circle, (sin2φ, cos2φ)

The final output is not used directly as an angle in radians for the loss. Instead it is treated as the coordinates of a point on the unit circle, (sin2φ, cos2φ). (The paper estimates angles in [-π, π], so it uses sinφ and cosφ. Here the grasp angle covers only [-π/2, π/2], so it is doubled, giving sin2φ and cos2φ, so that the encoded point can be any point on the unit circle.) Correspondingly, the ground truth is also converted to sin2φ and cos2φ. Because in grasp detection the angle is trained together with the gripper width, the grasp coordinates, and so on, the loss function is L2 or cross-entropy. At run time the network predicts sin2φ and cos2φ, and we recover the angle with the inverse tangent, φ = ½·arctan(sin2φ / cos2φ). This avoids the problem of predicting radians directly, because whatever values are predicted, an angle can always be recovered from them with this formula.
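The encoding and decoding steps can be sketched in a few lines. This is a minimal illustration under my own assumptions, not the paper's code: the function names `encode_angle` and `decode_angle` are hypothetical, and I use the two-argument `math.atan2` so that the recovered angle covers the whole of [-π/2, π/2] and so that a zero cosine does not cause a division error.

```python
import math


def encode_angle(phi):
    """Map a grasp angle in [-pi/2, pi/2] to the point (sin 2phi, cos 2phi)."""
    return math.sin(2.0 * phi), math.cos(2.0 * phi)


def decode_angle(sin2phi, cos2phi):
    """Recover phi from a predicted point; the point need not have radius 1."""
    return 0.5 * math.atan2(sin2phi, cos2phi)


phi = math.radians(30.0)
s, c = encode_angle(phi)

# Decoding the exact encoding returns the original angle.
decoded = decode_angle(s, c)

# A prediction with the right direction but the wrong radius (here 3x)
# still decodes to the same angle.
decoded_scaled = decode_angle(3.0 * s, 3.0 * c)

print(math.degrees(decoded), math.degrees(decoded_scaled))
```

Because `atan2` only looks at the direction of the point, any nonzero prediction decodes to some valid angle, which is why the unbounded network output is no longer a problem.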

However, computing the loss with the L2 norm or a similar function has a drawback: the loss depends not only on the angle difference but also on the radius of the predicted point. The dependence on the angle difference is easy to understand. For the radius, go back to the previous paragraph: the labeled sin2φ and cos2φ have squares that sum to 1, so every label is a point on the unit circle centered at the origin. The prediction, however, does not necessarily have a sum of squares equal to 1. Even if the angle recovered from the prediction equals the labeled angle, the loss is nonzero because the sums of squares differ, as shown in the figure below.
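A small numeric case makes the radius effect concrete. The sketch below is only an illustration; the helper `l2_loss` and the chosen angle and radius are assumptions of mine. The prediction points in exactly the labeled direction but has radius 0.5 instead of 1.

```python
import math


def l2_loss(pred, target):
    """Sum of squared differences between two 2-D points."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


phi = math.radians(20.0)
target = (math.sin(2.0 * phi), math.cos(2.0 * phi))  # on the unit circle

# Same direction as the label, but only half the radius.
pred = (0.5 * target[0], 0.5 * target[1])

pred_angle = 0.5 * math.atan2(pred[0], pred[1])  # decodes to the labeled angle
loss = l2_loss(pred, target)                     # yet the loss is not zero

print(math.degrees(pred_angle), loss)
```

The decoded angle matches the label, yet the L2 loss is (1 - 0.5)² = 0.25, so the network is still penalized for something that does not affect the predicted angle.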

3、 Paper's method 2: modify the loss function of method 1 and keep everything else unchanged

The loss function is as follows:

With this loss, if the predicted point and the label have the same angle, the loss is 0 even if the radius differs.
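One loss with exactly this property is 1 minus the cosine of the angle between the predicted point and the labeled point. As far as I remember, the paper's loss has this form, but the formula is not reproduced in this post, so treat the sketch below as an assumption rather than the paper's definition. The function name `angular_loss` and the small constant `eps` (added only to avoid division by zero) are mine.

```python
import math


def angular_loss(pred, target, eps=1e-12):
    """1 - cos(theta), where theta is the angle between pred and target.

    Assumed form of a radius-independent loss; eps guards against a
    zero-length prediction.
    """
    px, py = pred
    tx, ty = target
    norm = math.hypot(px, py) * math.hypot(tx, ty)
    return 1.0 - (px * tx + py * ty) / (norm + eps)


phi = math.radians(20.0)
target = (math.sin(2.0 * phi), math.cos(2.0 * phi))

# Same direction, half the radius: the loss is (numerically) zero.
same_direction = angular_loss((0.5 * target[0], 0.5 * target[1]), target)

# Perpendicular direction: the loss is 1.
perpendicular = angular_loss((-target[1], target[0]), target)

# Opposite direction: the loss reaches its maximum of 2.
opposite = angular_loss((-target[0], -target[1]), target)

print(same_direction, perpendicular, opposite)
```

Since the dot product is divided by both lengths, scaling the prediction changes nothing, so only the angle difference is penalized.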

That said, using two loss functions in a grasp detection network is more troublesome to implement, and even then it is hard to say whether it would improve accuracy at the cost of speed or bring any other benefit. Still, it is worth trying.

The paper also introduces a method that performs better than the first two. The general idea is to discretize the continuous angle estimation, use the network to predict the probability of each discrete interval, and finally recover the angle from those probabilities. Because time was limited I did not study it carefully; I will look at it later.
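To make the discretization idea concrete, here is a rough sketch. I have not studied the paper's version, so everything here is an assumption: the bin count `NUM_BINS`, the helper names, and in particular the decoding step, which simply takes the center of the most probable bin and may well differ from what the paper does.

```python
import math

NUM_BINS = 18  # assumed: 10-degree bins covering [-pi/2, pi/2)


def angle_to_bin(phi, num_bins=NUM_BINS):
    """Index of the bin containing phi; used to build the class label."""
    width = math.pi / num_bins
    # The modulo wraps +pi/2 onto the first bin, since a gripper at +pi/2
    # and at -pi/2 has the same orientation.
    return int((phi + math.pi / 2) // width) % num_bins


def bin_to_angle(idx, num_bins=NUM_BINS):
    """Center angle of bin idx."""
    width = math.pi / num_bins
    return -math.pi / 2 + (idx + 0.5) * width


def decode_probs(probs):
    """Assumed decoding: return the center of the most probable bin."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return bin_to_angle(best, len(probs))


label = angle_to_bin(math.radians(25.0))  # class label for a 25-degree grasp

# A made-up network output that puts most probability on that bin.
probs = [0.01] * NUM_BINS
probs[label] = 0.83
predicted = decode_probs(probs)

print(label, math.degrees(predicted))
```

With this decoding the resolution is limited to the bin width, so a finer decoding over the predicted probabilities would be needed to get a truly continuous angle.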


Copyright notice
This article was written by [Qianyu QY]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202150623102624.html