Key Points Estimation and Point Instance

2022-08-01 00:38:00 【Recursi】

Abstract

In traffic line detection, a basic perception module, many conditions must be considered, such as the number of traffic lines and the computing power of the target system. To address these issues, this paper proposes a traffic line detection method called Point Instance Network (PINet), which is based on keypoint estimation and instance segmentation. PINet includes several stacked hourglass networks that are trained simultaneously, so the size of the trained model can be chosen according to the computing power of the target environment. The problem of clustering the predicted keypoints is cast as an instance segmentation problem.

In addition, if the user wants to run the trained model on a system with less computing power, such as an embedded board, the network can be cropped and transferred without additional training.

I. INTRODUCTION

The method proposed in this paper uses a stacked hourglass network to predict keypoints on traffic lines. Stacked hourglass networks are widely used in keypoint estimation fields such as pose estimation and object detection. They apply a sequence of downsampling and upsampling operations, which lets them extract information at different scales. Because the stacked hourglass network contains multiple hourglass modules trained with the same loss, models of different parameter sizes can be obtained simultaneously by cutting some modules out of the whole structure (this involves knowledge distillation). Each predicted keypoint is then assigned to an individual traffic line instance.

The network has three output branches and predicts the exact location and the instance features of points on the traffic lines.

Contributions:
1. By using keypoint estimation, the proposed method produces a more compact prediction output than lane detection methods based on semantic segmentation.
2. The overall network consists of multiple hourglass modules, so network models of different sizes can be obtained by simple cropping.
3. The method can be applied to a wide range of scenes, including traffic lines in different orientations and in any number.
4. The proposed method has a lower false-positive rate and noteworthy accuracy, which helps ensure the stability of autonomous driving.

B. Key Points Estimation

The stacked hourglass network [25] consists of several hourglass modules that are trained simultaneously. An hourglass module can pass information at various scales to deeper layers, which helps the whole network obtain both global and local features. Because of this property, hourglass networks are often used in object detection to detect the centers or corners of objects.

III. METHOD

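To accomplish these tasks, the proposed neural network includes three output branches: a confidence branch, an offset branch, and an embedding branch. The confidence and offset branches predict the precise points of traffic lines, using a loss function inspired by YOLO [45]. The embedding branch generates an embedding feature for each predicted point; these embedding features are fed into a clustering process to distinguish each instance. The loss function of the embedding branch is inspired by instance segmentation methods.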
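The clustering step described above can be illustrated with a minimal sketch. The summary does not say which clustering procedure PINet uses, so the greedy rule below (a point joins the first cluster whose seed embedding lies within a distance threshold) and the names `cluster_by_embedding` and `threshold` are assumptions made for illustration only.

```python
import math

def cluster_by_embedding(embeddings, threshold):
    """Greedy grouping (illustrative assumption, not PINet's exact rule).

    A point joins the first cluster whose seed embedding is closer than
    `threshold`; otherwise it starts a new cluster. Returns one label per point.
    """
    seeds = []   # embedding of the first point of each cluster
    labels = []  # cluster index for every input point
    for emb in embeddings:
        for idx, seed in enumerate(seeds):
            if math.dist(emb, seed) < threshold:
                labels.append(idx)
                break
        else:
            # No existing cluster is close enough: open a new one.
            seeds.append(emb)
            labels.append(len(seeds) - 1)
    return labels

# Two well-separated groups of 2-D embeddings (made-up values).
points = [(0.0, 0.1), (0.05, 0.0), (1.0, 1.0), (0.95, 1.1)]
print(cluster_by_embedding(points, threshold=0.5))
```

The better the embedding branch separates instances, the less the result depends on the particular clustering rule or threshold.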

A. Architecture

Figure 2 shows the proposed network framework. The input is an RGB image of size 512×256, which is fed to the resizing network. A sequence of convolutional layers in the resizing network compresses the image to a smaller size (64×32), and the output of the resizing network is fed into the prediction network. The prediction network can contain any number of hourglass modules; four were used in this study. All hourglass modules are trained simultaneously with the same loss function. After training, the user can choose how many hourglass modules to use according to the available computing power, without any additional training.

The three output branches are applied at the end of each hourglass block, so the loss function can be computed from the output of every block. By cutting out several hourglass blocks, the required computing resources can be adjusted.
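The cropping idea can be sketched with a toy model. The class `StackedPredictor` and the helper `make_module` are hypothetical stand-ins (each "hourglass module" here is just a function on a single number), not PINet's actual code; the point is only that a cropped network reuses the first n trained modules unchanged.

```python
def make_module(scale):
    """Stand-in for a trained hourglass module (hypothetical toy function)."""
    def module(feature):
        return feature * scale + 1.0
    return module

class StackedPredictor:
    """Chain of modules; every module's output is exposed, as with per-block branches."""

    def __init__(self, modules):
        self.modules = list(modules)

    def forward(self, feature):
        outputs = []
        for module in self.modules:
            feature = module(feature)
            outputs.append(feature)  # an output head could be attached here
        return outputs

    def crop(self, n):
        """Keep only the first n modules; no retraining is involved."""
        return StackedPredictor(self.modules[:n])

full = StackedPredictor(make_module(s) for s in (0.5, 2.0, 0.25, 1.5))
short = full.crop(2)
# The cropped model reproduces the first two outputs of the full model.
print(short.forward(2.0) == full.forward(2.0)[:2])
```

Because each block is trained with the same loss, the output of an early block is already a usable prediction, which is what makes this kind of cropping possible.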

1) Resizing Network:

The resizing network reduces the size of the input image to save memory and inference time. The input RGB image has size 512×256. The network consists of three convolutional layers, each applying 3×3 filters with a stride of 2 and a padding of 1. PReLU [46] and batch normalization [47] are used after each convolutional layer. The network finally produces a resized output of size 64×32. Table I shows the details of the constituent layers.
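As a quick check of these numbers, the standard convolution output-size formula, floor((size + 2·padding − kernel) / stride) + 1, can be applied three times. The function names below are illustrative and not taken from the paper.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of one convolution along a single spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def resize_shape(width, height, layers=3):
    """Apply `layers` identical stride-2 convolutions to a width x height input."""
    for _ in range(layers):
        width, height = conv_out(width), conv_out(height)
    return width, height

# 512x256 -> 256x128 -> 128x64 -> 64x32
print(resize_shape(512, 256))
```

Each layer halves both dimensions, so three layers give an overall reduction factor of 8.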

2) Predicting Network:

This part predicts the precise points on the traffic lines and the embedding features used for instance segmentation.

B. Loss Function

For training, four loss functions are applied to the output branches of each hourglass network. The following sections provide details on each loss function. As shown in Table II, the output branches generate a 64×32 grid, and each cell of the output grid holds 7 channels of predicted values: a confidence value (1 channel), offset values (2 channels), and an embedding feature (4 channels). The confidence value determines whether a keypoint of a traffic line exists in the cell, the offset values precisely localize the keypoint predicted by the confidence value, and the embedding features are used to separate the keypoints into individual instances.
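A minimal sketch of how such a grid could be decoded into image coordinates follows. Several details are assumptions not stated in the summary: the channel order (confidence, x offset, y offset, then four embedding values), offsets expressed as fractions of a cell, and the confidence threshold of 0.5. The scale factor 8 follows from 512 / 64 and 256 / 32.

```python
RATIO = 8  # input size / grid size: 512 / 64 and 256 / 32

def decode_keypoints(grid, threshold=0.5, ratio=RATIO):
    """Turn a grid of 7-channel cells into (x, y, embedding) keypoints.

    Assumed cell layout: [conf, off_x, off_y, e1, e2, e3, e4],
    with offsets given as fractions of one cell.
    """
    points = []
    for row, cells in enumerate(grid):
        for col, cell in enumerate(cells):
            conf, off_x, off_y = cell[0], cell[1], cell[2]
            if conf > threshold:
                x = (col + off_x) * ratio  # back to input-image pixels
                y = (row + off_y) * ratio
                points.append((x, y, tuple(cell[3:7])))
    return points

# Tiny 2x2 grid for illustration; a real output grid would be 32 rows x 64 columns.
empty = [0.0] * 7
hit = [0.9, 0.5, 0.25, 1.0, 2.0, 3.0, 4.0]
grid = [[empty, empty],
        [hit, empty]]
print(decode_keypoints(grid))
```

The returned embeddings are what a later clustering step would use to group keypoints into traffic line instances.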

1) Confidence Loss:

2) Offset Loss:

3) Embedding Feature Loss:

This branch is trained so that grid-cell features belonging to the same instance move closer together and features of different instances move farther apart; in effect it is a clustering process.
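The pull/push behaviour described here can be sketched with a common contrastive form: same-instance pairs are penalized by their distance, and different-instance pairs are penalized only when they are closer than a margin. The summary does not give PINet's exact formula, so this form, the function name `embedding_loss`, and the `margin` parameter are assumptions for illustration.

```python
import math

def embedding_loss(features, instance_ids, margin=1.0):
    """Average pairwise pull/push loss over all ordered pairs (illustrative form).

    Same instance: loss is the distance (pull together).
    Different instances: loss is max(0, margin - distance) (push apart).
    Requires at least one feature.
    """
    total = 0.0
    pairs = 0
    for i, feat_i in enumerate(features):
        for j, feat_j in enumerate(features):
            dist = math.dist(feat_i, feat_j)
            if instance_ids[i] == instance_ids[j]:
                total += dist
            else:
                total += max(0.0, margin - dist)
            pairs += 1
    return total / pairs

# Well-separated instances with identical features inside an instance: zero loss.
print(embedding_loss([(0.0, 0.0), (0.0, 0.0), (5.0, 5.0)], [0, 0, 1]))
```

With this form the loss reaches zero once same-instance features coincide and different-instance features are at least `margin` apart, which is the condition a simple distance-based clustering needs.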

4) Distillation Loss:

The total loss Ltotal is the weighted sum of the four loss terms above.
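The formula itself is not reproduced in this summary. Written out under assumed notation, a weighted sum of the four terms would look like the following; the weight symbols and the subscript names are placeholders chosen here, not the paper's notation, and no weight values are implied.

```latex
L_{total} = \lambda_{c}\, L_{confidence}
          + \lambda_{o}\, L_{offset}
          + \lambda_{e}\, L_{embedding}
          + \lambda_{d}\, L_{distillation}
```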

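Table IV and Figure 7 show the detailed results of PINet on the CULane dataset. We observe three characteristics in the results. First, PINet shows a particularly low false-positive rate on the CULane dataset, which means that PINet makes fewer wrong lane predictions than other methods. This helps guarantee safety. Second, the cropped networks 2H and 3H perform similarly to the full network; only 1H performs worse. In the proposed architecture, distillation appears to work best at a depth of three hourglass modules. Finally, PINet works better than other methods under strong light conditions: the Night and Dazzle Light categories of the CULane dataset include such conditions, and PINet shows higher performance in these categories. However, because PINet is based on keypoint estimation, local occlusion or unclear lane markings can negatively affect performance.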

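1) TuSimple: Evaluation on the TuSimple dataset requires the exact x-axis values at certain fixed y-axis values. The detailed evaluation results are given in Table V, and Figure 6 shows some results on the TuSimple dataset. The value nH in Tables IV-VI means that the network consists of n hourglass modules. Although neither pre-trained weights nor additional external datasets are used, PINet still shows high performance in terms of accuracy and false-positive rate. The false-negative rate also shows reasonable values.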
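Because keypoints are predicted at arbitrary positions, some post-processing is needed to obtain x values at the fixed y values. A minimal sketch using linear interpolation between neighbouring keypoints is given below. This is an assumed approach for illustration, not necessarily what the authors do; the name `x_at_fixed_y` and the use of `None` for rows the lane does not reach are placeholders.

```python
def x_at_fixed_y(points, y_samples):
    """Linearly interpolate a lane's x value at each requested y.

    `points` is a list of (x, y) keypoints of one lane.
    Returns None for y values outside the lane's vertical extent.
    """
    pts = sorted(points, key=lambda p: p[1])  # order keypoints by y
    result = []
    for y in y_samples:
        x = None
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if y0 <= y <= y1 and y1 != y0:
                x = x0 + (x1 - x0) * (y - y0) / (y1 - y0)
                break
        result.append(x)
    return result

lane = [(100, 200), (120, 240), (160, 280)]  # made-up keypoints
print(x_at_fixed_y(lane, [190, 200, 220, 260, 280]))
```

Linear interpolation is the simplest choice; a spline fit over the keypoints of one instance would be a natural alternative when the lane is strongly curved.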

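Table VI shows the number of parameters and the fps on an RTX 2080 Ti GPU according to the number of hourglass modules. Most components of PINet are built from bottleneck layers, an architecture that saves a large amount of memory. With all hourglass networks, PINet runs at 25 fps; with only one hourglass network, it runs at about 40 fps. When the short networks are evaluated, they are simply cropped from the whole trained network, without any additional training.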

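We studied the effect of the knowledge distillation method, whose purpose is to reduce the gap between the cropped short networks and the deepest network, which acts as the teacher network. Table VII shows the results of the ablation study. The average performance gap is computed with the following formula:

$$AG_n = \frac{1}{N} \sum_{i=1}^{N} \left( P^{i}_{4H} - P^{i}_{nH} \right)$$

where AG_n denotes the average performance gap between 4H and nH, N denotes the total number of training epochs in this ablation study, and P^{i}_{nH} denotes the performance of nH at the i-th epoch. Performance is evaluated on the TuSimple test set, and data are collected for the first 30 epochs. The average performance gap between the full network and the cropped short networks is smaller when the distillation method is used than when it is not. This means that the distillation method helps the cropped short networks mimic the teacher network well.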
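Following the definitions above, the average gap can be computed as the mean per-epoch difference between the full network and a cropped one. The sketch below assumes a signed difference (4H minus nH) with no absolute value; the function name and the sample numbers are hypothetical and are not results from the paper.

```python
def average_gap(perf_full, perf_short):
    """Mean per-epoch difference between full (4H) and cropped (nH) performance."""
    if not perf_full or len(perf_full) != len(perf_short):
        raise ValueError("need two non-empty lists with one value per epoch")
    n_epochs = len(perf_full)
    return sum(f - s for f, s in zip(perf_full, perf_short)) / n_epochs

# Made-up accuracies (in percent) over three epochs, for illustration only.
full_4h = [90.0, 92.0, 94.0]
short_2h = [80.0, 90.0, 94.0]
print(average_gap(full_4h, short_2h))  # (10 + 2 + 0) / 3
```

A smaller value means the cropped network tracks the teacher more closely over the course of training.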

Original site

Copyright notice
This article was written by [Recursi]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/213/202208010022514590.html