
Introduction to and Use of the KITTI Dataset

2022-07-07 14:32:00 Xiaobai learns vision


Abstract: This article draws on two papers, "Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite" and "Vision meets Robotics: The KITTI Dataset". It covers an overview of the KITTI dataset, the data acquisition platform, a detailed description of the dataset, the evaluation criteria, and a practical use case. The aim is a detailed and comprehensive introduction, with a focus on using KITTI for research and experiments.

1. KITTI dataset overview

The KITTI dataset was jointly created by the Karlsruhe Institute of Technology in Germany and the Toyota Technological Institute at Chicago. It is billed as the world's largest dataset for evaluating computer vision algorithms in autonomous driving scenarios. The dataset is used to evaluate stereo, optical flow, visual odometry, 3D object detection, and 3D tracking in a vehicle environment. KITTI contains real image data collected in urban, rural, and highway scenes, with up to 15 cars and 30 pedestrians per image and varying degrees of occlusion and truncation. The whole dataset consists of 389 stereo and optical flow image pairs, 39.2 km of visual odometry sequences, and images with more than 200k annotated 3D objects [1], sampled and synchronized at 10 Hz. Overall, the raw data is grouped into the categories 'Road', 'City', 'Residential', 'Campus', and 'Person'. For 3D object detection, labels are subdivided into car, van, truck, pedestrian, pedestrian (sitting), cyclist, tram, and misc.


2. Data acquisition platform

As shown in Figure 1, the KITTI data acquisition platform is equipped with 2 grayscale cameras, 2 color cameras, one Velodyne 64-beam 3D lidar, 4 optical lenses, and 1 GPS navigation system. The sensor specifications are as follows [2]:

• 2 × PointGray Flea2 grayscale cameras (FL2-14S3M-C), 1.4 Megapixels, 1/2” Sony ICX267 CCD, global shutter

• 2 × PointGray Flea2 color cameras (FL2-14S3C-C), 1.4 Megapixels, 1/2” Sony ICX267 CCD, global shutter

• 4 × Edmund Optics lenses, 4mm, opening angle ∼ 90°, vertical opening angle of region of interest (ROI) ∼ 35°

• 1 × Velodyne HDL-64E rotating 3D laser scanner, 10 Hz, 64 beams, 0.09 degree angular resolution, 2 cm distance accuracy, collecting ∼ 1.3 million points/second, field of view: 360° horizontal, 26.8° vertical, range: 120 m

• 1 × OXTS RT3003 inertial and GPS navigation system, 6 axis, 100 Hz, L1/L2 RTK, resolution: 0.02m / 0.1°

Figure 1: Data acquisition platform

Figure 2 shows the sensor layout. To produce stereo image pairs, cameras of the same type are mounted about 54 cm apart. Because the resolution and contrast of the color cameras are not good enough, two grayscale stereo cameras are also used, mounted about 6 cm from the color cameras. To make calibration of the sensor data easier, the coordinate axes are defined as follows [2]:

• Camera: x = right,  y = down, z = forward 

• Velodyne: x = forward,  y = left, z = up

• GPS/IMU: x = forward,  y = left, z = up
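The axis conventions above imply a fixed re-labeling of axes between the Velodyne and camera frames. The sketch below applies only that re-labeling. It is an illustration rather than the dataset's calibration, which would also have to account for the mounting offsets and rotations between the sensors. The function name `velo_axes_to_cam_axes` is hypothetical.

```python
def velo_axes_to_cam_axes(point):
    """Re-express a Velodyne-frame point using the camera axis convention.

    Velodyne: x = forward, y = left, z = up
    Camera:   x = right,   y = down, z = forward
    Only the axis directions are swapped; sensor offsets are ignored.
    """
    x_forward, y_left, z_up = point
    return (-y_left, -z_up, x_forward)


# A point 10 m ahead, 2 m to the left, and 1 m above the sensor origin.
print(velo_axes_to_cam_axes((10.0, 2.0, 1.0)))  # (-2.0, -1.0, 10.0)
```

In camera terms the same point lies 2 m to the left (negative x), 1 m up (negative y), and 10 m ahead (positive z).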

Figure 2: Sensor setup

3. Dataset details

Figure 3 shows typical samples from the KITTI dataset, which fall into five categories: 'Road', 'City', 'Residential', 'Campus', and 'Person'. The raw data was collected over 5 days in 2011 and totals 180 GB.

Figure 3: Samples from the KITTI dataset, showing its diversity.

3.1 Data organization

The data organization described in paper [2] may reflect an early release and differs from what the KITTI website currently publishes, so it is only briefly introduced here.

As shown in Figure 4, all sensor data of one video sequence is stored under the date_drive folder, where date and drive are placeholders for the recording date and the sequence number. Timestamps are recorded in the timestamps.txt file.

Figure 4: Data organization

The per-task datasets downloaded from the KITTI website have a simpler layout. Taking object detection as an example, the tree below shows the directory structure of the left color images of the Object Detection Evaluation 2012 benchmark. Samples are split into testing and training sets.

data_object_image_2
├── testing
│   └── image_2
└── training
    └── image_2

The tree below shows the directory structure of the label folder of the training set.

training/
└── label_2
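Given the two trees above, a training loop usually needs each image paired with its label file. The following is a minimal sketch under the assumption that an image and its label share the same file stem (for example `000000.png` and `000000.txt`). The function names and the `image_root` / `label_root` arguments are hypothetical.

```python
from pathlib import Path


def match_frames(image_names, label_names):
    """Return the sorted frame ids that have both an image and a label."""
    image_ids = {Path(name).stem for name in image_names}
    label_ids = {Path(name).stem for name in label_names}
    return sorted(image_ids & label_ids)


def list_training_pairs(image_root, label_root):
    """Pair training images with label files found on disk.

    image_root: the unpacked data_object_image_2 folder.
    label_root: the folder that contains training/label_2.
    """
    image_dir = Path(image_root) / "training" / "image_2"
    label_dir = Path(label_root) / "training" / "label_2"
    images = {p.stem: p for p in image_dir.iterdir() if p.is_file()}
    labels = {p.stem: p for p in label_dir.glob("*.txt")}
    common = match_frames(
        [p.name for p in images.values()],
        [p.name for p in labels.values()],
    )
    return [(images[frame_id], labels[frame_id]) for frame_id in common]
```

Frames that lack either file are silently skipped, which is convenient for the testing split, where no labels are distributed.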

3.2 Annotations

The KITTI dataset provides 3D bounding box annotations (in the lidar coordinate system). The annotations cover 8 categories: 'Car', 'Van', 'Truck', 'Pedestrian', 'Person (sitting)', 'Cyclist', 'Tram', and 'Misc' (e.g., trailers, Segways). According to paper [2], the 3D annotations are stored in date_drive_tracklets.xml, and each object is described by its category and 3D size (height, width, and length). In the current release, annotations are instead stored in the label folder of each task's sub-dataset, in a somewhat different form.

To illustrate the annotation format, this article uses the object detection dataset as an example. The format is documented in readme.txt of the object development kit. Download the annotations through the link "training labels of object data set (5 MB)", unzip the archive, and enter the directory. Each image has one corresponding .txt file. Figure 5 shows one image frame and its .txt label file.

Figure 5: An object detection sample and its label file

To understand the meaning of each field in a label file, read the readme.txt file shipped in the object development kit (1 MB). The readme describes in detail the sample counts of the sub-dataset, the number of label categories, the file organization, the annotation format, the evaluation method, and more. The label format is described as follows:

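#Values  Name        Description
1        type        Object type: 'Car', 'Van', 'Truck', 'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram', 'Misc' or 'DontCare'
1        truncated   Float from 0 (non-truncated) to 1 (truncated), where truncated refers to the object leaving the image boundaries
1        occluded    Integer (0, 1, 2, 3) indicating the occlusion state: 0 = fully visible, 1 = partly occluded, 2 = largely occluded, 3 = unknown
1        alpha       Observation angle of the object, in [-pi, pi]
4        bbox        2D bounding box of the object in the image (0-based index): left, top, right, bottom pixel coordinates
3        dimensions  3D object dimensions: height, width, length (in meters)
3        location    3D object location x, y, z in camera coordinates (in meters)
1        rotation_y  Rotation ry around the Y-axis in camera coordinates, in [-pi, pi]
1        score       Only for results: float indicating detection confidence, needed for precision/recall curves; higher is better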

Note that the 'DontCare' label marks regions that were not annotated, for example because the object is too far from the lidar. During evaluation (mainly when computing precision), detections in such regions could otherwise be counted as false positives even though they correspond to real but unlabeled objects. To prevent this, the evaluation script automatically ignores predictions that fall in 'DontCare' regions.
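A label file of this kind can be read with a few lines of code. The sketch below assumes the field layout documented in the object development kit's readme: one object per line, a type string followed by 14 numbers, plus an optional trailing score in result files. The class and function names are hypothetical, and the sample line is only illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class KittiObject:
    type: str
    truncated: float
    occluded: int
    alpha: float
    bbox: Tuple[float, ...]        # left, top, right, bottom (pixels)
    dimensions: Tuple[float, ...]  # height, width, length (meters)
    location: Tuple[float, ...]    # x, y, z in camera coordinates (meters)
    rotation_y: float
    score: Optional[float] = None  # present only in result files


def parse_label_line(line: str) -> KittiObject:
    """Parse one whitespace-separated label line (15 or 16 fields)."""
    fields = line.split()
    if len(fields) not in (15, 16):
        raise ValueError(f"expected 15 or 16 fields, got {len(fields)}")
    values = [float(v) for v in fields[1:]]
    return KittiObject(
        type=fields[0],
        truncated=values[0],
        occluded=int(values[1]),
        alpha=values[2],
        bbox=tuple(values[3:7]),
        dimensions=tuple(values[7:10]),
        location=tuple(values[10:13]),
        rotation_y=values[13],
        score=values[14] if len(values) == 15 else None,
    )


def parse_labels(text: str, keep_dontcare: bool = False) -> List[KittiObject]:
    """Parse a whole label file, dropping 'DontCare' regions by default."""
    objects = [parse_label_line(l) for l in text.splitlines() if l.strip()]
    if keep_dontcare:
        return objects
    return [obj for obj in objects if obj.type != "DontCare"]


sample = "Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01"
print(parse_label_line(sample).bbox)  # (712.4, 143.0, 810.73, 307.92)
```

Dropping 'DontCare' entries is a reasonable default for training, while an evaluation script would keep them so that detections in those regions can be ignored rather than penalized.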

3.3 Development Kit

Each KITTI sub-dataset ships a development kit, which mainly consists of a cpp folder, a matlab folder, a mapping folder, and readme.txt. Take devkit_object, the kit for the object detection task shown below, as an example. The cpp folder mainly contains evaluate_object.cpp, the source code for evaluating models. The files in the mapping folder record the mapping from the training set to the raw dataset, so developers can also use multimodal data such as lidar point clouds, GPS data, the right color camera images, and the grayscale camera images. The matlab folder contains tools for reading and writing labels, drawing 2D/3D boxes, and running demos. The readme.txt file is especially important: it describes in detail the data format of the sub-dataset, the benchmark, the evaluation method, and other specifics.

devkit_object
├── cpp
│   ├── evaluate_object.cpp
│   └── mail.h
├── mapping
│   ├── train_mapping.txt
│   └── train_rand.txt
├── matlab
│   ├── computeBox3D.m
│   ├── computeOrientation3D.m
│   ├── drawBox2D.m
│   ├── drawBox3D.m
│   ├── projectToImage.m
│   ├── readCalibration.m
│   ├── readLabels.m
│   ├── run_demo.m
│   ├── run_readWriteDemo.m
│   ├── run_statistics.m
│   ├── visualization.m
│   └── writeLabels.m
└── readme.txt

4. Evaluation metrics

4.1 Stereo and visual odometry tasks

KITTI uses different evaluation criteria for different tasks. For stereo and optical flow, the average number of erroneous pixels is computed from the disparity error and the end-point error, respectively.
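As a rough illustration of the erroneous-pixel idea for stereo, the sketch below counts the fraction of valid pixels whose disparity error exceeds a threshold. The function name, the use of `None` to mark pixels without ground truth, and the 3-pixel default threshold are assumptions of this sketch and are not taken from the text above. For optical flow, the absolute difference would be replaced by the Euclidean end-point error between flow vectors.

```python
def erroneous_pixel_rate(predicted, ground_truth, threshold=3.0):
    """Fraction of valid pixels whose absolute disparity error exceeds threshold.

    predicted and ground_truth are equally sized 2D lists of disparities.
    Ground-truth entries equal to None mark pixels without a measurement
    and are skipped.
    """
    errors = 0
    valid = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for pred, gt in zip(pred_row, gt_row):
            if gt is None:
                continue
            valid += 1
            if abs(pred - gt) > threshold:
                errors += 1
    return errors / valid if valid else 0.0


predicted = [[10.0, 20.0], [30.0, 40.0]]
ground_truth = [[10.5, 25.0], [None, 40.0]]
# One of the three valid pixels is off by more than 3 disparity levels.
print(erroneous_pixel_rate(predicted, ground_truth))
```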

For the visual odometry/SLAM task, evaluation is based on the trajectory end-point error. Conventional methods consider translation and rotation errors jointly, whereas KITTI evaluates them separately [1].

Figure 6: Stereo and optical flow predictions and their evaluation
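To make the separate treatment of translation and rotation errors in visual odometry concrete, the sketch below assumes the difference between an estimated and a ground-truth motion has already been expressed as a 3x3 rotation matrix and a translation vector. The function names are hypothetical, and the official evaluation code may normalize or aggregate these quantities differently.

```python
import math


def rotation_error(rotation):
    """Rotation angle in radians of a 3x3 rotation matrix (nested lists)."""
    trace = rotation[0][0] + rotation[1][1] + rotation[2][2]
    # Clamp to the valid acos domain to absorb rounding noise.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_angle)


def translation_error(translation):
    """Euclidean length of the residual translation vector."""
    return math.sqrt(sum(component * component for component in translation))


# Residual pose: a 90 degree turn about the z axis plus a 5 m offset.
residual_rotation = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
residual_translation = (3.0, 4.0, 0.0)
print(rotation_error(residual_rotation), translation_error(residual_translation))
```

Reporting the two numbers side by side keeps a method with good heading but poor scale from being confused with one that has the opposite weakness.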

4.2 3D object detection and orientation estimation

Object detection must achieve both localization and recognition. Localization is judged by comparing the overlap between the predicted box and the ground-truth box (Intersection over Union, IoU) against a threshold (e.g., 0.5). Recognition is judged by comparing the confidence score against a threshold. Together these two steps decide whether a detection is correct, which reduces multi-class detection to a binary problem per class: an object of a given class is either detected correctly or detected incorrectly. A confusion matrix can then be built, and the usual classification metrics can be used to assess the model's accuracy.
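The two checks described above can be sketched for axis-aligned 2D boxes as follows. Boxes are given as (left, top, right, bottom), the function names are hypothetical, and both default thresholds are illustrative values rather than settings of any official evaluation.

```python
def iou_2d(box_a, box_b):
    """Intersection over Union of two (left, top, right, bottom) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_correct_detection(pred_box, pred_score, gt_box,
                         iou_threshold=0.5, score_threshold=0.5):
    """A detection counts as correct only if both checks pass."""
    confident = pred_score >= score_threshold
    well_localized = iou_2d(pred_box, gt_box) >= iou_threshold
    return confident and well_localized


# Two 10x10 boxes shifted by 5 pixels overlap in a 5x10 region: 50 / 150.
print(iou_2d((0, 0, 10, 10), (5, 0, 15, 10)))
```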

KITTI uses the average precision (AP) of [3] to evaluate the results of single-class object detection. The PASCAL Visual Object Classes Challenge 2007 (VOC2007) [3] uses the precision-recall curve for qualitative analysis and average precision (AP) to quantify model accuracy. This evaluation criterion penalizes both missed and wrong detections. It also specifies that multiple correct detections of the same object count only once, with the redundant detections treated as errors (false positives).

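$$AP = \frac{1}{11} \sum_{r \in \{0, 0.1, \ldots, 1\}} \max_{\tilde{r} : \tilde{r} \ge r} p(\tilde{r})$$

where p(r̃) is the precision measured at recall r̃.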

For the KITTI object detection task, only objects taller than 25 pixels in the image are evaluated, easily confused classes are treated as the same class to reduce the false positive rate, and the classifier's AP is approximated by the mean of the precision values at 41 equally spaced recall levels.
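Averaging precision over equally spaced recall levels can be sketched as below. The sketch uses the interpolated precision of the PASCAL VOC style (the best precision at any recall at or above each level) and leaves the number of levels as a parameter, so it is an illustration of the idea and not a reimplementation of evaluate_object.cpp. The function name is hypothetical.

```python
def interpolated_ap(recalls, precisions, num_points=11):
    """Mean interpolated precision over equally spaced recall levels in [0, 1].

    recalls and precisions are parallel lists describing points on the
    precision-recall curve. At each level the best precision achieved at
    any recall >= that level is used, or 0 if the curve never reaches it.
    """
    total = 0.0
    for i in range(num_points):
        level = i / (num_points - 1)
        reachable = [p for r, p in zip(recalls, precisions) if r >= level]
        total += max(reachable) if reachable else 0.0
    return total / num_points


recalls = [0.5, 1.0]
precisions = [1.0, 0.5]
print(interpolated_ap(recalls, precisions, num_points=11))
print(interpolated_ap(recalls, precisions, num_points=41))
```

With more recall levels the estimate follows the shape of the curve more closely, at the cost of a few more lookups.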

For object orientation estimation, [1] proposes a novel metric, the Average Orientation Similarity (AOS), defined as:

$$AOS = \frac{1}{11} \sum_{r \in \{0, 0.1, \ldots, 1\}} \max_{\tilde{r} : \tilde{r} \ge r} s(\tilde{r})$$

Here r is the recall of object detection. The orientation similarity s ∈ [0,1] at recall r is a normalized variant of the cosine similarity between the predictions and the ground truth:

$$s(r) = \frac{1}{|D(r)|} \sum_{i \in D(r)} \frac{1 + \cos \Delta_\theta^{(i)}}{2} \, \delta_i$$

Here D(r) is the set of all detections at recall r, and ∆θ(i) is the difference between the predicted and ground-truth orientation of detection i. To penalize multiple detections that match the same ground truth, δi = 1 if detection i has been assigned to a ground-truth box (IoU of at least 50%), and δi = 0 otherwise.
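A small sketch of these definitions, following the description in [1]: each detection contributes (1 + cos ∆θ) / 2, unmatched detections contribute 0, and the result is averaged over all detections at that recall. AOS then averages the best similarity reachable at or above each recall level. The function names, the list-based inputs, and the default of 11 recall levels are assumptions of this sketch.

```python
import math


def orientation_similarity(angle_errors, matched):
    """s(r): mean of (1 + cos(delta_theta)) / 2 * delta over all detections.

    angle_errors holds the orientation differences in radians and matched
    holds 1 for detections assigned to a ground-truth box, else 0.
    """
    if not angle_errors:
        return 0.0
    total = sum((1.0 + math.cos(err)) / 2.0 * flag
                for err, flag in zip(angle_errors, matched))
    return total / len(angle_errors)


def average_orientation_similarity(recalls, similarities, num_points=11):
    """AOS: mean over recall levels of the best s at any recall >= the level."""
    total = 0.0
    for i in range(num_points):
        level = i / (num_points - 1)
        reachable = [s for r, s in zip(recalls, similarities) if r >= level]
        total += max(reachable) if reachable else 0.0
    return total / num_points


# One perfectly oriented detection and one pointing the opposite way.
print(orientation_similarity([0.0, math.pi], [1, 1]))  # 0.5
```

Because unmatched detections add 0 to the sum but still count in the denominator, AOS can never exceed the average precision computed over the same detections.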

5. Using the dataset in practice

KITTI's annotations are rich. In practice, only some of the fields may be needed, or the data may have to be converted to the format of another dataset. For example, KITTI can be converted to the PASCAL VOC format, which makes it easier to train detectors such as Faster R-CNN or SSD. When converting, pay attention to the formats of the source and target datasets and to the remapping of class labels. For implementation details, see Jesse_Mx [4] and manutdzou's open-source project on GitHub [5], both of which describe converting KITTI to the PASCAL VOC format.
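A conversion of this kind can be sketched with the standard library alone. The class mapping below is a hypothetical choice made for illustration (it merges vehicle types and drops 'Misc' and 'DontCare') and is not necessarily the mapping used in [4] or [5]. The sketch also ignores the difference between KITTI's 0-based and VOC's 1-based pixel coordinates, and it writes only the core VOC fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical remapping: classes missing from this table are dropped.
CLASS_MAP = {
    "Car": "car", "Van": "car", "Truck": "car", "Tram": "car",
    "Pedestrian": "pedestrian", "Person_sitting": "pedestrian",
    "Cyclist": "cyclist",
}


def kitti_to_voc_xml(label_text, filename, width, height, depth=3):
    """Build a VOC-style annotation string from the text of a KITTI label file."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", depth)):
        ET.SubElement(size, tag).text = str(value)

    for line in label_text.splitlines():
        fields = line.split()
        if len(fields) < 15:
            continue  # skip blank or malformed lines
        name = CLASS_MAP.get(fields[0])
        if name is None:
            continue  # class is not part of the target label set
        left, top, right, bottom = (float(v) for v in fields[4:8])
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        ET.SubElement(obj, "difficult").text = "0"
        box = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmin", left), ("ymin", top),
                           ("xmax", right), ("ymax", bottom)):
            ET.SubElement(box, tag).text = str(int(round(value)))
    return ET.tostring(root, encoding="unicode")


label = "Van 0.00 0 -1.50 100.40 50.60 200.70 150.20 2.0 2.0 5.0 1.0 1.0 20.0 -1.50"
# The image size here is an example value; read it from the actual image.
print(kitti_to_voc_xml(label, "000000.png", 1242, 375))
```

Keeping the class mapping in a single table makes it easy to try a different grouping without touching the conversion logic.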

References

  1. Andreas Geiger, Philip Lenz, and Raquel Urtasun. Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. CVPR, 2012.

  2. Andreas Geiger, Philip Lenz, Christoph Stiller, and Raquel Urtasun. Vision meets Robotics: The KITTI Dataset. IJRR, 2013.

  3. M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes Challenge 2011 (VOC2011) Results.

  4. Jesse_Mx. SSD: Single Shot MultiBox Detector, training on the KITTI dataset (1). http://blog.csdn.net/jesse_mx/article/details/65634482

  5. manutdzou. manutdzou/KITTI_SSD. https://github.com/manutdzou/KITTI_SSD

Appendix

Figure 7: Frequency of the different object classes in the dataset (top); histograms of orientation for the two main classes, vehicles and pedestrians (bottom).

Figure 8: Number of objects of each class per image.

Figure 9: Histograms of velocity and acceleration (excluding static states), of video sequence length, and of the number of frames per scene category (e.g., Campus, City).
