Person Re-identification (ReID): Market-1501 Dataset Description

2022-07-06 15:08:00 gmHappy



Dataset overview

The Market-1501 dataset was collected on the campus of Tsinghua University in summer, and was built and released in 2015. It was captured by 6 cameras (5 high-definition cameras and 1 low-definition camera) and contains 1,501 pedestrians and 32,668 detected pedestrian bounding boxes. Each pedestrian is captured by at least 2 cameras and may have multiple images from a single camera. The training set has 751 identities and 12,936 images, an average of 17.2 training images per person; the test set has 750 identities and 19,732 images, an average of 26.3 test images per person. The bounding boxes of the 3,368 query images were drawn by hand, while the bounding boxes in the gallery were produced by the DPM detector. The dataset provides a fixed split into training and test sets, and can be used under either single-shot or multi-shot test settings.
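As a quick sanity check, the figures quoted above are consistent with each other. The short Python snippet below only re-does that arithmetic; the variable names are made up for illustration and nothing here reads the dataset itself.

```python
# Figures quoted in the overview above.
train_ids, train_images = 751, 12_936
test_ids, test_images = 750, 19_732

total_ids = train_ids + test_ids          # identities in the whole dataset
total_boxes = train_images + test_images  # bounding boxes in the whole dataset

# Average number of images per identity, rounded to one decimal place.
avg_train = round(train_images / train_ids, 1)
avg_test = round(test_images / test_ids, 1)

print(total_ids, total_boxes, avg_train, avg_test)
```

The totals come out to 1,501 identities and 32,668 boxes, matching the overview.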

Directory structure

Market-1501
  ├── bounding_box_test
  │     ├── 0000_c1s1_000151_01.jpg
  │     ├── 0000_c1s1_000376_03.jpg
  │     ├── 0000_c1s1_001051_02.jpg
  ├── bounding_box_train
  │     ├── 0002_c1s1_000451_03.jpg
  │     ├── 0002_c1s1_000551_01.jpg
  │     ├── 0002_c1s1_000801_01.jpg
  ├── gt_bbox
  │     ├── 0001_c1s1_001051_00.jpg
  │     ├── 0001_c1s1_009376_00.jpg
  │     ├── 0001_c2s1_001976_00.jpg
  ├── gt_query
  │     ├── 0001_c1s1_001051_00_good.mat
  │     ├── 0001_c1s1_001051_00_junk.mat
  ├── query
  │     ├── 0001_c1s1_001051_00.jpg
  │     ├── 0001_c2s1_000301_00.jpg
  │     ├── 0001_c3s1_000551_00.jpg
  └── readme.txt

Directory description

1) “bounding_box_test”: the test set of 750 identities, containing 19,732 images. The prefix 0000 marks images that DPM detected wrongly while extracting these 750 identities (they may show the same person as a query); the prefix -1 marks detections of other people (not among these 750 identities).

2) “bounding_box_train”: the training set of 751 identities, containing 12,936 images.

3) “query”: for each of the 750 test identities, one image is randomly selected from each camera as a query, so one identity has at most 6 queries. There are 3,368 query images in total.

4) “gt_query”: MATLAB-format (.mat) files used to decide, for each query, which images are good matches (images of the same person from a different camera) and which are bad matches (images of the same person from the same camera, or images of a different person).

5) “gt_bbox”: hand-labeled bounding boxes, used to judge whether a bounding box detected by DPM is a good box.
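The prefixes described above make it easy to get a rough summary of any of the image folders. The following is a minimal sketch, not part of the dataset's own tooling: `summarize_folder` and its dictionary keys are hypothetical names, and it assumes only that every image name starts with the ID prefix followed by an underscore.

```python
from pathlib import Path


def summarize_folder(folder):
    """Count images and identities in one Market-1501 image folder.

    Hypothetical helper: it looks only at the ID prefix before the first
    underscore and treats "0000" and "-1" as the special prefixes
    described in the directory notes.
    """
    summary = {"labelled_images": 0, "wrong_detections": 0, "other_people": 0}
    identities = set()
    for path in sorted(Path(folder).glob("*.jpg")):
        prefix = path.name.split("_", 1)[0]
        if prefix == "0000":
            summary["wrong_detections"] += 1
        elif prefix == "-1":
            summary["other_people"] += 1
        else:
            summary["labelled_images"] += 1
            identities.add(prefix)
    summary["identities"] = len(identities)
    return summary
```

Calling `summarize_folder("Market-1501/bounding_box_train")` (the path is an assumption about where the archive was unpacked) should report 751 identities and 12,936 labelled images if the folder matches the description above.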

Naming rules

Take 0001_c1s1_000151_01.jpg as an example:

1) 0001 is the identity label of the person, ranging from 0001 to 1501;

2) c1 denotes the first camera (camera 1); there are 6 cameras in total;

3) s1 denotes the first video sequence (sequence 1); each camera has several video sequences;

4) 000151 denotes frame 000151 of c1s1; the video frame rate is 25 fps;

5) 01 denotes the first detection box on frame c1s1_000151. Because the DPM detector is used, several bounding boxes may be produced for a pedestrian on each frame. 00 denotes a hand-labeled box.
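The naming rules above map directly onto a regular expression. Below is a minimal sketch of a parser; `Market1501Name` and `parse_name` are illustrative names rather than anything shipped with the dataset, and the pattern assumes a single-digit camera index (there are 6 cameras) and accepts the `-1` prefix used in bounding_box_test.

```python
import re
from typing import NamedTuple


class Market1501Name(NamedTuple):
    person_id: int  # 1..1501; 0 for the "0000" prefix, -1 for the "-1" prefix
    camera: int     # 1..6
    sequence: int   # video sequence index within the camera
    frame: int      # frame number within the sequence
    box: int        # 0 = hand-labeled box, >= 1 = index of the DPM detection


# <id>_c<camera>s<sequence>_<frame>_<box>.jpg
_PATTERN = re.compile(r"^(-1|\d{4})_c(\d)s(\d+)_(\d+)_(\d+)\.jpg$")


def parse_name(filename: str) -> Market1501Name:
    """Split a Market-1501 style image name into its numeric fields."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a Market-1501 style file name: {filename!r}")
    return Market1501Name(*(int(group) for group in match.groups()))


info = parse_name("0001_c1s1_000151_01.jpg")
seconds = info.frame / 25  # 25 fps, so this is the offset within the sequence
```

For the example name, the fields are person 1, camera 1, sequence 1, frame 151, and detection box 1.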

Test protocol

Cumulative Matching Characteristics (CMC) curves are currently the most popular evaluation method in person re-identification. Consider the simple single-gallery-shot case, in which each gallery identity has only one instance. For each query, the algorithm ranks all gallery samples by their distance to the query image, from smallest to largest. The CMC top-k accuracy is then computed as follows:


Acc_k = 1, if the top-k ranked gallery samples contain the query identity
Acc_k = 0, otherwise

This is a shifted step function. The final CMC curve is obtained by averaging the shifted step functions over all queries. CMC is clearly defined in the single-gallery-shot setting, but its definition is ambiguous in the multi-gallery-shot setting, because each gallery identity may have multiple instances.
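To make the averaging of step functions concrete, here is a minimal pure-Python sketch for the single-gallery-shot case. The function name `cmc_curve` and the nested-list distance matrix are assumptions made for illustration; a real pipeline would more likely work on arrays of feature distances.

```python
def cmc_curve(dist, query_ids, gallery_ids, max_rank=5):
    """CMC top-k accuracy for k = 1..max_rank.

    dist[i][j] is the distance between query i and gallery sample j.
    Assumes the single-gallery-shot setting: each identity appears at
    most once in the gallery.
    """
    hits = [0] * max_rank
    for row, query_id in zip(dist, query_ids):
        # Gallery indices sorted by increasing distance to this query.
        order = sorted(range(len(gallery_ids)), key=lambda j: row[j])
        ranked_ids = [gallery_ids[j] for j in order]
        if query_id not in ranked_ids:
            continue  # no match in the gallery: Acc_k = 0 for every k
        first_hit = ranked_ids.index(query_id)  # 0-based rank of the match
        # Shifted step function: Acc_k = 1 for every k beyond the first hit.
        for k in range(first_hit, max_rank):
            hits[k] += 1
    return [h / len(query_ids) for h in hits]


# Toy example: three queries whose correct match is ranked 1st, 2nd and 3rd.
toy_dist = [
    [0.1, 0.5, 0.9],
    [0.6, 0.4, 0.2],
    [0.3, 0.2, 0.9],
]
toy_cmc = cmc_curve(toy_dist, [1, 2, 3], [1, 2, 3], max_rank=3)
```

In the toy example the curve rises by one third at each rank, since exactly one more query finds its match at each step.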

In Market-1501, the query and gallery sets may come from the same camera views, but for each query identity, his/her gallery samples from the same camera are excluded. The protocol does not randomly sample a single instance for each gallery identity; all instances stay in the gallery. As a result, when CMC is computed, a query always matches the "easiest" positive sample in the gallery, and the other, harder positive samples are not taken into account. The bounding_box_test folder holds the gallery samples, the bounding_box_train folder holds the training samples, and the query folder holds the query samples.
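The same-camera exclusion can be sketched as a small variation on the ranking step. This is an illustrative sketch only: `first_match_rank` is a hypothetical helper, it implements just the rule stated above, and it is not a port of the official evaluation script.

```python
def first_match_rank(dist_row, query_id, query_cam, gallery_ids, gallery_cams):
    """Return the 1-based rank of the easiest true match, or None if none exists.

    dist_row[j] is the distance from the query to gallery sample j.
    Gallery samples showing the query identity from the query's own
    camera are dropped before ranks are counted.
    """
    # Gallery indices sorted by increasing distance to the query.
    order = sorted(range(len(gallery_ids)), key=lambda j: dist_row[j])
    rank = 0
    for j in order:
        same_id = gallery_ids[j] == query_id
        same_cam = gallery_cams[j] == query_cam
        if same_id and same_cam:
            continue  # excluded: same person seen by the query's camera
        rank += 1
        if same_id:
            return rank  # first (easiest) cross-camera positive
    return None


# Query: identity 1 seen by camera 1.
toy_rank = first_match_rank(
    dist_row=[0.1, 0.9, 0.2, 0.5],
    query_id=1,
    query_cam=1,
    gallery_ids=[1, 1, 2, 1],
    gallery_cams=[1, 2, 3, 3],
)
```

Here the closest gallery sample is skipped because it comes from the query's own camera, the next one is a different person, and the match at distance 0.5 lands at rank 2. The harder positive at distance 0.9 has no effect on the result, which is exactly the weakness discussed above.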

As the discussion above shows, CMC evaluation is flawed in the multi-gallery-shot setting. For this reason, mAP (mean average precision) is also used as an evaluation metric. For each query, the average precision (AP) can be viewed as the area under the precision-recall (PR) curve; mAP is the mean of AP over all queries.
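A minimal sketch of the metric follows. It uses the common non-interpolated definition of AP (the mean of the precision values at each correct result); this is an assumption, since evaluation scripts may integrate the PR curve differently, so absolute numbers can differ slightly between implementations. The function names are illustrative.

```python
def average_precision(ranked_matches):
    """AP of one ranked list.

    ranked_matches[i] is True if the i-th ranked gallery sample shows
    the query identity. Non-interpolated AP is assumed here.
    """
    hits = 0
    precision_sum = 0.0
    for position, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / position  # precision at this recall point
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_ranked_matches):
    """mAP: the mean of AP over all queries."""
    aps = [average_precision(ranked) for ranked in all_ranked_matches]
    return sum(aps) / len(aps)


# Two toy queries: positives at ranks 1 and 3, and a single positive at rank 2.
toy_map = mean_average_precision([[True, False, True], [False, True]])
```

Unlike CMC, every positive sample contributes to AP, so a method that retrieves only the easiest match early and the rest late is penalized.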

Download address


  1.  Google Drive
  2.  Baidu Disk

State of the art

Citation

If you use this dataset, please kindly cite this paper:

       
@inproceedings{zheng2015scalable,
title={Scalable Person Re-identification: A Benchmark},
author={Zheng, Liang and Shen, Liyue and Tian, Lu and Wang, Shengjin and Wang, Jingdong and Tian, Qi},
booktitle={Computer Vision, IEEE International Conference on},
year={2015}
}

References


  • Zheng, Liang, et al. "Scalable Person Re-identification: A Benchmark." IEEE International Conference on Computer Vision, IEEE Computer Society, 2015: 1116-1124.
  • Liang Zheng
  • Person re-ID


Original site

Copyright notice
This article was written by [gmHappy]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202131320217335.html