
Person re-identification (ReID): Market-1501 dataset description

2022-07-06 15:08:00 【gmHappy】

Dataset overview

The Market-1501 dataset was collected on the campus of Tsinghua University in summer and was built and released in 2015. It contains 1,501 pedestrians and 32,668 detected pedestrian bounding boxes captured by 6 cameras (5 high-definition cameras and 1 low-definition camera). Each pedestrian is captured by at least 2 cameras and may have multiple images under a single camera. The training set has 751 identities and 12,936 images, an average of 17.2 training images per person; the test set has 750 identities and 19,732 images, an average of 26.3 test images per person. The bounding boxes of the 3,368 query images are drawn by hand, while the gallery bounding boxes are produced by the DPM detector. The dataset provides a fixed training/test split that can be used under either single-shot or multi-shot test settings.

Directory structure

Market-1501
├── bounding_box_test
│   ├── 0000_c1s1_000151_01.jpg
│   ├── 0000_c1s1_000376_03.jpg
│   ├── 0000_c1s1_001051_02.jpg
├── bounding_box_train
│   ├── 0002_c1s1_000451_03.jpg
│   ├── 0002_c1s1_000551_01.jpg
│   ├── 0002_c1s1_000801_01.jpg
├── gt_bbox
│   ├── 0001_c1s1_001051_00.jpg
│   ├── 0001_c1s1_009376_00.jpg
│   ├── 0001_c2s1_001976_00.jpg
├── gt_query
│   ├── 0001_c1s1_001051_00_good.mat
│   ├── 0001_c1s1_001051_00_junk.mat
├── query
│   ├── 0001_c1s1_001051_00.jpg
│   ├── 0001_c2s1_000301_00.jpg
│   ├── 0001_c3s1_000551_00.jpg
└── readme.txt

Directory contents

1) "bounding_box_test": the test set, with 750 identities and 19,732 images. The prefix 0000 marks boxes that DPM detected incorrectly while extracting these 750 identities (they may show the same person as a query), and the prefix -1 marks detections of other people (not among the 750).

2) "bounding_box_train": the training set, with 751 identities and 12,936 images.

3) "query": for each of the 750 test identities, one image is randomly selected from each camera as a query, so a person has at most 6 queries. There are 3,368 query images in total.

4) "gt_query": MATLAB-format files that record, for each query, which images are good matches (images of the same person from a different camera) and which are junk matches (images of the same person from the same camera, or images of a different person).

5) "gt_bbox": hand-labeled bounding boxes, used to judge whether a bounding box detected by DPM is a good box.

Naming rules

Take 0001_c1s1_000151_01.jpg as an example:

1) 0001 is the person's label number, ranging from 0001 to 1501;

2) c1 is the first camera (camera1); there are 6 cameras in total;

3) s1 is the first video sequence (sequence1); each camera has several video sequences;

4) 000151 is frame 000151 of c1s1; the video frame rate is 25 fps;

5) 01 is the first detection box on frame c1s1_000151. Because the DPM detector is used, several bounding boxes may be produced for a pedestrian on each frame. 00 indicates a hand-labeled box.
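The naming rules above can be turned into a small parser. The following is a minimal sketch, not part of the dataset tooling: the names `MarketName` and `parse_market_name` are hypothetical, and the regular expression assumes the field layout described in this section, with the person field allowed to be -1 as mentioned for bounding_box_test.

```python
import re
from typing import NamedTuple, Optional

# Matches names such as 0001_c1s1_000151_01.jpg. The first field may also be
# -1, as described for bounding_box_test. (Assumed layout, based on this post.)
_NAME_RE = re.compile(r"^(-1|\d{4})_c(\d)s(\d+)_(\d+)_(\d{2})")


class MarketName(NamedTuple):  # hypothetical helper type
    person_id: int  # 1..1501; 0 for the 0000 prefix; -1 for the -1 prefix
    camera: int     # camera index, 1..6
    sequence: int   # video sequence index within the camera
    frame: int      # frame number within the sequence
    box: int        # 0 = hand-labeled box, >= 1 = index of a DPM detection


def parse_market_name(filename: str) -> Optional[MarketName]:
    """Return the parsed fields, or None if the name does not follow the rules."""
    match = _NAME_RE.match(filename)
    if match is None:
        return None
    return MarketName(*(int(group) for group in match.groups()))


print(parse_market_name("0001_c1s1_000151_01.jpg"))
```

Files that do not follow the pattern, such as readme.txt, yield None, so the function can be applied to a whole directory listing.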

Test protocol

Cumulative Matching Characteristics (CMC) curves are currently the most popular performance evaluation method in person re-identification. Consider a simple single-gallery-shot setting, in which each gallery identity has only one instance. For each query, the algorithm ranks all gallery samples by their distance to the query image, from smallest to largest. The CMC top-k accuracy is then computed as follows:

       
        Acc_k = 1, if the top-k ranked gallery samples contain the query identity
        Acc_k = 0, otherwise

This is a shifted step function. The final CMC curve is obtained by averaging the shifted step functions of all queries. CMC is clearly defined in the single-gallery-shot setting, but its definition is ambiguous in the multi-gallery-shot setting, because each gallery identity may have multiple instances.
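As an illustration of the averaging described above, here is a minimal pure-Python sketch of a CMC curve. The function name `cmc_curve` and its arguments are hypothetical, and the input is assumed to be a precomputed query-by-gallery distance matrix; it is not the official evaluation code.

```python
from typing import List, Sequence


def cmc_curve(dist: Sequence[Sequence[float]],
              query_ids: Sequence[int],
              gallery_ids: Sequence[int],
              max_rank: int) -> List[float]:
    """Average the per-query step functions Acc_1 .. Acc_max_rank."""
    hits = [0] * max_rank
    for q, row in enumerate(dist):
        # Rank gallery indices by ascending distance to this query.
        order = sorted(range(len(row)), key=lambda g: row[g])
        # 0-based rank of the first gallery sample with the query identity.
        first = next((rank for rank, g in enumerate(order)
                      if gallery_ids[g] == query_ids[q]), None)
        if first is None:
            continue  # no correct match in the gallery: Acc_k = 0 for all k
        for k in range(first, max_rank):
            hits[k] += 1  # Acc_k = 1 from the first correct match onward
    return [h / len(dist) for h in hits]


# Two queries against a three-sample gallery (toy distances).
dist = [[0.1, 0.5, 0.9],
        [0.4, 0.6, 0.2]]
print(cmc_curve(dist, query_ids=[1, 2], gallery_ids=[1, 2, 3], max_rank=3))
```

In the toy example the first query finds its match at rank 1 and the second at rank 3, so the curve is 0.5 at ranks 1 and 2 and reaches 1.0 at rank 3.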

In Market-1501, the query and gallery sets may come from the same camera view, but for each query identity, the gallery samples of that person from the same camera are excluded. Gallery identities are not randomly sampled down to a single instance each. This means that when CMC is computed, a query is always matched against the "easiest" positive sample in the gallery, and the other positive samples that are harder to identify are not taken into account. The bounding_box_test folder holds the gallery samples, the bounding_box_train folder holds the training samples, and the query folder holds the query samples.
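The same-camera exclusion described above can be sketched as a simple filter. This is only an illustration of the rule as stated in this post; the function name `valid_gallery_indices` is hypothetical, and handling of the 0000 and -1 prefixes is deliberately left out.

```python
from typing import List, Sequence


def valid_gallery_indices(query_id: int,
                          query_cam: int,
                          gallery_ids: Sequence[int],
                          gallery_cams: Sequence[int]) -> List[int]:
    """Drop gallery samples of the query identity taken by the query's camera."""
    keep = []
    for index, (gid, gcam) in enumerate(zip(gallery_ids, gallery_cams)):
        if gid == query_id and gcam == query_cam:
            continue  # same person, same camera: excluded for this query
        keep.append(index)
    return keep


# Query: person 1 seen by camera 1.
print(valid_gallery_indices(1, 1,
                            gallery_ids=[1, 1, 2, 1],
                            gallery_cams=[1, 2, 1, 3]))
```

Only the first gallery sample is removed here: samples of the same person from other cameras stay as positives, and samples of other people stay regardless of camera.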

As shown above, CMC evaluation is flawed in the multi-gallery-shot setting. Therefore, mAP (mean average precision) is also used as an evaluation metric. For a single query, the average precision (AP) can be viewed as the area under its precision-recall (PR) curve; mAP is the mean of the AP over all queries.
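A minimal sketch of AP and mAP follows, using the common non-interpolated definition of average precision. The function names are hypothetical, the input is assumed to be one ranked list of match flags per query, and the official Market-1501 evaluation code may compute the area under the PR curve slightly differently.

```python
from typing import List, Sequence


def average_precision(ranked_matches: Sequence[bool]) -> float:
    """AP for one query; ranked_matches[i] is True if the i-th ranked
    gallery sample has the query identity."""
    num_positives = sum(ranked_matches)
    if num_positives == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / num_positives


def mean_average_precision(all_ranked: List[Sequence[bool]]) -> float:
    """Mean of the per-query AP values."""
    return sum(average_precision(r) for r in all_ranked) / len(all_ranked)


print(mean_average_precision([[True, False, True], [False, True]]))
```

Unlike CMC, every positive sample contributes to the score, so a query whose harder positives are ranked low is penalized even if its easiest positive is ranked first.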

Market-1501 Evaluation Code

Download address

State of the art

State of the art on the Market-1501 dataset

Citation

If you use this dataset, please kindly cite this paper:

       
        @inproceedings{zheng2015scalable,
          title={Scalable Person Re-identification: A Benchmark},
          author={Zheng, Liang and Shen, Liyue and Tian, Lu and Wang, Shengjin and Wang, Jingdong and Tian, Qi},
          booktitle={Computer Vision, IEEE International Conference on},
          year={2015}
        }

References

Zheng, Liang, et al. "Scalable Person Re-identification: A Benchmark." IEEE International Conference on Computer Vision, IEEE Computer Society, 2015: 1116-1124.
Liang Zheng
Person re-ID

Original site

Copyright notice
This article was created by [gmHappy]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/02/202202131320217335.html