Person re-identification (ReID): description of the Market-1501 dataset
2022-07-06 15:08:00 【gmHappy】
Dataset overview
The Market-1501 dataset was collected on the campus of Tsinghua University in summer, and was built and released in 2015. It was captured by 6 cameras (5 high-definition cameras and 1 low-definition camera) and contains 1,501 pedestrians and 32,668 detected pedestrian bounding boxes. Each pedestrian is captured by at least 2 cameras, and may have multiple images under a single camera. The training set covers 751 people with 12,936 images, an average of 17.2 training images per person; the test set covers 750 people with 19,732 images, an average of 26.3 test images per person. The bounding boxes of the 3,368 query images were drawn by hand, while the bounding boxes in the gallery were produced by the DPM detector. The dataset provides a fixed training/test split that can be used under either the single-shot or the multi-shot test setting.
Directory structure
Market-1501
├── bounding_box_test
│   ├── 0000_c1s1_000151_01.jpg
│   ├── 0000_c1s1_000376_03.jpg
│   ├── 0000_c1s1_001051_02.jpg
├── bounding_box_train
│   ├── 0002_c1s1_000451_03.jpg
│   ├── 0002_c1s1_000551_01.jpg
│   ├── 0002_c1s1_000801_01.jpg
├── gt_bbox
│   ├── 0001_c1s1_001051_00.jpg
│   ├── 0001_c1s1_009376_00.jpg
│   ├── 0001_c2s1_001976_00.jpg
├── gt_query
│   ├── 0001_c1s1_001051_00_good.mat
│   ├── 0001_c1s1_001051_00_junk.mat
├── query
│   ├── 0001_c1s1_001051_00.jpg
│   ├── 0001_c2s1_000301_00.jpg
│   ├── 0001_c3s1_000551_00.jpg
└── readme.txt
Directory contents
1) "bounding_box_test": the test set of 750 people, containing 19,732 images. The prefix 0000 marks images that DPM detected wrongly while extracting these 750 people (they may show the same person as a query); the prefix -1 marks detections of other people (not among these 750).
2) "bounding_box_train": the training set of 751 people, containing 12,936 images.
3) "query": for each of the 750 people, one image is randomly selected from each camera as a query, so a person has at most 6 queries; there are 3,368 images in total.
4) "gt_query": MATLAB-format files used to decide, for each query, which images are good matches (images of the same person from a different camera) and which are bad matches (images of the same person from the same camera, or images of a different person).
5) "gt_bbox": hand-labeled bounding boxes, used to judge whether a bounding box detected by DPM is a good box.
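As a quick sanity check on a downloaded copy, the image counts quoted above can be compared with what is actually on disk. The following is a minimal sketch, assuming the dataset has been unpacked into a local folder with the sub-folder names listed above; the function name `count_images` and the root path are hypothetical.

```python
from pathlib import Path


def count_images(root):
    """Count the .jpg files in each image folder of a Market-1501 copy.

    `root` is a hypothetical local path to the unpacked dataset.
    Folders that do not exist are reported with a count of 0.
    """
    root = Path(root)
    counts = {}
    for name in ("bounding_box_train", "bounding_box_test", "query", "gt_bbox"):
        folder = root / name
        if folder.is_dir():
            counts[name] = sum(1 for _ in folder.glob("*.jpg"))
        else:
            counts[name] = 0
    return counts
```

According to the figures in this article, a complete copy should report 12,936 images for bounding_box_train, 19,732 for bounding_box_test and 3,368 for query.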
Naming rules
Take 0001_c1s1_000151_01.jpg as an example:
1) 0001 is the label (ID) of the person, ranging from 0001 to 1501;
2) c1 is the first camera (camera1); there are 6 cameras in total;
3) s1 is the first video sequence (sequence1); each camera has several video sequences;
4) 000151 is frame 000151 of c1s1; the video frame rate is 25 fps;
5) 01 is the first detection box in frame c1s1_000151. Because the boxes come from the DPM detector, a pedestrian in a frame may be covered by several bounding boxes. 00 indicates a hand-labeled box.
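The naming rules above can be turned into a small parser. This is only a sketch: the regular expression, the `ImageInfo` type and the function name `parse_market1501_name` are illustrative assumptions, not part of the dataset's own tooling. The optional minus sign covers the -1 prefix mentioned for bounding_box_test.

```python
import re
from typing import NamedTuple

# Matches names such as 0001_c1s1_000151_01.jpg (assumed pattern, derived
# from the naming rules described in the text).
_NAME_RE = re.compile(r"^(-?\d+)_c(\d+)s(\d+)_(\d+)_(\d+)")


class ImageInfo(NamedTuple):
    person_id: int   # 0001 -> 1; 0000 and -1 are the special prefixes
    camera: int      # c1 -> 1
    sequence: int    # s1 -> 1
    frame: int       # 000151 -> 151
    box_index: int   # 01 -> 1; 00 means a hand-labeled box


def parse_market1501_name(filename: str) -> ImageInfo:
    """Split a Market-1501 file name into its numeric fields."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return ImageInfo(*(int(group) for group in match.groups()))
```

For example, `parse_market1501_name("0001_c1s1_000151_01.jpg")` yields person 1, camera 1, sequence 1, frame 151, box 1. The person ID and camera fields are the ones needed later when same-camera gallery samples are filtered out during evaluation.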
Test protocol
Cumulative Matching Characteristics (CMC) curves are currently the most popular evaluation method in person re-identification. Consider a simple single-gallery-shot setting, in which each gallery identity has only one instance. For each query, the algorithm ranks all gallery samples by their distance to the query image in ascending order, and the CMC top-k accuracy is computed as follows:
Acc_k = 1, if top-k ranked gallery samples contain query identity
Acc_k = 0, otherwise
This is a shifted step function. The final CMC curve is obtained by averaging the shifted step functions over all queries. While CMC is clearly defined in the single-gallery-shot setting, its definition is ambiguous in the multi-gallery-shot setting, because each gallery identity may have multiple instances.
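The averaging of shifted step functions can be sketched as follows. This is a minimal illustration using NumPy, not the official evaluation code; the function name `cmc_curve`, the `max_rank` parameter and the input layout (a query-by-gallery distance matrix plus two identity arrays) are assumptions.

```python
import numpy as np


def cmc_curve(dist, query_ids, gallery_ids, max_rank=5):
    """Average the per-query step functions Acc_k into a CMC curve.

    dist[q, g] is the distance between query q and gallery sample g.
    Returns an array whose entry k-1 is the top-k accuracy.
    """
    dist = np.asarray(dist, dtype=float)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)
    max_rank = min(max_rank, dist.shape[1])
    acc = np.zeros(max_rank)
    for q in range(dist.shape[0]):
        # Rank gallery samples from the smallest to the largest distance.
        order = np.argsort(dist[q], kind="stable")
        hits = gallery_ids[order] == query_ids[q]
        if not hits.any():
            continue  # no correct match: Acc_k stays 0 for every k
        first_hit = int(np.argmax(hits))
        # Step function: 0 before the first correct match, 1 from there on.
        acc[first_hit:] += 1
    return acc / dist.shape[0]
```

Only the position of the first correct match matters for each query, which is why the curve says nothing about how the remaining positives are ranked.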
In Market-1501, the query and gallery sets may share camera views, but for each query identity, the gallery samples of that person taken by the same camera are excluded. For each gallery identity, the protocol does not randomly sample a single instance; all instances are kept. As a result, when CMC is computed, a query always matches the "easiest" positive sample in the gallery, and the harder positive samples are ignored. The bounding_box_test folder holds the gallery samples, the bounding_box_train folder holds the training samples, and the query folder holds the query samples.
As shown above, the CMC evaluation is flawed in the multi-gallery-shot setting. Therefore, mAP (mean average precision) is also used as an evaluation metric. The average precision (AP) of a query can be viewed as the area under its precision-recall (PR) curve, and mAP is the mean of AP over all queries.
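A per-query AP with the same-camera exclusion described above might look like the sketch below. It uses the plain (non-interpolated) definition of AP, the mean of the precision values at the ranks of the correct matches; the official evaluation code may approximate the area under the PR curve differently. The function names and input layout are assumptions.

```python
import numpy as np


def average_precision(dist_row, query_id, query_cam, gallery_ids, gallery_cams):
    """AP for one query; returns None if no valid positive remains."""
    dist_row = np.asarray(dist_row, dtype=float)
    gallery_ids = np.asarray(gallery_ids)
    gallery_cams = np.asarray(gallery_cams)
    # Exclude gallery images of the query identity taken by the query's camera.
    keep = ~((gallery_ids == query_id) & (gallery_cams == query_cam))
    order = np.argsort(dist_row[keep], kind="stable")
    hits = gallery_ids[keep][order] == query_id
    if not hits.any():
        return None
    # Precision at every rank, then averaged over the ranks of correct matches.
    precision_at_k = np.cumsum(hits) / (np.arange(hits.size) + 1)
    return float(precision_at_k[hits].mean())


def mean_average_precision(dist, query_ids, query_cams, gallery_ids, gallery_cams):
    """Mean of AP over all queries that have at least one valid positive."""
    dist = np.asarray(dist, dtype=float)
    aps = []
    for q in range(dist.shape[0]):
        ap = average_precision(dist[q], query_ids[q], query_cams[q],
                               gallery_ids, gallery_cams)
        if ap is not None:
            aps.append(ap)
    return float(np.mean(aps)) if aps else float("nan")
```

Unlike CMC, this score drops when harder positives of the same identity are ranked low, so it reflects how well all instances of a person are retrieved.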
Download address
State of the art
Citation
If you use this dataset, please kindly cite this paper:
@inproceedings{zheng2015scalable,
title={Scalable Person Re-identification: A Benchmark},
author={Zheng, Liang and Shen, Liyue and Tian, Lu and Wang, Shengjin and Wang, Jingdong and Tian, Qi},
booktitle={Computer Vision, IEEE International Conference on},
year={2015}
}
References
- Zheng, Liang, et al. "Scalable Person Re-identification: A Benchmark." IEEE International Conference on Computer Vision. IEEE Computer Society, 2015: 1116-1124.
- Liang Zheng
- Person re-ID