当前位置:网站首页>【数据集显示标注】VOC文件结构+数据集标注可视化+代码实现
【数据集显示标注】VOC文件结构+数据集标注可视化+代码实现
2022-07-27 17:59:00 【LiBiGo】
一、效果图:


显示:代码+常见报错===》正文开始↓
一、Pascal VOC数据集介绍
Pascal VOC网址:http://host.robots.ox.ac.uk/pascal/VOC/
训练/验证数据集下载(2G):host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
数据下载镜像网站(实测迅雷教育网速度很快):https://pjreddie.com/projects/pascal-voc-dataset-mirror/
VOCdevkit文件夹
数据集下载后解压得到一个名为VOCdevkit的文件夹,该文件夹结构如下:
└── VOCdevkit #根目录
└── VOC2012 #不同年份的数据集,这里只下载了2012的,还有2007等其它年份的
├── Annotations #存放xml文件,与JPEGImages中的图片一一对应,解释图片的内容等等
├── ImageSets #该目录下存放的都是txt文件,txt文件中每一行包含一个图片的名称,末尾会加上±1表示正负样本
│ ├── Action
│ ├── Layout
│ ├── Main
│ └── Segmentation
├── JPEGImages #存放源图片
├── SegmentationClass #存放的是图片,语义分割相关
└── SegmentationObject #存放的是图片,实例分割相关
1、JPEGImages
主要提供的是PASCAL VOC所提供的所有的图片信息,包括训练图片,测试图片
这些图像就是用来进行训练和测试验证的图像数据。

2、Annotations

主要存放xml格式的标签文件,每个xml对应JPEGImage中的一张图片
<annotation>
<folder>VOC2012</folder>
<filename>2007_000392.jpg</filename> //文件名
<source> //图像来源(不重要)
<database>The VOC2007 Database</database>
<annotation>PASCAL VOC2007</annotation>
<image>flickr</image>
</source>
<size> //图像尺寸(长宽以及通道数)
<width>500</width>
<height>332</height>
<depth>3</depth>
</size>
<segmented>1</segmented> //是否用于分割(在图像物体识别中01无所谓)
<object> //检测到的物体
<name>horse</name> //物体类别
<pose>Right</pose> //拍摄角度
<truncated>0</truncated> //是否被截断(0表示完整)
<difficult>0</difficult> //目标是否难以识别(0表示容易识别)
<bndbox> //bounding-box(包含左下角和右上角xy坐标)
<xmin>100</xmin>
<ymin>96</ymin>
<xmax>355</xmax>
<ymax>324</ymax>
</bndbox>
</object>
<object> //检测到多个物体
<name>person</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>198</xmin>
<ymin>58</ymin>
<xmax>286</xmax>
<ymax>197</ymax>
</bndbox>
</object>
</annotation>
3、ImageSets
- Action // 人的动作
- Layout // 人体的具体部位
- Main // 图像物体识别的数据,总共20类, 需要保证train val没有交集
- train.txt
- val.txt
- trainval.txt
- Segmentation // 用于分割的数据
4、SegmentationObject【实例分割相关】

5、SegmentationClass【语义分割相关】

二、VOC可视化数据集
1、作用
在做目标检测时,首先要检查标注数据。一方面是要了解标注的情况,另一方面是检查数据集的标注和格式是否正确,只有正确的情况下才能进行下一步的训练。
2、代码实现
import os
# import sys
import cv2
import random
from tqdm import tqdm
# import numpy as np
import argparse
import xml.etree.ElementTree as ET
def xml_reader(filename):
""" Parse a PASCAL VOC xml file """
tree = ET.parse(filename)
objects = []
for obj in tree.findall('object'):
obj_struct = {}
obj_struct['name'] = obj.find('name').text
bbox = obj.find('bndbox')
obj_struct['bbox'] = [int(bbox.find('xmin').text),
int(bbox.find('ymin').text),
int(bbox.find('xmax').text),
int(bbox.find('ymax').text)]
objects.append(obj_struct)
return objects
def get_image_list(image_dir, suffix=['jpg', 'png']):
'''get all image path ends with suffix'''
if not os.path.exists(image_dir):
print("PATH:%s not exists" % image_dir)
return []
imglist = []
for root, sdirs, files in os.walk(image_dir):
if not files:
continue
for filename in files:
filepath = os.path.join(root, filename)
if filename.split('.')[-1] in suffix:
imglist.append(filepath)
return imglist
if __name__ == "__main__":
parser = argparse.ArgumentParser(description='check data')
parser.add_argument('--input', dest='input', help='The input dir of images', type=str)
parser.add_argument('--output', dest='output', default='temp', help='The output dir of images', type=str)
parser.add_argument('--num', dest='num', default=50, help='The number of images you want to check', type=int)
args = parser.parse_args()
if not os.path.exists(args.output):
os.makedirs(args.output)
img_list = get_image_list(args.input)
img_list = random.sample(img_list, args.num)
for img_path in tqdm(img_list):
img = cv2.imread(img_path)
if img is None or not img.any():
continue
xml_path = img_path.replace("JPEGImages", "Annotations").replace(".jpg", ".xml").replace(".png", ".xml")
objects = xml_reader(xml_path)
if len(objects) == 0:
continue
# draw box and name
for obj in objects:
name = obj['name']
box = obj['bbox']
p1 = (box[0], box[1])
p2 = (box[2], box[3])
p3 = (max(box[0], 15), max(box[1], 15))
cv2.rectangle(img, p1, p2, (0, 0, 255), 2)
cv2.putText(img, name, p3, cv2.FONT_ITALIC, 1, (0, 255, 0), 2)
img_name = os.path.basename(img_path)
cv2.imwrite(os.path.join(args.output, img_name), img)3、使用方法
python Visual_dataset.py --input VOCdevkit/JPEGImages --output ./Result_imgs --num 3408
python 上述代码的文件名称 --input 图片地址 --output 输出文件夹地址 --num 图片数量
4、常见报错
(python38) D:\pythontorch\VOC>python Visual_dataset.py --input VOCdevkit/ImageSets --output Result_imgs --num 3408
Traceback (most recent call last):
File "Visual_dataset.py", line 55, in <module>
img_list = random.sample(img_list, args.num)
File "C:\ProgramData\Anaconda3\envs\python38\lib\random.py", line 363, in sample
raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative原因 你的路径写错了
边栏推荐
- [Alibaba security × ICDM 2022] 200000 bonus pool! The risk commodity inspection competition on the large-scale e-commerce map is in hot registration
- In 2019, the global semiconductor market revenue was $418.3 billion, a year-on-year decrease of 11.9%
- Interviewer: what is the abstract factory model?
- ES6--拓展运算符运用
- Jetpack Compose 性能优化指南——编译指标
- Leetcode:1498. Number of subsequences that meet the conditions [sort + bisection + power hash table]
- js跳转页面并刷新(本页面跳转)
- 盘点下互联网大厂的实习薪资:有了它,你也可以进厂
- 华为150人团队驰援,武汉“小汤山”5G基站火速开通!
- A new UI testing method: visual perception test
猜你喜欢

Office automation solution - docuware cloud is a complete solution to migrate applications and processes to the cloud

Flask-MDict搭建在线Mdict词典服务

Under the epidemic, I left my job for a year, and my income increased 10 times

If you want to switch to software testing, you should pass these three tests first, including a 3000 word super full test learning guide

Knowledge dry goods: basic storage service novice Experience Camp

2022.07.11

Data warehouse construction - DWD floor

Understand the wonderful use of dowanward API, and easily grasp kubernetes environment variables

Illustration leetcode - 592. Fraction addition and subtraction (difficulty: medium)

Simple application of multipoint bidirectional republication and routing strategy
随机推荐
什么是多层感知机(什么是多层感知机)
Common ways to keep requests idempotent
Huawei's mobile phone shipments exceed Apple's, ranking second in the world, but it faces a large amount of inventory that needs to be cleaned up
Pytorch multiplication and broadcasting mechanism
What is a multi-layer perceptron (what is a multi-layer perceptron)
To share the denoising methods and skills of redshift renderer, you must have a look
Conversion of Oracle date
办公自动化解决方案——DocuWare Cloud 将应用程序和流程迁移到云端的完整的解决方案
Technology sharing | how to do Assertion Verification in interface automated testing?
[rctf2015]easysql-1 | SQL injection
JVM overview and memory management (to be continued)
greedy
shell
Oracle simple advanced query
Mongodb learning notes: bson structure analysis
Illustration leetcode - 592. Fraction addition and subtraction (difficulty: medium)
My approval of OA project (Query & meeting signature)
Redis-基本了解,五大基本数据类型
站在巨人肩膀上学习,京东爆款架构师成长手册首发
Graphic leetcode - Sword finger offer II 115. reconstruction sequence (difficulty: medium)