当前位置:网站首页>【数据集显示标注】VOC文件结构+数据集标注可视化+代码实现
【数据集显示标注】VOC文件结构+数据集标注可视化+代码实现
2022-07-27 17:59:00 【LiBiGo】
一、效果图:


显示:代码+常见报错===》正文开始↓
一、Pascal VOC数据集介绍
Pascal VOC网址:http://host.robots.ox.ac.uk/pascal/VOC/
训练/验证数据集下载(2G):host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
数据下载镜像网站(实测迅雷教育网速度很快):https://pjreddie.com/projects/pascal-voc-dataset-mirror/
VOCdevkit文件夹
数据集下载后解压得到一个名为VOCdevkit的文件夹,该文件夹结构如下:
└── VOCdevkit #根目录
└── VOC2012 #不同年份的数据集,这里只下载了2012的,还有2007等其它年份的
├── Annotations #存放xml文件,与JPEGImages中的图片一一对应,解释图片的内容等等
├── ImageSets #该目录下存放的都是txt文件,txt文件中每一行包含一个图片的名称,末尾会加上±1表示正负样本
│ ├── Action
│ ├── Layout
│ ├── Main
│ └── Segmentation
├── JPEGImages #存放源图片
├── SegmentationClass #存放的是图片,语义分割相关
└── SegmentationObject #存放的是图片,实例分割相关
1、JPEGImages
主要提供的是PASCAL VOC所提供的所有的图片信息,包括训练图片,测试图片
这些图像就是用来进行训练和测试验证的图像数据。

2、Annotations

主要存放xml格式的标签文件,每个xml对应JPEGImage中的一张图片
<annotation>
<folder>VOC2012</folder>
<filename>2007_000392.jpg</filename> //文件名
<source> //图像来源(不重要)
<database>The VOC2007 Database</database>
<annotation>PASCAL VOC2007</annotation>
<image>flickr</image>
</source>
<size> //图像尺寸(长宽以及通道数)
<width>500</width>
<height>332</height>
<depth>3</depth>
</size>
<segmented>1</segmented> //是否用于分割(在图像物体识别中01无所谓)
<object> //检测到的物体
<name>horse</name> //物体类别
<pose>Right</pose> //拍摄角度
<truncated>0</truncated> //是否被截断(0表示完整)
<difficult>0</difficult> //目标是否难以识别(0表示容易识别)
<bndbox> //bounding-box(包含左下角和右上角xy坐标)
<xmin>100</xmin>
<ymin>96</ymin>
<xmax>355</xmax>
<ymax>324</ymax>
</bndbox>
</object>
<object> //检测到多个物体
<name>person</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>198</xmin>
<ymin>58</ymin>
<xmax>286</xmax>
<ymax>197</ymax>
</bndbox>
</object>
</annotation>
3、ImageSets
- Action // 人的动作
- Layout // 人体的具体部位
- Main // 图像物体识别的数据,总共20类, 需要保证train val没有交集
- train.txt
- val.txt
- trainval.txt
- Segmentation // 用于分割的数据
4、SegmentationObject【实例分割相关】

5、SegmentationClass【语义分割相关】

二、VOC可视化数据集
1、作用
在做目标检测时,首先要检查标注数据。一方面是要了解标注的情况,另一方面是检查数据集的标注和格式是否正确,只有正确的情况下才能进行下一步的训练。
2、代码实现
import os
# import sys
import cv2
import random
from tqdm import tqdm
# import numpy as np
import argparse
import xml.etree.ElementTree as ET
def xml_reader(filename):
""" Parse a PASCAL VOC xml file """
tree = ET.parse(filename)
objects = []
for obj in tree.findall('object'):
obj_struct = {}
obj_struct['name'] = obj.find('name').text
bbox = obj.find('bndbox')
obj_struct['bbox'] = [int(bbox.find('xmin').text),
int(bbox.find('ymin').text),
int(bbox.find('xmax').text),
int(bbox.find('ymax').text)]
objects.append(obj_struct)
return objects
def get_image_list(image_dir, suffix=['jpg', 'png']):
'''get all image path ends with suffix'''
if not os.path.exists(image_dir):
print("PATH:%s not exists" % image_dir)
return []
imglist = []
for root, sdirs, files in os.walk(image_dir):
if not files:
continue
for filename in files:
filepath = os.path.join(root, filename)
if filename.split('.')[-1] in suffix:
imglist.append(filepath)
return imglist
if __name__ == "__main__":
parser = argparse.ArgumentParser(description='check data')
parser.add_argument('--input', dest='input', help='The input dir of images', type=str)
parser.add_argument('--output', dest='output', default='temp', help='The output dir of images', type=str)
parser.add_argument('--num', dest='num', default=50, help='The number of images you want to check', type=int)
args = parser.parse_args()
if not os.path.exists(args.output):
os.makedirs(args.output)
img_list = get_image_list(args.input)
img_list = random.sample(img_list, args.num)
for img_path in tqdm(img_list):
img = cv2.imread(img_path)
if img is None or not img.any():
continue
xml_path = img_path.replace("JPEGImages", "Annotations").replace(".jpg", ".xml").replace(".png", ".xml")
objects = xml_reader(xml_path)
if len(objects) == 0:
continue
# draw box and name
for obj in objects:
name = obj['name']
box = obj['bbox']
p1 = (box[0], box[1])
p2 = (box[2], box[3])
p3 = (max(box[0], 15), max(box[1], 15))
cv2.rectangle(img, p1, p2, (0, 0, 255), 2)
cv2.putText(img, name, p3, cv2.FONT_ITALIC, 1, (0, 255, 0), 2)
img_name = os.path.basename(img_path)
cv2.imwrite(os.path.join(args.output, img_name), img)3、使用方法
python Visual_dataset.py --input VOCdevkit/JPEGImages --output ./Result_imgs --num 3408
python 上述代码的文件名称 --input 图片地址 --output 输出文件夹地址 --num 图片数量
4、常见报错
(python38) D:\pythontorch\VOC>python Visual_dataset.py --input VOCdevkit/ImageSets --output Result_imgs --num 3408
Traceback (most recent call last):
File "Visual_dataset.py", line 55, in <module>
img_list = random.sample(img_list, args.num)
File "C:\ProgramData\Anaconda3\envs\python38\lib\random.py", line 363, in sample
raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative原因 你的路径写错了
边栏推荐
- 软件测试面试题:统计在一个队列中的数字,有多少个正数,多少个负数,如[1, 3, 5, 7, 0, -1, -9, -4, -5, 8]
- 多点双向重发布及路由策略的简单应用
- ZJNU 22-07-26 比赛心得
- Common ways to keep requests idempotent
- 一个程序员的水平能差到什么程度?
- My approval of OA project (Query & meeting signature)
- 分享Redshift渲染器的去噪方法技巧,一定要看看
- 康佳半导体首款存储主控芯片量产出货,首批10万颗
- 【Map 集合】
- Office automation solution - docuware cloud is a complete solution to migrate applications and processes to the cloud
猜你喜欢

Simple application of multipoint bidirectional republication and routing strategy

Two years after its release, the price increased by $100, and the reverse growth of meta Quest 2

Oracle Xe installation and user operation

PyQt5快速开发与实战 4.7 QSpinBox(计数器) and 4.8 QSlider(滑动条)

站在巨人肩膀上学习,京东爆款架构师成长手册首发

Flask-MDict搭建在线Mdict词典服务

I'm also drunk. Eureka delayed registration and this pit
![[Alibaba security × ICDM 2022] 200000 bonus pool! The risk commodity inspection competition on the large-scale e-commerce map is in hot registration](/img/38/9fadea0d37053a3ebb73806a9963f1.jpg)
[Alibaba security × ICDM 2022] 200000 bonus pool! The risk commodity inspection competition on the large-scale e-commerce map is in hot registration

Add joint control to gltf model

JVM overview and memory management (to be continued)
随机推荐
Pyqt5 rapid development and practice 4.5 button controls and 4.6 qcombobox (drop-down list box)
Redis-基本了解,五大基本数据类型
Oracle simple advanced query
一看就懂的ESLint
Konka sold out its first 100000 storage master chips, with an estimated sales volume of 100million in 2020
Session attack
What are the apps of regular futures trading platforms in 2022, and are they safe?
leetcode:1498. 满足条件的子序列数目【排序 + 二分 + 幂次哈希表】
2022.07.11
JD: get the raw data API of commodity details
[rctf2015]easysql-1 | SQL injection
IE11 下载doc pdf等文件的方法
发布2年后涨价100美元,Meta Quest 2的逆生长
Get wechat product details API
Data warehouse construction - DWD floor
PyQt5快速开发与实战 4.5 按钮类控件 and 4.6 QComboBox(下拉列表框)
In 2019, the global semiconductor market revenue was $418.3 billion, a year-on-year decrease of 11.9%
It is said that Intel will stop the nervana chip manufactured by TSMC at 16nm
[Alibaba security × ICDM 2022] 200000 bonus pool! The risk commodity inspection competition on the large-scale e-commerce map is in hot registration
【效率】弃用 Notepad++,这款开源替代品更牛逼!