[Object Detection] Running YOLOv5 on the VOC2007 Dataset (Fixed Version)
2022-07-25 16:40:00 【zstar-_】
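Preface
In my earlier post, [Object Detection] Running YOLOv5 on the VOC2007 Dataset, I wrote a script that extracted the pre-split lists from VOC's Segmentation folder. On closer inspection, though, that train.txt contains only 209 entries, while VOC2007 has 9963 images. In other words, most of the images were wasted and never reached the model during training.
This post therefore reworks the dataset-processing pipeline to fix that problem.
Dataset splitting
My approach is to split the data directly from the image files and ignore the ImageSets folder entirely.
Create split_data.py in the project root: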
import os
import random

img_path = 'D:/Dataset/VOC2007/images/'  # image folder (keep the trailing slash)
label_path = 'D:/Dataset/VOC2007/'  # where the split lists are written
# Keep only .jpg files so stray files in the folder are not treated as images
img_list = [name for name in os.listdir(img_path) if name.endswith('.jpg')]
train_ratio = 0.8  # training set ratio
val_ratio = 0.1  # validation set ratio
shuffle = True  # whether to shuffle before splitting


def data_split(full_list, train_ratio, val_ratio, shuffle=True):
    n_total = len(full_list)
    train_set_num = int(n_total * train_ratio)
    val_set_num = int(n_total * val_ratio)
    if shuffle:
        random.shuffle(full_list)  # shuffles the list in place
    train_set = full_list[:train_set_num]
    val_set = full_list[train_set_num:(train_set_num + val_set_num)]
    test_set = full_list[(train_set_num + val_set_num):]  # everything that remains
    return train_set, val_set, test_set


if __name__ == '__main__':
    train_set, val_set, test_set = data_split(img_list, train_ratio, val_ratio, shuffle=shuffle)
    # Each list file holds one image ID (file name without extension) per line
    with open(label_path + 'train.txt', 'w') as f:
        for img_name in train_set:
            f.write(os.path.splitext(img_name)[0] + '\n')
    with open(label_path + 'val.txt', 'w') as f:
        for img_name in val_set:
            f.write(os.path.splitext(img_name)[0] + '\n')
    with open(label_path + 'test.txt', 'w') as f:
        for img_name in test_set:
            f.write(os.path.splitext(img_name)[0] + '\n')
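Here I set the train/validation/test ratio to 8:1:1 and shuffle the images randomly before splitting. Change these settings if you need something different.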
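To make the 8:1:1 split concrete, here is a minimal sketch of the arithmetic `data_split` uses, applied to the 9963 images mentioned in the preface. The helper names `split_counts` and `seeded_split` and the `seed` parameter are my own additions for illustration and are not part of split_data.py. Seeding the shuffle is just one possible way to get the same split on every run.

```python
import random


def split_counts(n_total, train_ratio, val_ratio):
    """Return (train, val, test) sizes using the same truncation as data_split."""
    n_train = int(n_total * train_ratio)
    n_val = int(n_total * val_ratio)
    return n_train, n_val, n_total - n_train - n_val


def seeded_split(items, train_ratio, val_ratio, seed=0):
    """Shuffle a copy of items with a fixed seed, then slice it into three sets."""
    items = list(items)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(items)
    n_train, n_val, _ = split_counts(len(items), train_ratio, val_ratio)
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]


print(split_counts(9963, 0.8, 0.1))  # (7970, 996, 997)
```

Because the train and validation sizes are truncated with `int`, any leftover images end up in the test set, which is why it is one image larger than the validation set here.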
After running the script, train.txt, val.txt, and test.txt are generated in the dataset folder.

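Label conversion
As in [Object Detection] Running YOLOv5 on the VOC2007 Dataset, the same conversion script can be reused. The only difference is that the paths to the split lists have changed.
Create voc2yolo.py in the project root: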
import xml.etree.ElementTree as ET
import os

sets = ['train', 'test', 'val']
Imgpath = 'D:/Dataset/VOC2007/images/'
xmlfilepath = 'D:/Dataset/VOC2007/Annotations/'
ImageSets_path = 'D:/Dataset/VOC2007/'
Label_path = 'D:/Dataset/VOC2007/'
classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]


def convert(size, box):
    # size is (width, height); box is (xmin, xmax, ymin, ymax) in pixels.
    # Returns the normalized (x_center, y_center, width, height) used by YOLO.
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)


def convert_annotation(image_id):
    # Context managers make sure both files are closed after each image
    with open(xmlfilepath + '%s.xml' % (image_id)) as in_file, \
            open(Label_path + 'labels/%s.txt' % (image_id), 'w') as out_file:
        tree = ET.parse(in_file)
        root = tree.getroot()
        size = root.find('size')
        w = int(size.find('width').text)
        h = int(size.find('height').text)
        for obj in root.iter('object'):
            difficult = obj.find('difficult').text
            cls = obj.find('name').text
            # Skip unknown classes and objects marked as difficult
            if cls not in classes or int(difficult) == 1:
                continue
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
                 float(xmlbox.find('ymax').text))
            bb = convert((w, h), b)
            out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


for image_set in sets:
    if not os.path.exists(Label_path + 'labels/'):
        os.makedirs(Label_path + 'labels/')
    # Read the image IDs first: the same list file is overwritten below with
    # full image paths, so rerun split_data.py before running this script again.
    with open(ImageSets_path + '%s.txt' % (image_set)) as id_file:
        image_ids = id_file.read().strip().split()
    with open(Label_path + '%s.txt' % (image_set), 'w') as list_file:
        for image_id in image_ids:
            list_file.write(Imgpath + '%s.jpg\n' % (image_id))
            convert_annotation(image_id)
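After running it, a labels folder appears in the dataset folder, and train.txt, val.txt, and test.txt now contain full image paths instead of bare image IDs.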
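As a quick check of what the conversion produces, here is a worked example on a made-up box. The image size and coordinates are illustrative values, not taken from the dataset, and `voc_to_yolo` is a hypothetical name that mirrors the arithmetic of `convert` with the same (xmin, xmax, ymin, ymax) argument order.

```python
def voc_to_yolo(size, box):
    """Convert a VOC box (xmin, xmax, ymin, ymax) in pixels to YOLO
    (x_center, y_center, width, height), normalized by the image size."""
    img_w, img_h = size
    xmin, xmax, ymin, ymax = box
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# Illustrative values: a 500x375 image with a box from (100, 50) to (300, 200).
x, y, w, h = voc_to_yolo((500, 375), (100.0, 300.0, 50.0, 200.0))
print(round(x, 4), round(y, 4), round(w, 4), round(h, 4))  # 0.4 0.3333 0.4 0.4
```

Each line in a label file is then the class index followed by these four values, all of which should fall between 0 and 1.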

Training preparation
Create mydata.yaml in the data folder:
train: D:/Dataset/VOC2007/train.txt
val: D:/Dataset/VOC2007/val.txt
test: D:/Dataset/VOC2007/test.txt
# number of classes
nc: 20
# class names
names: [ 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor' ]
Change the data paths to point to your own locations.
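One thing worth checking when changing the paths: as far as I know, YOLOv5 finds each label file from the image path by swapping the last `/images/` directory for `/labels/` and the extension for `.txt`, which is why the images sit in `images/` and the converted labels in `labels/`. The sketch below reproduces that rule on plain forward-slash strings so the entries in train.txt can be checked by hand. `label_path_for` is a hypothetical helper written for this post, not YOLOv5's own function.

```python
def label_path_for(image_path):
    """Map .../images/NAME.jpg to .../labels/NAME.txt (forward-slash paths)."""
    head, sep, tail = image_path.rpartition('/images/')
    if not sep:
        raise ValueError("path has no '/images/' directory: " + image_path)
    stem = tail.rsplit('.', 1)[0]  # drop the file extension
    return head + '/labels/' + stem + '.txt'


print(label_path_for('D:/Dataset/VOC2007/images/000001.jpg'))
# D:/Dataset/VOC2007/labels/000001.txt
```

If the image folder is named something other than `images`, this mapping no longer applies, so it is safest to keep the folder names used above.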
The remaining steps are the same as in the earlier post.