YOLO-format dataset processing (XML to txt)
2022-07-02 15:24:00 【Shallow thoughts 52】
Preface
A YOLO network expects its dataset labels as plain-text (txt) files. When we want to train a model, the data found online is often annotated in XML format, so we first need to process it into the format we want.
I. Data processing flow
1. Read the XML file and parse it to get the image width and height and the coordinates of each bounding box
2. Normalize the data
3. Write the result to a txt file
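Step 2 can be illustrated with concrete numbers. The following is a minimal sketch, assuming a made-up 500x375 image and a single made-up box; the function name `normalize_box` is hypothetical and is used here only for illustration. Corner coordinates in pixels become a center point plus width and height, each divided by the image size so that every value falls between 0 and 1.

```python
def normalize_box(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corners to normalized (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# Made-up example values: a 500x375 image with one box.
box = normalize_box(48, 240, 195, 371, img_w=500, img_h=375)
print("{:.5f} {:.5f} {:.5f} {:.5f}".format(*box))
# prints: 0.24300 0.81467 0.29400 0.34933
```

Because the values are relative to the image size, the labels stay valid if the image is later resized.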
II. XML file data format
The picture above shows part of an XML annotation file. We only need the width and height inside size and the coordinate information inside bndbox.
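To make the fields concrete, here is a small sketch that parses an annotation held in a string. The XML content is an illustrative VOC-style sample written for this example, not a file from a real dataset, and the variable names are assumptions.

```python
import xml.etree.ElementTree as ET

# Illustrative VOC-style annotation (made-up values).
SAMPLE_XML = """
<annotation>
    <size>
        <width>500</width>
        <height>375</height>
        <depth>3</depth>
    </size>
    <object>
        <name>dog</name>
        <bndbox>
            <xmin>48</xmin>
            <ymin>240</ymin>
            <xmax>195</xmax>
            <ymax>371</ymax>
        </bndbox>
    </object>
</annotation>
"""

root = ET.fromstring(SAMPLE_XML)

# Image size lives under <size>.
size = root.find("size")
img_w = int(size.find("width").text)
img_h = int(size.find("height").text)

# Each <object> holds a class name and a <bndbox> with pixel corners.
boxes = []
for obj in root.findall("object"):
    bnd = obj.find("bndbox")
    coords = tuple(
        float(bnd.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
    )
    boxes.append((obj.find("name").text, coords))

print(img_w, img_h)  # prints: 500 375
print(boxes)         # prints: [('dog', (48.0, 240.0, 195.0, 371.0))]
```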
III. Code
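import glob
import os
import xml.etree.ElementTree as ET

# Folder holding the VOC XML annotations and folder for the YOLO txt labels.
# These are the author's local paths; change them to your own.
xml_dir = r'E:\desktop\Information\cv4\Data sets\voc Data sets\Annotations'
txt_dir = r'E:\desktop\Information\cv4\Data sets\voc Data sets\labels1'

# The 20 VOC class names; the index in this list is the YOLO class id.
classes = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car',
           'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
           'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']


def convert(box, dw, dh):
    """Turn (xmin, ymin, xmax, ymax) in pixels into normalized (x, y, w, h)."""
    # Center point and size of the box in pixels
    x = (box[0] + box[2]) / 2.0
    y = (box[1] + box[3]) / 2.0
    w = box[2] - box[0]
    h = box[3] - box[1]
    # Normalize by the image width (dw) and height (dh)
    return x / dw, y / dh, w / dw, h / dh


def xml_to_txt(name_id):
    """Convert one annotation, <name_id>.xml, into <name_id>.txt."""
    xml_path = os.path.join(xml_dir, '%s.xml' % name_id)
    txt_path = os.path.join(txt_dir, '%s.txt' % name_id)

    root = ET.parse(xml_path).getroot()

    # Image size
    size = root.find('size')
    dw = int(size.find('width').text)
    dh = int(size.find('height').text)

    with open(txt_path, 'w') as txt_o:
        for obj in root.findall('object'):
            # Class id is the position of the name in the class list
            c = classes.index(obj.find('name').text)
            bnd = obj.find('bndbox')
            b = (float(bnd.find('xmin').text), float(bnd.find('ymin').text),
                 float(bnd.find('xmax').text), float(bnd.find('ymax').text))
            x, y, w, h = convert(b, dw, dh)
            # One line per object: class id, then the normalized box
            txt_o.write("{} {:.5f} {:.5f} {:.5f} {:.5f}\n".format(c, x, y, w, h))


# Make sure the output folder exists, then convert every XML file.
os.makedirs(txt_dir, exist_ok=True)
for xml_path in glob.glob(os.path.join(xml_dir, "*.xml")):
    name_id = os.path.splitext(os.path.basename(xml_path))[0]
    xml_to_txt(name_id)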
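One way to sanity-check a generated label file is to map a line back to pixel coordinates and compare it with the original XML. The sketch below is an optional extra, not part of the original script; the function name `yolo_line_to_pixels` and the sample line are assumptions made for illustration.

```python
def yolo_line_to_pixels(line, img_w, img_h):
    """Map 'class x y w h' (normalized) back to pixel corner coordinates."""
    class_id, x, y, w, h = line.split()
    # Scale the normalized center and size back to pixels
    x, w = float(x) * img_w, float(w) * img_w
    y, h = float(y) * img_h, float(h) * img_h
    # Recover the corners from the center and size
    xmin, xmax = x - w / 2.0, x + w / 2.0
    ymin, ymax = y - h / 2.0, y + h / 2.0
    return int(class_id), (xmin, ymin, xmax, ymax)


# Made-up label line for a 500x375 image (class 11 is 'dog' in the VOC list).
class_id, corners = yolo_line_to_pixels(
    "11 0.24300 0.81467 0.29400 0.34933", img_w=500, img_h=375
)
print(class_id, [round(v) for v in corners])
```

Since the labels are written with five decimal places, the recovered corners can differ from the original pixels by a small fraction, so compare them after rounding.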
Summary
That is the whole process of converting XML annotations to txt. If you run into problems while using it, leave a comment in the comments section.