Training mmdetection on your own dataset: exporting CVAT annotations to COCO format and related operations
2022-07-02 06:26:00 【chenf0】
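For the initial setup and the assorted problems encountered along the way, see: https://blog.csdn.net/chenfang0529/article/details/115094036
I. Export
The goal is to train mmdetection on your own dataset. The dataset is annotated with CVAT; the annotated source is a video file, and the image frames and annotations are exported in COCO format. PASCAL VOC is another commonly used format.
The export contains two folders:
images and annotations
images contains the image frames.
annotations contains the annotation files; only the third file needs to be modified.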
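Before changing anything, it can help to check what an exported annotation file actually contains. The following is a minimal sketch, not part of CVAT or mmdetection: the helper name `summarize_coco` and the sample dictionary are made up for illustration, and in practice the dictionary would come from `json.load` on the exported file.

```python
from collections import Counter


def summarize_coco(coco):
    """Return basic counts for a COCO-style dict (hypothetical helper)."""
    # Map category ids to readable names so the counts are easy to read.
    id_to_name = {c["id"]: c["name"] for c in coco.get("categories", [])}
    per_category = Counter(
        id_to_name.get(a["category_id"], "unknown")
        for a in coco.get("annotations", [])
    )
    return {
        "images": len(coco.get("images", [])),
        "annotations": len(coco.get("annotations", [])),
        "per_category": dict(per_category),
    }


# Made-up data standing in for an exported annotation file.
sample = {
    "images": [{"id": 1, "file_name": "1.jpg"}, {"id": 2, "file_name": "2.jpg"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1},
        {"id": 2, "image_id": 1, "category_id": 1},
        {"id": 3, "image_id": 2, "category_id": 2},
    ],
    "categories": [{"id": 1, "name": "car"}, {"id": 2, "name": "taxi"}],
}

summary = summarize_coco(sample)
print(summary)
```

The image and annotation counts printed here are the kind of numbers that later serve as id offsets when several exports are combined.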
II. Code
1. Batch-renaming the images
import os


class BatchRename():
    def rename(self):
        # Raw string so the backslashes in the Windows path are taken literally
        path = r"D:\achenf\data\taxi\test\task_2_9_car_test-2021_04_13_13_25_24-coco\images"
        filelist = sorted(os.listdir(path))  # sort so the order is reproducible
        total_num = len(filelist)
        start = 595  # number given to the first renamed image
        i = start
        for item in filelist:
            if item.endswith('.jpg'):
                src = os.path.join(os.path.abspath(path), item)
                dst = os.path.join(os.path.abspath(path), str(i) + '.jpg')  # adjust the naming pattern as needed
                # dst = os.path.join(os.path.abspath(path), '00000' + str(i) + '.jpg')  # e.g. a custom prefix
                try:
                    os.rename(src, dst)  # src: old name, dst: new name
                    i += 1
                except OSError:
                    continue
        # Report how many files were actually renamed, not the last index
        print('total %d files & renamed %d jpg' % (total_num, i - start))


if __name__ == '__main__':
    demo = BatchRename()
    demo.rename()
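Renaming files in place is hard to undo, so a dry run that only computes the old-to-new mapping can be useful first. The sketch below is an illustration under assumptions: the helper `plan_renames` and the `frame_N.jpg` style names are hypothetical and not taken from the export. It orders files by the last number in their name and touches nothing on disk.

```python
import re


def plan_renames(filenames, start=595, ext=".jpg"):
    """Map each image name to its new sequential name (hypothetical helper)."""

    def frame_key(name):
        # Sort by the last number in the name so frame_10 comes after frame_9.
        numbers = re.findall(r"\d+", name)
        return (int(numbers[-1]) if numbers else -1, name)

    images = sorted((f for f in filenames if f.endswith(ext)), key=frame_key)
    return {old: f"{start + offset}{ext}" for offset, old in enumerate(images)}


# Made-up file names; in practice this list would come from os.listdir(path).
plan = plan_renames(["frame_10.jpg", "frame_2.jpg", "notes.txt", "frame_9.jpg"])
for old, new in plan.items():
    print(old, "->", new)
```

Once the printed mapping looks right, each pair can be passed to `os.rename`.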
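2. Batch-editing the JSON files
The ids in the JSON must correspond to the renamed images.
The JSON we need contains five parts: info, categories, licenses, annotations, images.
Only the annotations and images parts need to be modified.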
import json
import os

# Raw string so the backslashes in the Windows path are taken literally
path = r'D:\achenf\data\taxi\train\task_2_8_car_test-2021_04_13_13_25_07-coco\annotations\test'
dirs = os.listdir(path)
num_flag = 0  # number of JSON files processed
for file in dirs:  # loop over the files in the folder
    if os.path.splitext(file)[1] == ".json":  # keep only JSON files
        num_flag = num_flag + 1
        print("path ===== ", file)
        print(os.path.join(path, file))
        with open(os.path.join(path, file), 'r') as load_f:
            load_dict = json.load(load_f)
        # print(load_dict)
        # print(type(load_dict))
        for i in load_dict['annotations']:
            i['image_id'] = i['image_id'] + 595  # offset the image reference
            i['id'] = i['id'] + 2032  # offset the annotation id
            # if i['image_id'] >= 595:
            #     i['id'] = i['id'] + 3015
        for i in load_dict['images']:
            i['id'] = i['id'] + 595
            i['file_name'] = str(i['id']) + ".jpg"
        # The file is overwritten in place, so running the script twice applies the offsets twice
        with open(os.path.join(path, file), 'w') as dump_f:
            json.dump(load_dict, dump_f)
if num_flag == 0:
    print('No JSON file found in the selected folder; please check the folder again')
else:
    print('{} JSON file(s) in total'.format(num_flag))
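After shifting the ids, a quick consistency check can catch mistakes before training. This is a minimal sketch: `check_coco_ids` is a hypothetical helper and the two dictionaries are made-up examples. It only checks that ids are unique and that every annotation refers to an existing image.

```python
def check_coco_ids(coco):
    """Return a list of problems found in a COCO-style dict (hypothetical helper)."""
    problems = []
    image_ids = [img["id"] for img in coco["images"]]
    ann_ids = [ann["id"] for ann in coco["annotations"]]
    if len(image_ids) != len(set(image_ids)):
        problems.append("duplicate image ids")
    if len(ann_ids) != len(set(ann_ids)):
        problems.append("duplicate annotation ids")
    known = set(image_ids)
    for ann in coco["annotations"]:
        if ann["image_id"] not in known:
            problems.append(
                "annotation %d points to missing image %d" % (ann["id"], ann["image_id"])
            )
    return problems


# Made-up examples: one consistent file, one where image_id was not shifted.
good = {
    "images": [{"id": 596, "file_name": "596.jpg"}],
    "annotations": [{"id": 2033, "image_id": 596, "category_id": 1}],
}
bad = {
    "images": [{"id": 596, "file_name": "596.jpg"}],
    "annotations": [{"id": 2033, "image_id": 1, "category_id": 1}],
}

print(check_coco_ids(good))
print(check_coco_ids(bad))
```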
Finally, merge the corresponding parts of the individual files.
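One possible reading of this merge step is to concatenate the images and annotations lists of the files whose ids have already been made unique. The sketch below takes that reading. `merge_coco` is a hypothetical helper, and it assumes every file uses the same categories, so info, licenses and categories are simply taken from the first file.

```python
def merge_coco(parts):
    """Merge COCO-style dicts whose ids are already unique (hypothetical helper)."""
    if not parts:
        raise ValueError("nothing to merge")
    first = parts[0]
    merged = {
        # Assumption: all parts share the same info, licenses and categories.
        "info": first.get("info", {}),
        "licenses": first.get("licenses", []),
        "categories": first.get("categories", []),
        "images": [],
        "annotations": [],
    }
    for part in parts:
        merged["images"].extend(part["images"])
        merged["annotations"].extend(part["annotations"])
    return merged


# Made-up parts standing in for two edited annotation files.
categories = [{"id": 1, "name": "car"}]
part_a = {
    "categories": categories,
    "images": [{"id": 1, "file_name": "1.jpg"}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 1}],
}
part_b = {
    "categories": categories,
    "images": [{"id": 596, "file_name": "596.jpg"}],
    "annotations": [{"id": 2033, "image_id": 596, "category_id": 1}],
}

merged = merge_coco([part_a, part_b])
print(len(merged["images"]), len(merged["annotations"]))
```

The merged dictionary can then be written out with `json.dump` as a single annotation file.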
III. Miscellaneous
1. Parsing the XML files to count the annotations
import os
import xml.dom.minidom

res = 0  # running total of annotations over all files
AnnoPath = r'./file_xml/0512/'
Annolist = os.listdir(AnnoPath)
for annotation in Annolist:
    filename = os.path.join(AnnoPath, annotation)
    dom = xml.dom.minidom.parse(filename)  # open the XML file
    collection = dom.documentElement  # get the root element
    objectlist = collection.getElementsByTagName('box')  # all <box> elements
    count = objectlist.length
    res = res + count
    print("File:", filename, "annotations:", count)
print("Total annotations:", res)
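The same count can be broken down per label. This is a sketch under assumptions: it uses the standard library's `xml.etree.ElementTree` instead of `minidom`, `count_boxes_by_label` is a hypothetical helper, and it assumes each `box` element carries a `label` attribute. The sample XML is made up and may not match your export; boxes without that attribute are counted as "unlabeled".

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_boxes_by_label(xml_text):
    """Count <box> elements per label attribute (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Assumption: the label is stored on the box itself; fall back otherwise.
    return Counter(box.get("label", "unlabeled") for box in root.iter("box"))


# Made-up XML for illustration only.
sample_xml = """<annotations>
  <image id="0" name="595.jpg">
    <box label="car" xtl="1" ytl="2" xbr="3" ybr="4"/>
    <box label="taxi" xtl="5" ytl="6" xbr="7" ybr="8"/>
  </image>
  <image id="1" name="596.jpg">
    <box label="car" xtl="1" ytl="2" xbr="3" ybr="4"/>
  </image>
</annotations>"""

counts = count_boxes_by_label(sample_xml)
print(dict(counts))
```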