Visualizing a YOLOv5-format dataset (labelme JSON files)
2022-07-03 01:26:00 【athrunsunny】
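In my own projects I often run into the problem of having too little data. Some annotated datasets are available online, but their annotations differ to some degree from what my project requires. I don't want to re-annotate from scratch, only to fine-tune the existing labels, yet YOLOv5's native format is not intuitive to edit. In that case the YOLOv5-format data can be converted to labelme's JSON format. This makes it easy to adjust the annotations without labelling a large dataset by hand, which reduces manual labour cost.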
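To see why the native format is hard to edit by eye, it helps to work through one label line. The sketch below is illustrative only: `yolo_to_corners` is a hypothetical helper, not part of the script, and it assumes the usual YOLOv5 line layout of `class x_center y_center width height` with values normalised to the image size. Unlike the script's `reconvert_np`, it applies no one-pixel offset.

```python
def yolo_to_corners(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x1, y1, x2, y2) in pixels.

    Hypothetical helper for illustration; assumes normalised
    "class x_center y_center width height" values.
    """
    parts = line.split()
    class_id = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])
    # Scale the normalised centre/size to pixels and move to the two corners.
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return class_id, x1, y1, x2, y2


# A box centred in a 640x480 image, a quarter of its width and half its height.
print(yolo_to_corners("0 0.5 0.5 0.25 0.5", 640, 480))
# → (0, 240.0, 120.0, 400.0, 360.0)
```

The two corner points are what labelme stores for a rectangle shape, which is why the JSON form is easier to adjust in a GUI.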
# -*- coding: utf-8 -*-
"""
Time: 2021.10.26
Author: Athrunsunny
Version: V 0.1
File: yolotolabelme.py
Describe: Functions in this file convert a YOLOv5-format dataset to labelme JSON files
"""
import base64
import io
import os
import numpy as np
import json
from glob import glob
import cv2
import shutil
import yaml
from tqdm import tqdm
import PIL.Image
ROOT_DIR = os.getcwd()
VERSION = '4.5.7'  # change this to match your labelme version


def img_arr_to_b64(img_arr):
    img_pil = PIL.Image.fromarray(img_arr)
    f = io.BytesIO()
    img_pil.save(f, format="PNG")
    img_bin = f.getvalue()
    if hasattr(base64, "encodebytes"):
        img_b64 = base64.encodebytes(img_bin)
    else:
        img_b64 = base64.encodestring(img_bin)
    return img_b64


def process_point(points, cls):
    # Build one labelme shape dict per row: [class_id, x1, y1, x2, y2].
    info = list()
    for point in points:
        shape_info = dict()
        shape_info['label'] = cls[int(point[0])]
        shape_info['points'] = [[point[1], point[2]],
                                [point[3], point[4]]]
        shape_info['group_id'] = None
        shape_info['shape_type'] = 'rectangle'
        shape_info['flags'] = dict()
        info.append(shape_info)
    return info


def create_json(img, imagePath, filename, info):
    # Write one labelme JSON file with the image embedded as base64.
    data = dict()
    data['version'] = VERSION
    data['flags'] = dict()
    data['shapes'] = info
    data['imagePath'] = imagePath
    height, width = img.shape[:2]
    data['imageData'] = img_arr_to_b64(img).decode('utf-8')
    data['imageHeight'] = height
    data['imageWidth'] = width
    jsondata = json.dumps(data, indent=4, separators=(',', ': '))
    with open(filename, 'w', encoding='utf-8') as f:
        f.write(jsondata)
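

def read_txt(path):
    # Each non-empty line is "class x_center y_center width height".
    assert os.path.exists(path), path
    with open(path, mode='r', encoding="utf-8") as f:
        content = f.readlines()
    res = []
    for item in content:
        string = item.split()
        if string:  # skip blank lines
            res.append(list(map(np.float64, string)))
    # An empty label file yields a (0, 5) array so later slicing still works.
    return np.array(res, dtype=np.float64) if res else np.zeros((0, 5))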


def load_dataset_info(path=ROOT_DIR):
    # Use the first *.yaml file in the dataset root (portable path join).
    yamlpath = glob(os.path.join(path, "*.yaml"))[0]
    with open(yamlpath, "r", encoding="utf-8") as f:
        data = yaml.load(f, Loader=yaml.FullLoader)
    return data
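

def reconvert_list(size, box):
    # size = (width, height); box = (x_center, y_center, w, h), normalised.
    # The +1 inverts a converter that computes the centre as (x1 + x2) / 2 - 1;
    # remove it if your labels were produced without that offset.
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = box[0] / dw
    w = box[2] / dw
    y = box[1] / dh
    h = box[3] / dh
    x1 = ((x + 1) * 2 - w) / 2.
    y1 = ((y + 1) * 2 - h) / 2.
    x2 = ((x + 1) * 2 + w) / 2.
    y2 = ((y + 1) * 2 + h) / 2.
    return x1, y1, x2, y2


def reconvert_np(size, box):
    # Vectorised version of reconvert_list; box is an (N, 4) array that is
    # overwritten in place with (x1, y1, x2, y2) in pixels.
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = box[:, :1] / dw
    w = box[:, 2:3] / dw
    y = box[:, 1:2] / dh
    h = box[:, 3:4] / dh
    box[:, :1] = ((x + 1) * 2 - w) / 2.
    box[:, 2:3] = ((x + 1) * 2 + w) / 2.
    box[:, 1:2] = ((y + 1) * 2 - h) / 2.
    box[:, 3:4] = ((y + 1) * 2 + h) / 2.
    return box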


def txt2json(proctype, cls, path=ROOT_DIR):
    process_image_path = os.path.join(path, proctype, 'images')
    process_label_path = os.path.join(path, proctype, 'labels')
    externs = ['png', 'jpg', 'JPEG', 'BMP', 'bmp']
    imgfiles = list()
    for extern in externs:
        imgfiles.extend(glob(os.path.join(process_image_path, "*." + extern)))
    # Case-insensitive file systems can match the same file twice.
    imgfiles = sorted(set(imgfiles))
    createfile = os.path.join(path, 'createjson', proctype)
    os.makedirs(createfile, exist_ok=True)
    for image_path in tqdm(imgfiles):
        frame = cv2.imread(image_path)
        if frame is None:  # unreadable image, skip it
            continue
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        height, width = frame.shape[:2]
        size = (width, height)
        imgfilename = os.path.basename(image_path)
        imgname = os.path.splitext(imgfilename)[0]
        jsonpath = os.path.join(createfile, imgname + '.json')
        txtpath = os.path.join(process_label_path, imgname + '.txt')
        label_and_point = read_txt(txtpath)
        label_and_point[:, 1:] = reconvert_np(size, label_and_point[:, 1:])
        info = process_point(label_and_point, cls)
        # imagePath is the image file name including its extension.
        create_json(frame, imgfilename, jsonpath, info)
        shutil.copy(image_path, createfile)


def yolotolabelme(path=ROOT_DIR):
    pathtype = list()
    if 'train' in os.listdir(path):
        pathtype.append('train')
    if 'valid' in os.listdir(path):
        pathtype.append('valid')
    if 'test' in os.listdir(path):
        pathtype.append('test')
    # Pass path through so a non-default dataset root is honoured.
    cls = load_dataset_info(path)['names']
    for file_type in pathtype:
        print("Processing image type {} \n".format(file_type))
        txt2json(file_type, cls, path)


if __name__ == "__main__":
    yolotolabelme()
Save the script above as yolotolabelme.py and place it in the root directory of the dataset.
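The script reads a `*.yaml` file from the dataset root and looks for `train`, `valid` and `test` folders, each with `images` and `labels` subfolders. Before running it, a quick check along the following lines can confirm that the layout matches. This is only a sketch: `describe_layout` is a hypothetical helper and not part of the original script.

```python
import os
from glob import glob


def describe_layout(root):
    """Report which parts of the layout the script expects exist under root."""
    # The script loads class names from the first *.yaml in the root.
    report = {"yaml": bool(glob(os.path.join(root, "*.yaml")))}
    for split in ("train", "valid", "test"):
        images = os.path.isdir(os.path.join(root, split, "images"))
        labels = os.path.isdir(os.path.join(root, split, "labels"))
        report[split] = images and labels
    return report


if __name__ == "__main__":
    print(describe_layout(os.getcwd()))
```

A split reported as `False` is either missing or lacks one of its two subfolders.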
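Before running the program, install the libraries imported in the code above (numpy, opencv-python, PyYAML, tqdm and Pillow), then run it.
After it runs, a createjson folder is generated under that path.
The converted data is written to createjson under train or valid, matching the source split, and can then be opened with labelme.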
My test set is empty, so its converted output is empty too. Opening the files under the train path with labelme shows the corresponding annotations.
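The generated files can also be checked without opening the labelme GUI. The sketch below loads one JSON file and lists its shapes. `summarize_labelme_json` is a hypothetical helper, and it assumes only the keys the script itself writes (`imagePath`, `shapes`, `label`, `points`).

```python
import json


def summarize_labelme_json(json_path):
    """Return (imagePath, [(label, points), ...]) for one labelme JSON file."""
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    # Each shape holds a class label and two corner points of a rectangle.
    shapes = [(s["label"], s["points"]) for s in data["shapes"]]
    return data["imagePath"], shapes
```

For example, calling it on a file under `createjson/train` should return the image file name together with one `(label, points)` pair per bounding box.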