Deep learning essentials: splitting a dataset, moving the labels to match the split images, and range-checking all annotated labels
2022-07-28 22:15:00 【The wind blows the fallen leaves and the flowers flutter】
1. Preface
I have recently been taking part in the iFLYTEK car competition. While working on the visual object detection task I suddenly needed these utilities, so I wrote a small demo to get the job done.
2. Source code
Features:
- Split the dataset into training and validation images
- Move the label files to match the split images
- Check that the class id of every annotated label falls within the allowed range
import os
import random
import shutil
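def get_imlist(path):
    # Return the full paths of all .jpg files directly inside `path`.
    return [os.path.join(path, f) for f in os.listdir(path) if f.endswith('.jpg')]


def getData(src_path, k):
    # The validation images go to a sibling folder named <src_path>val.
    dest_dir = src_path + 'val'
    if not os.path.isdir(dest_dir):
        os.mkdir(dest_dir)
    img_list = get_imlist(src_path)
    random.shuffle(img_list)
    # k is the fraction kept for training; change it to adjust the split ratio.
    le = int(len(img_list) * k)
    # Everything after the first `le` shuffled images moves to the validation folder.
    for f in img_list[le:]:
        shutil.move(f, dest_dir)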


def SplitImg(filePath, k):
    '''Split the dataset: move a fraction 1 - k of the images into the validation folder.'''
    getData(filePath, k)


def MoveAn(filePathAn, filePathImg):
    '''Move the label files that belong to the images in filePathImg into <filePathAn>val.'''
    if not os.path.isdir(filePathAn + 'val'):
        os.mkdir(filePathAn + 'val')
    Imgs = set(os.listdir(filePathImg))
    for file in os.listdir(filePathAn):
        # A label matches an image when both share the same base name.
        if os.path.splitext(file)[0] + '.jpg' in Imgs:
            shutil.move(os.path.join(filePathAn, file), os.path.join(filePathAn + 'val', file))
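

def delReDisplayImg(filePath):
    '''Remove duplicate images.'''
    # Note: os.listdir() never lists the same file name twice within one folder,
    # so the count below is always 1 and nothing is deleted. Finding real
    # duplicates requires comparing file contents instead of names.
    Imgs = os.listdir(filePath)
    for img in Imgs:
        if Imgs.count(img) > 1:
            os.remove(os.path.join(filePath, img))
            print(os.path.join(filePath, img))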


def checkAn(filePath, lim):
    '''Check that every class id in the label files lies within [lim[0], lim[1]].'''
    Ans = os.listdir(filePath)
    dels = []  # Label files that contain an out-of-range class id
    for An in Ans:
        path = os.path.join(filePath, An)
        with open(path, 'r', encoding='utf-8') as f:
            for line in f:
                fields = line.split()
                if not fields:
                    continue  # Skip blank lines
                nc = int(fields[0])  # The class id is read from the start of each line
                if nc < lim[0] or nc > lim[1]:
                    print(nc, path)
                    dels.append(path)
                    break  # One bad line is enough; avoids listing the file twice
    print('Delete the following invalid files? y/n')
    print(dels)
    op = input('Your choice: ')
    if op == 'y':
        for path in dels:
            os.remove(path)


if __name__ == '__main__':
    # Replace with the path to your image folder. Raw strings keep the backslashes literal.
    filePath = r'D:\AI\yolov5-master\xunfeidata\images\train'
    # Split ratio: fraction of images kept for training
    k = 0.8
    #SplitImg(filePath, k)
    # Replace with the path to your label folder
    filePathAn = r'D:\AI\yolov5-master\xunfeidata\labels\train'
    # Move the label files that match the images in the validation folder
    MoveAn(filePathAn, filePath + 'val')
    checkAn(filePathAn, [0, 7])
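
The delReDisplayImg function above compares file names, and names inside a single folder are always unique, so it cannot find anything. One possible alternative is to compare file contents. The following is only a rough sketch under that assumption: find_duplicate_images is a hypothetical helper that is not part of the original script, and it only catches byte-identical copies, not resized or re-encoded versions of the same picture.

```python
import hashlib
import os


def find_duplicate_images(folder, exts=('.jpg',)):
    """Return paths of files whose bytes match an earlier file in `folder`.

    Hypothetical helper, not part of the original script.
    """
    seen = {}        # content hash -> first path seen with that content
    duplicates = []  # later files whose content was already seen
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        # Skip subfolders and files with other extensions
        if not os.path.isfile(path) or not name.lower().endswith(exts):
            continue
        with open(path, 'rb') as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        if digest in seen:
            duplicates.append(path)
        else:
            seen[digest] = path
    return duplicates
```

The function only reports the duplicates. It is probably safer to review the returned list before calling os.remove on each path, and to delete the matching label file at the same time so images and labels stay in sync.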