Deep learning essentials: splitting a dataset, moving the labels to match the split images, and range-checking all annotated labels
2022-07-28 22:15:00 【The wind blows the fallen leaves and the flowers flutter】
1. Preface
I have recently been taking part in the iFLYTEK car competition. While working on the visual object detection task I suddenly needed these utilities, so I wrote a small demo to get the job done.
2. Source code
Features:
- Split the dataset
- Move the labels to match the split images
- Range-check all annotated labels
import os
import random
import shutil
def get_imlist(path):
    return [os.path.join(path, f) for f in os.listdir(path) if f.endswith('.jpg')]
def getData(src_path, k):
    # The validation set goes into a sibling directory named <src_path>val
    dest_dir = src_path + 'val'
    if not os.path.isdir(dest_dir):
        os.mkdir(dest_dir)
    img_list = get_imlist(src_path)
    random.shuffle(img_list)
    le = int(len(img_list) * k)  # k is the fraction of images that stays in src_path
    for f in img_list[le:]:
        shutil.move(f, dest_dir)
def SplitImg(filePath, k):
    '''Split the dataset: keep a fraction k in place and move the rest to the validation set.'''
    getData(filePath, k)
def MoveAn(filePathAn, filePathImg):
    '''Move the annotation files that belong to the images in filePathImg.'''
    if not os.path.isdir(filePathAn + 'val'):
        os.mkdir(filePathAn + 'val')
    Imgs = set(os.listdir(filePathImg))
    for file in os.listdir(filePathAn):
        # Match on the base name so the check does not depend on a 3-letter extension
        if os.path.splitext(file)[0] + '.jpg' in Imgs:
            shutil.move(os.path.join(filePathAn, file), os.path.join(filePathAn + 'val', file))
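def delReDisplayImg(filePath):
    '''Remove duplicate images.'''
    # Note: os.listdir() returns each file name once, so count() never exceeds 1
    # and this function, as written, removes nothing.
    Imgs = os.listdir(filePath)
    for img in Imgs:
        if Imgs.count(img) > 1:
            os.remove(os.path.join(filePath, img))
            print(os.path.join(filePath, img))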
def checkAn(filePath, lim):
    '''Check that every class id in the label files lies within [lim[0], lim[1]].'''
    dels = []  # label files that contain an out-of-range class id
    for An in os.listdir(filePath):
        path = os.path.join(filePath, An)
        with open(path, 'r', encoding='utf-8') as f:
            for line in f:
                if not line.strip():
                    continue  # skip blank lines
                nc = int(line.split()[0])  # the class id is the first field on each line
                if nc < lim[0] or nc > lim[1]:
                    print(nc, path)
                    if path not in dels:  # record each file once so it is not removed twice
                        dels.append(path)
    print('Delete the following invalid files? y/n')
    print(dels)
    op = input('Your choice: ')
    if op == 'y':
        for path in dels:
            os.remove(path)
if __name__ == '__main__':
    # Raw strings keep the Windows backslashes literal
    filePath = r'D:\AI\yolov5-master\xunfeidata\images\train'  # replace with your image directory
    k = 0.8  # split ratio
    #SplitImg(filePath, k)
    filePathAn = r'D:\AI\yolov5-master\xunfeidata\labels\train'  # replace with your label directory
    # Move the annotation files to match the split images
    MoveAn(filePathAn, filePath + 'val')
    checkAn(filePathAn, [0, 7])
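A note on delReDisplayImg: it compares file names returned by a single os.listdir call, and names inside one directory are always unique, so it never finds anything to delete. If the goal is to catch images that are byte-for-byte copies saved under different names, one possible approach is to compare content hashes. The sketch below is only an illustration under that assumption; the helper name find_duplicate_images and its exts parameter are hypothetical and not part of the original script. It only reports duplicates and leaves deletion to the caller.

```python
import hashlib
import os


def find_duplicate_images(folder, exts=('.jpg',)):
    """Return paths whose bytes match an earlier file in `folder` (hypothetical helper)."""
    seen = {}        # content hash -> first path seen with that content
    duplicates = []  # later files with identical content
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        # Skip sub-directories and files with other extensions
        if not os.path.isfile(path) or not name.lower().endswith(exts):
            continue
        with open(path, 'rb') as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        if digest in seen:
            duplicates.append(path)
        else:
            seen[digest] = path
    return duplicates
```

It is probably safest to print the returned list and review it before calling os.remove on each path, and to do this before splitting, so that the label files of removed images can be handled at the same time. Note that this only detects exact copies; re-encoded or resized versions of the same picture hash differently.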