Cleaning an Annotation Folder
2022-07-26 08:52:00 【风吹落叶花飘荡】
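1. Preface
After AI-assisted annotation, many of the class labels often differ from the ones we actually need.

For example, some boxes never had a class selected, so they are annotated with class -1; these entries must be cleaned out before training.
Empty annotation files also need to be deleted.
For this purpose I wrote a simple program that cleans every txt file in a folder.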
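To make these rules concrete, here is a minimal sketch that applies them to a few in-memory lines. It assumes a YOLO-style layout where each line is "class x y w h"; the post only states that the class comes first and that unselected boxes carry -1, so the remaining fields are an assumption. `clean_label_lines` and the sample values are hypothetical, for illustration only.

```python
def clean_label_lines(lines, new_class):
    """Drop boxes labeled -1 and relabel every remaining box as new_class."""
    cleaned = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "-1":
            continue  # blank line, or a box without a selected class
        cleaned.append(" ".join([str(new_class)] + fields[1:]))
    return cleaned


# Made-up sample in the assumed "class x y w h" layout.
sample = [
    "0 0.50 0.50 0.20 0.30",
    "-1 0.10 0.10 0.05 0.05",
    "12 0.70 0.40 0.10 0.10",
]
result = clean_label_lines(sample, 3)
print(result)
```

If the returned list is empty, the file holds no usable boxes and is a candidate for deletion.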
2. The Cleaning Program
import os


def getFilePath(fileDo):
    '''Take a folder path and return the relative paths of all files in it.'''
    allFilePath = []
    for file in os.listdir(fileDo):
        allFilePath.append(os.path.join(fileDo, file))
    return allFilePath


def PreData(allFilePath, k):
    '''Walk the list of file paths and, for every txt file:
    1. drop the entries whose class is -1;
    2. change every other class to the given value k;
    3. delete the file if nothing is left in it.
    '''
    for FilePath in allFilePath:
        if not FilePath.endswith('.txt'):
            continue
        tem = []  # entries whose class is not -1
        with open(FilePath, 'r', encoding='utf-8') as f:
            for line in f:
                fields = line.split()
                if not fields:  # skip blank lines
                    continue
                if fields[0] == '-1':
                    print(line.rstrip())  # show the entry being dropped
                    continue
                # Replace the whole class field so multi-digit class ids also work
                fields[0] = str(k)
                tem.append(' '.join(fields) + '\n')
        if len(tem) == 0:
            # Nothing left to keep, so delete the file
            os.remove(FilePath)
            print('Deleted file ' + FilePath)
        else:
            # Otherwise write the kept entries back
            with open(FilePath, 'w', encoding='utf-8') as w:
                w.writelines(tem)


def main():
    # fileDo = './人类/人类-百度-标签数据'
    fileDo = input('Enter the relative path of the folder to clean: ')
    # Collect the relative paths of all files in the folder
    allFilePath = getFilePath(fileDo)
    print('All files in the folder:')
    print(allFilePath)
    # Normalize the folder: every remaining box gets class 3
    PreData(allFilePath, 3)


if __name__ == '__main__':
    main()
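
Because PreData deletes and overwrites files in place, it can help to preview the outcome first. The following is only a sketch of such a dry run: `preview_cleaning` is a hypothetical helper that is not part of the program above, and the demo files are made up.

```python
import tempfile
from pathlib import Path


def preview_cleaning(folder):
    """Return (files_to_delete, files_to_rewrite) without modifying anything."""
    to_delete, to_rewrite = [], []
    for path in sorted(Path(folder).glob("*.txt")):
        lines = path.read_text(encoding="utf-8").splitlines()
        # Keep non-blank lines whose first field is not -1.
        kept = [ln for ln in lines if ln.split() and ln.split()[0] != "-1"]
        if kept:
            to_rewrite.append(path.name)
        else:
            to_delete.append(path.name)
    return to_delete, to_rewrite


# Demonstration on a throwaway folder with made-up label files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.txt").write_text(
        "0 0.5 0.5 0.2 0.2\n-1 0.1 0.1 0.1 0.1\n", encoding="utf-8")
    Path(tmp, "b.txt").write_text("-1 0.3 0.3 0.1 0.1\n", encoding="utf-8")
    Path(tmp, "c.txt").write_text("", encoding="utf-8")
    demo_delete, demo_rewrite = preview_cleaning(tmp)

print("would delete:", demo_delete)
print("would rewrite:", demo_rewrite)
```

Running the preview on a copy of the label folder, or backing the folder up beforehand, keeps the original annotations recoverable.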