How to build a dataset, train it into a model, and deploy it with YOLOv5
2022-06-12 04:21:00 【The wind blows the fallen leaves and the flowers flutter】
A typical vision AI development workflow is: collect and organize images, annotate the objects of interest, train a model, and deploy it to the cloud or to an edge device.
I. Collecting images
1. Download an existing dataset
If the goal is learning, or a widely applicable model that demands high robustness, you can use public datasets.
A Zhihu roundup: https://zhuanlan.zhihu.com/p/25138563
Of course, this covers only part of the public datasets available; you can keep searching for more.
Other sites that collect datasets:
1. DataFountain
https://www.datafountain.cn/datasets
2. Dataju
http://dataju.cn/Dataju/web/searchDataset
3. Chinese NLP dataset search (CLUE)
https://www.cluebenchmarks.com/dataSet_search.html
4. Alibaba Cloud Tianchi
https://tianchi.aliyun.com/dataset/?spm=5176.12282016.J_9711814210.24.2c656d92n0Us6s
5. Google Dataset Search (may require a proxy to access from mainland China)
2. Use your own photos / images collected from the web
Photos you take yourself need no extra processing; use them directly.
The following code downloads images with a crawler:
import os
import re
import time
import urllib.parse
import urllib.request
from bs4 import BeautifulSoup

header = {
    'User-Agent':
    'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 UBrowser/6.1.2107.204 Safari/537.36'
}
# Bing image-search async endpoint: {0}=keyword, {1}=index of first result,
# {2}=results per page, {3}=page counter
url = "https://cn.bing.com/images/async?q={0}&first={1}&count={2}&scenario=ImageBasicHover&datsrc=N_I&layout=ColumnBased&mmasync=1&dgState=c*9_y*2226s2180s2072s2043s2292s2295s2079s2203s2094_i*71_w*198&IG=0D6AD6CBAF43430EA716510A4754C951&SFX={3}&iid=images.5599"

def getImage(img_url, count):
    '''Save the full-size image at img_url to disk.'''
    try:
        time.sleep(0.5)
        urllib.request.urlretrieve(img_url, './imgs/hat/' + str(count + 1) + '.jpg')
    except Exception:
        time.sleep(1)
        print("Failed to fetch this image, skipping...")
    else:
        print("Saved image " + str(count + 1))

def findImgUrlFromHtml(html, rule, count):
    '''Find the full-size image urls in a thumbnail list page, download them,
    and return the running image count.'''
    soup = BeautifulSoup(html, "lxml")
    link_list = soup.find_all("a", class_="iusc")
    for link in link_list:
        result = re.search(rule, str(link))
        if result is None:
            continue
        # Drop the HTML entity remnant "amp;" ...
        img_url = result.group(0).replace("amp;", "")
        # ... and the leading '"murl":"' (8 characters) to get the bare url
        img_url = img_url[8:]
        getImage(img_url, count)
        count += 1
    # One page done; the caller loads the next page
    return count

def getStartHtml(url, key, first, loadNum, sfx):
    '''Fetch one thumbnail list page.'''
    page = urllib.request.Request(url.format(key, first, loadNum, sfx),
                                  headers=header)
    return urllib.request.urlopen(page)

if __name__ == '__main__':
    name = "戴帽子"        # image-search keyword ("wearing a hat")
    path = './imgs/hat'    # image save directory
    countNum = 2000        # number of images to crawl
    key = urllib.parse.quote(name)
    first = 1
    loadNum = 35
    sfx = 1
    count = 0
    # Matches the "murl" field, which holds the full-size image url
    rule = re.compile(r"\"murl\"\:\"http\S[^\"]+")
    if not os.path.exists(path):
        os.makedirs(path)
    while count < countNum:
        html = getStartHtml(url, key, first, loadNum, sfx)
        count = findImgUrlFromHtml(html, rule, count)
        first = count + 1
        sfx += 1
II. Annotating images
1. The online annotation site MAKE SENSE
MAKE SENSE
make-sense is an image annotation tool officially recommended by YOLOv5.
Compared with other tools, make-sense has a very low barrier to entry: within a few minutes you can master the options in its workbench and get to work. Moreover, because make-sense is a web application, users on any operating system can collaborate on the same project.
a. Create the label file
Create a new file named labels and enter the class names, one label per line.
An example is as follows:
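For example, a labels file for a two-class helmet-detection task (these class names are only an illustration) would contain:

```text
hat
person
```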
b. Using the MAKE SENSE annotation site
Open the site
Click to add images
Select all the collected images and confirm
Pick the task that matches your annotation needs; here click Object Detection
Click Load labels from file to import the labels in batch from a file
After adding the file, click Create labels list
Finally, click Start project to begin annotating

Annotate each image in turn
Export the annotation results
Choose an export format and export

The exported archive looks like this:
With the images and labels both ready, you can start building the dataset.
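If you exported in the YOLO format, each image gets a same-named .txt file with one line per object: a class index followed by the box center, width, and height, all normalized to [0, 1]. The numbers below are purely illustrative:

```text
# <class_id> <x_center> <y_center> <width> <height>
0 0.512 0.430 0.236 0.418
```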
III. Making the dataset
1. Create the folders
Create a folder named mydata
Its internal structure is as follows:
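For reference, a typical mydata layout used in YOLOv5 custom-dataset tutorials looks like this (the train/test subfolder names are an assumption; they just need to match the paths your mydata.yaml references):

```text
mydata/
├── images/
│   ├── train/   # training images (*.jpg)
│   └── test/    # test images (*.jpg)
└── labels/
    ├── train/   # YOLO-format *.txt files matching images/train
    └── test/    # YOLO-format *.txt files matching images/test
```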

2、 Copy in the previous pictures and marked data
test And train The general scale of the set is 2:8 or 3:7
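The split itself can be scripted. Below is a minimal sketch that shuffles the files and copies 80% into a train folder and 20% into a test folder; the ./imgs/hat input and ./mydata/images output paths are assumptions based on the steps above, so adjust them to your layout.

```python
import os
import random
import shutil

def split_dataset(files, train_ratio=0.8, seed=0):
    """Shuffle file names deterministically and split them into train/test lists."""
    files = sorted(files)
    random.Random(seed).shuffle(files)
    n_train = int(len(files) * train_ratio)
    return files[:n_train], files[n_train:]

def copy_split(src_dir, dst_root, train_ratio=0.8):
    """Copy images from src_dir into dst_root/train and dst_root/test."""
    files = [f for f in os.listdir(src_dir) if f.lower().endswith('.jpg')]
    train, test = split_dataset(files, train_ratio)
    for subset, names in (('train', train), ('test', test)):
        out = os.path.join(dst_root, subset)
        os.makedirs(out, exist_ok=True)
        for name in names:
            shutil.copy(os.path.join(src_dir, name), out)

# Usage (paths are assumptions from the steps above):
# copy_split('./imgs/hat', './mydata/images')
```

The fixed seed makes the split reproducible, so re-running the script does not shuffle images between train and test.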

3. Create a new mydata.yaml file.
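A minimal mydata.yaml sketch, assuming the folder layout above and a single hat class (the paths and class names are placeholders; adapt them to your data):

```yaml
train: ./mydata/images/train   # training images
val: ./mydata/images/test      # validation images
nc: 1                          # number of classes
names: ['hat']                 # class names, in label-index order
```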

4. Modify the data parameter in train.py to point to mydata.yaml.
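Alternatively, you can pass the yaml on the command line instead of editing train.py; a typical YOLOv5 invocation (the image size, batch size, and epoch count here are just example values) looks like:

```shell
python train.py --img 640 --batch 16 --epochs 100 \
    --data mydata.yaml --weights yolov5s.pt
```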


