Custom FacePose implementation based on YOLOv5: YOLO structure interpretation, YOLO data format conversion, YOLO process modification
2022-08-05 03:26:00 【Burnt Bay】
Introduction: This post records the process of implementing a custom dataset and custom detection task on top of YOLOv5. Starting from the data format of the original project, it pays attention to every detail and then redoes the custom task in the same format, migrating the project to a new task independently.
wandb: visualizing the training process
tensorboard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
hyperparameters: lr0=0.01, lrf=0.2, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, kpt=0.1, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0
loggers['wandb'] = wandb_logger.wandb  # in train.py: visualize training with Weights & Biases; an account needs to be created
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 1
wandb: You chose 'Create a W&B account'
wandb: Create an account here: https://wandb.ai/authorize?signup=true
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit:
You need to register on the wandb website. Signing up through a GitHub account is sufficient; afterwards you obtain an API key.
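The interactive prompt above can usually be avoided by configuring wandb through environment variables before training starts. The following is a minimal sketch, assuming the standard `WANDB_API_KEY` and `WANDB_MODE` variables read by the wandb client; the helper name `configure_wandb` is hypothetical and not part of the project.

```python
import os


def configure_wandb(api_key=None):
    """Set wandb environment variables before wandb is initialized.

    Hypothetical helper: with a key, runs are synced online;
    without one, runs are only logged locally (offline mode).
    """
    if api_key:
        os.environ["WANDB_API_KEY"] = api_key
        os.environ["WANDB_MODE"] = "online"
    else:
        os.environ["WANDB_MODE"] = "offline"
    return os.environ["WANDB_MODE"]


# No key available: keep everything local.
mode = configure_wandb()
```

This has to run before the training script initializes wandb, otherwise the prompt has already been shown.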
Model parsing
This section covers the anchor settings and the output of the detection head.
def parse_model(d, ch):  # model_dict, input_channels(3)
    logger.info('\n%3s%18s%3s%10s %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))
    anchors, nc, nkpt, gd, gw = d['anchors'], d['nc'], d['nkpt'], d['depth_multiple'], d['width_multiple']
    # Anchors, here: [[19, 27, 44, 40, 38, 94], [96, 68, 86, 152, 180, 137], [140, 301, 303, 264, 238, 542], [436, 615, 739, 380, 925, 792]]
    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors per layer, na = 3
    # Keypoint extension from the paper: 3 × (1 + 5 + 2×17) = 3 × 40
    no = na * (nc + 5 + 2*nkpt)  # number of outputs = anchors * (classes + 5 + 2 * keypoints)
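To make the arithmetic concrete, here is a small standalone sketch that repeats the two formulas from the excerpt outside the model code. The function name `head_outputs` is made up for illustration; only the formulas for `na` and `no` come from the excerpt above.

```python
def head_outputs(anchors, nc, nkpt):
    """Return (na, no) using the same formulas as the parse_model excerpt."""
    na = len(anchors[0]) // 2          # each anchor is a (w, h) pair
    per_anchor = nc + 5 + 2 * nkpt     # classes + (box 4 + objectness 1) + (x, y) per keypoint
    return na, na * per_anchor


ANCHORS = [[19, 27, 44, 40, 38, 94], [96, 68, 86, 152, 180, 137],
           [140, 301, 303, 264, 238, 542], [436, 615, 739, 380, 925, 792]]

na_coco, no_coco = head_outputs(ANCHORS, nc=1, nkpt=17)   # COCO person pose
na_face, no_face = head_outputs(ANCHORS, nc=1, nkpt=68)   # 68 facial landmarks
```

With 17 keypoints this gives 3 × 40 = 120 outputs per grid cell, and with 68 keypoints 3 × 142 = 426, which is why changing `nkpt` changes the size of the detection head.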
Relationship between optimizer parameters and batch size
# Optimizer
nbs = 64 # nominal batch size
accumulate = max(round(nbs / total_batch_size), 1) # accumulate loss before optimizing
# weight_decay does not need to be changed by hand when the batch size changes: it is rescaled here, and the loss is accumulated over several batches before each optimizer step
hyp['weight_decay'] *= total_batch_size * accumulate / nbs  # scale weight_decay
logger.info(f"Scaled weight_decay = {hyp['weight_decay']}")
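The effect of this scaling is easier to see with numbers. The sketch below wraps the same two formulas in a hypothetical helper `scaled_weight_decay` and evaluates them for a few batch sizes, using the base value weight_decay=0.0005 from the hyperparameter log above.

```python
def scaled_weight_decay(total_batch_size, weight_decay=0.0005, nbs=64):
    """Return (accumulate, scaled weight_decay) as computed in the excerpt."""
    accumulate = max(round(nbs / total_batch_size), 1)  # batches accumulated per optimizer step
    return accumulate, weight_decay * total_batch_size * accumulate / nbs


acc16, wd16 = scaled_weight_decay(16)     # small batch: accumulate 4 steps, decay unchanged
acc24, wd24 = scaled_weight_decay(24)     # 64/24 rounds to 3, effective batch 72
acc128, wd128 = scaled_weight_decay(128)  # large batch: no accumulation, decay doubled
```

As long as the batch size divides 64, the effective batch is 64 and the decay stays at its nominal value; only larger or non-dividing batch sizes change it.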
Image augmentation
# class LoadImagesAndLabels(Dataset): # for training/testing
...
# Mosaic augmentation
self.mosaic = self.augment and not self.rect  # load 4 images at a time into a mosaic (only during training)
self.mosaic_border = [-img_size // 2, -img_size // 2]
self.stride = stride
self.path = path
self.kpt_label = kpt_label
# Added for keypoints: after a horizontal flip, left and right keypoints swap indices
self.flip_index = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
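The role of `flip_index` can be illustrated with a short sketch. This is not the project's augmentation code; it is an assumed minimal version that mirrors normalized keypoints horizontally and then reorders them so that, for example, the left eye (index 1) and the right eye (index 2) trade places. The function name `flip_keypoints_lr` is hypothetical.

```python
import numpy as np

FLIP_INDEX = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]


def flip_keypoints_lr(kpts, flip_index=FLIP_INDEX):
    """Horizontally flip normalized (x, y) keypoints of shape (K, 2).

    Unlabeled points, stored as (0, 0), are left at (0, 0).
    """
    kpts = np.asarray(kpts, dtype=np.float64)
    flipped = kpts.copy()
    labeled = (kpts != 0).any(axis=1)
    flipped[labeled, 0] = 1.0 - kpts[labeled, 0]  # mirror x around the image centre
    return flipped[flip_index]                    # swap left/right keypoint slots


example = np.zeros((17, 2))
example[0] = [0.50, 0.30]   # nose
example[1] = [0.70, 0.25]   # left eye
example[2] = [0.45, 0.25]   # right eye
flipped_example = flip_keypoints_lr(example)
```

For a dataset with a different keypoint layout, such as 68 facial landmarks, the index list has to be replaced by the mirror mapping of that layout.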
Converting between COCO and YOLO formats
Original COCO format
${POSE_ROOT}
|-- data
`-- |-- coco
`-- |-- annotations
| |-- person_keypoints_train2017.json
| `-- person_keypoints_val2017.json
|-- person_detection_results
| |-- COCO_val2017_detections_AP_H_56_person.json
`-- images
|-- train2017
| |-- 000000000009.jpg
| |-- 000000000025.jpg
| |-- 000000000030.jpg
| |-- ...
`-- val2017
|-- 000000000139.jpg
|-- 000000000285.jpg
|-- 000000000632.jpg
|-- ...
In other words, the keypoint labels are stored in the JSON files. We can take one sample and analyze its JSON data.
The JSON entry for an image contains its file name, width, height, id, and other information:
{
"license": 4,
"file_name": "000000252219.jpg",
"coco_url": "http://images.cocodataset.org/val2017/000000252219.jpg",
"height": 428,"width": 640,
"date_captured": "2013-11-14 22:32:02",
"flickr_url": "http://farm4.staticflickr.com/3446/3232237447_13d84bd0a1_z.jpg",
"id": 252219
}
[Figure: image 000000252219.jpg]
Its manually annotated information is as follows:
{
"num_keypoints": 17,
"area": 8511.1568, "iscrowd": 0,
"keypoints": [356,198,2,358,193,2,351,194,2,364,192,2,346,194,2,375,207,2,341,211,2,388,236,2,336,238,2,392,263,2,
343,242,2,373,271,2,347,272,2,372,316,2,348,318,2,372,353,2,355,354,2],
"image_id": 252219,
"bbox": [326.28,174.56,71.24,197.25],
"category_id": 1,"id": 481918
}
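We can see that in COCO format the keypoint annotation consists of 3 × 17 values, one triplet per keypoint (num_keypoints counts only the labeled keypoints, i.e. those with v > 0). Each triplet has the form [x, y, v], where v is the visibility flag:
- v=0: not visible and not labeled, in which case x=y=0;
- v=1: not visible, but labeled;
- v=2: visible and labeled.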
{
"num_keypoints": 15,
"area": 8349.28485,"iscrowd": 0,
"keypoints": [100,190,2,0,0,0,96,185,2,0,0,0,86,188,2,84,208,2,71,208,2,84,245,2,59,240,2,115,263,2,66,271,2,
64,268,2,71,264,2,59,324,2,99,322,2,18,363,2,101,377,2],
"image_id": 252219,
"bbox": [9.79,167.06,121.94,226.45],
"category_id": 1,
"id": 489768
}
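The triplet layout can be checked directly on the second annotation above. This is a small illustrative sketch; the helper name `split_keypoints` is made up, and the keypoint list is copied from the annotation with "id": 489768.

```python
def split_keypoints(flat):
    """Split a flat COCO keypoint list into (x, y, v) triplets."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoint list length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


keypoints = [100, 190, 2, 0, 0, 0, 96, 185, 2, 0, 0, 0, 86, 188, 2, 84, 208, 2,
             71, 208, 2, 84, 245, 2, 59, 240, 2, 115, 263, 2, 66, 271, 2,
             64, 268, 2, 71, 264, 2, 59, 324, 2, 99, 322, 2, 18, 363, 2, 101, 377, 2]

triplets = split_keypoints(keypoints)
num_labeled = sum(1 for _, _, v in triplets if v > 0)  # keypoints with v = 1 or v = 2
```

The list holds 17 triplets, two of which are (0, 0, 0), so 15 keypoints are labeled, in agreement with "num_keypoints": 15.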
The bounding box follows the "xywh" format, i.e. top-left corner coordinates plus width and height.
YOLO format
${POSE_ROOT}
|-- data
`-- |-- coco
`-- |-- annotations
| |-- person_keypoints_train2017.json
| `-- person_keypoints_val2017.json
|-- person_detection_results
| |-- COCO_val2017_detections_AP_H_56_person.json
`-- images
| |-- train2017
| | |-- 000000000009.jpg
| | |-- 000000000025.jpg
| | |-- ...
| `-- val2017
| |-- 000000000139.jpg
| |-- 000000000285.jpg
| |-- ...
`-- labels
| |-- train2017
| | |-- 000000000009.txt
| | |-- 000000000025.txt # keypoint information for this image, in YOLO format
| | |-- ...
| `-- val2017
| |-- 000000000139.txt
| |-- 000000000285.txt # keypoint information for this image, in YOLO format
| |-- ...
`-- train2017.txt # contents: relative path + image file name
`-- val2017.txt # contents: relative path + image file name
Here is the YOLO-format information for "image_id": 252219:
0 0.565469 0.638283 0.111312 0.460864 0.556250 0.462617 2.000000 0.559375 0.450935 2.000000 0.548438 0.453271 2.000000 0.568750
0.448598 2.000000 0.540625 0.453271 2.000000 0.585938 0.483645 2.000000 0.532813 0.492991 2.000000 0.606250 0.551402 2.000000
0.525000 0.556075 2.000000 0.612500 0.614486 2.000000 0.535937 0.565421 2.000000 0.582812 0.633178 2.000000 0.542188 0.635514
2.000000 0.581250 0.738318 2.000000 0.543750 0.742991 2.000000 0.581250 0.824766 2.000000 0.554688 0.827103 2.000000
0 0.110562 0.654871 0.190531 0.529089 0.156250 0.443925 2.000000 0.000000 0.000000 0.000000 0.150000 0.432243 2.000000 0.000000
0.000000 0.000000 0.134375 0.439252 2.000000 0.131250 0.485981 2.000000 0.110937 0.485981 2.000000 0.131250 0.572430 2.000000
0.092188 0.560748 2.000000 0.179688 0.614486 2.000000 0.103125 0.633178 2.000000 0.100000 0.626168 2.000000 0.110937 0.616822
2.000000 0.092188 0.757009 2.000000 0.154688 0.752336 2.000000 0.028125 0.848131 2.000000 0.157812 0.880841 2.000000
0 0.894172 0.652220 0.193219 0.504112 0.837500 0.448598 1.000000 0.840625 0.439252 2.000000 0.000000 0.000000 0.000000 0.862500
0.443925 2.000000 0.000000 0.000000 0.000000 0.887500 0.483645 2.000000 0.867188 0.485981 2.000000 0.873437 0.567757 2.000000
0.865625 0.574766 2.000000 0.846875 0.630841 2.000000 0.859375 0.647196 2.000000 0.895312 0.640187 2.000000 0.873437 0.640187
2.000000 0.920312 0.754673 2.000000 0.845313 0.752336 2.000000 0.964063 0.852804 2.000000 0.828125 0.843458 2.000000
The format conversion follows the JSON2YOLO project; its algorithm is as follows:
img = images['%g' % x['image_id']]
h, w, f = img['height'], img['width'], img['file_name']
# The COCO box format is [top left x, top left y, width, height]
box = np.array(x['bbox'], dtype=np.float64)
box[:2] += box[2:] / 2 # xy top-left corner to center
box[[0, 2]] /= w # normalize x
box[[1, 3]] /= h # normalize y
This shows that the YOLO format is center-based and normalized: the top-left XYWH box must be converted to $C_xC_y$WH (note that at this point all values are normalized by the image width and height). Using the original COCO annotation above, let us check whether we obtain the YOLO values:
"height": 428,"width": 640,
"num_keypoints": 17,
"area": 8511.1568, "iscrowd": 0,
"keypoints": [356,198,2,358,193,2,351,194,2,364,192,2,346,194,2,375,207,2,341,211,2,388,236,2,336,238,2,392,263,2,
343,242,2,373,271,2,347,272,2,372,316,2,348,318,2,372,353,2,355,354,2],
"image_id": 252219,
"bbox": [326.28,174.56,71.24,197.25],
Applying the algorithm above gives:
bbox: (326.28+71.24/2)/640=0.5655, (174.56+197.25/2)/428=0.6383, 71.24/640=0.1113, 197.25/428=0.4609
keypoints[0]: 356/640=0.55625, 198/428=0.4626
This matches the result of the conversion to YOLO format:
0 0.565469 0.638283 0.111312 0.460864 0.556250 0.462617 2.000000
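The same check can be written as a short script. This is a sketch of the conversion described above, not the JSON2YOLO source itself; the function names `coco_to_yolo` and `format_label` are hypothetical, and only the first keypoint of the annotation is passed in to keep the example short.

```python
def coco_to_yolo(bbox, keypoints, img_w, img_h):
    """Convert a COCO xywh box and flat [x, y, v, ...] keypoints to normalized YOLO values."""
    x, y, w, h = bbox
    values = [(x + w / 2) / img_w,   # box centre x, normalized
              (y + h / 2) / img_h,   # box centre y, normalized
              w / img_w,
              h / img_h]
    for i in range(0, len(keypoints), 3):
        kx, ky, v = keypoints[i:i + 3]
        values += [kx / img_w, ky / img_h, float(v)]  # visibility is kept as-is
    return values


def format_label(cls, values):
    """Format one label line: class index followed by six-decimal values."""
    return " ".join([str(cls)] + ["%.6f" % v for v in values])


values = coco_to_yolo([326.28, 174.56, 71.24, 197.25], [356, 198, 2], img_w=640, img_h=428)
line = format_label(0, values)
```

The resulting values agree with the first entries of the label line shown above, up to rounding in the sixth decimal.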
Converting 300-W to YOLO format
The 300-W face database is a popular dataset annotated with 68 facial keypoints. Its faces come from several different datasets, such as afw and ibug. Its file layout is as follows:
-- data
|-- data_300W
|-- afw
|-- helen
|-- ibug
|-- lfpw
|-- data
`-- |-- data_300W
`-- |-- annotations
|-- afw
|-- helen
|-- ibug
|-- lfpw
`-- images
| |-- train2017
| | |-- 000000000009.jpg
| | |-- 000000000025.jpg
| | |-- ...
| `-- val2017
| |-- 000000000139.jpg
| |-- 000000000285.jpg
| |-- ...
`-- labels
| |-- train2017
| | |-- 000000000009.txt
| | |-- 000000000025.txt # keypoint information for this image, in YOLO format
| | |-- ...
| `-- val2017
| |-- 000000000139.txt
| |-- 000000000285.txt # keypoint information for this image, in YOLO format
| |-- ...
`-- train2017.txt # contents: relative path + image file name
`-- val2017.txt # contents: relative path + image file name
300-W format
Take data_300W/afw/1051618982_1.jpg as an example.
The 68 facial landmarks for this image are stored in a *.pts file, which looks like this:
version: 1
n_points: 68
{
482.866335 268.009351
484.241455 298.524244
487.963820 329.985842
491.613829 359.446370
503.992490 387.443021
523.666182 409.551102
543.708366 429.090358
566.283098 442.751692
……
591.348649 385.406662
580.068281 384.385348
563.609110 379.281936
552.917511 366.852392
580.508062 371.198816
592.309498 371.492218
604.011866 371.855814
634.952400 369.536292
604.011866 371.855814
592.309498 371.492218
580.508062 371.198816
}
There are 68 coordinate pairs $(x_i, y_i)$ in total; some pairs in the middle are omitted for readability. The coco2yolo format is shown below:
0 xywh (x, y)
| | |
| | ` - - keypoint coordinates normalized by the image width and height
| ` - - normalized bounding box: center coordinates xy plus box width and height wh
` - - class index (0 here, since there is only one class)
Converting the 300-W format to YOLO format
In other words, the 68 facial keypoints above need to be converted into the coco2yolo format. Here we follow PIPNet's preprocessing and convert the 300W folders entirely into a COCO-like file layout, including the target label file format. This is done to keep code changes in YOLO to a minimum.
At this point, the format conversion is complete.
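The conversion described above can be sketched as follows. This is not the PIPNet preprocessing script; it is a minimal illustration under two assumptions that are not confirmed by the original: the bounding box is taken as the tight box around the landmarks, and each keypoint is written as an (x, y) pair without a visibility value. The function names `parse_pts` and `landmarks_to_yolo` are hypothetical.

```python
def parse_pts(text):
    """Read the (x, y) pairs between '{' and '}' of a 300-W .pts file."""
    points, inside = [], False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "{":
            inside = True
        elif line == "}":
            break
        elif inside:
            parts = line.split()
            if len(parts) == 2:  # skip anything that is not an "x y" pair
                points.append((float(parts[0]), float(parts[1])))
    return points


def landmarks_to_yolo(points, img_w, img_h, cls=0):
    """Build one YOLO label line: class, normalized cx cy w h, then normalized x y per landmark."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)  # assumption: tight box around landmarks
    values = [(x0 + x1) / 2 / img_w, (y0 + y1) / 2 / img_h,
              (x1 - x0) / img_w, (y1 - y0) / img_h]
    for x, y in points:
        values += [x / img_w, y / img_h]
    return " ".join([str(cls)] + ["%.6f" % v for v in values])


sample = """version: 1
n_points: 3
{
100.0 200.0
300.0 200.0
200.0 400.0
}"""
label = landmarks_to_yolo(parse_pts(sample), img_w=400, img_h=800)
```

The sample uses three made-up points instead of 68 to keep it readable; a real file is handled the same way, with the image size read from the image itself.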
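Project modifications (pitfall notes)
Quite a few changes to YOLO are involved, mainly in the following areas:
- Dataset reading;
- Detection head modification;
- Modify the relevant configuration in the launch file;
- Modify the data path in data/coco_kepts.yaml;
- Modify the models/hub/cfg files, e.g. the relevant parameters in yolo5s6_kpts.yaml: change nkpt from 17 to 68;
- Modify dataset line 497, which controls how the txt data is read;
- Modify dataset line 987, which controls how the data is transformed;
- Modify dataset line 365, which controls how the data is flipped;
- Modify the loss function at lines 187 and 202, concerning loss_gain;
- In the loss function, line 119: the sigmas are hard-coded; simply set them all to 1;
- In the plots function, lines 76 and 84: plotting issues, not resolved yet; skip the drawing for now;
- Modify the yolo function at line 90, concerning self.inplace.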
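The sigmas mentioned in the list above are per-keypoint constants of the kind used in the COCO object keypoint similarity (OKS). The sketch below shows the standard COCO-style OKS with uniform sigmas for 68 landmarks. It is an assumption that the project's keypoint loss follows this definition; the actual loss code may differ in scaling and details, and the function name `oks` is chosen here for illustration.

```python
import numpy as np


def oks(pred, gt, area, sigmas, visible, eps=1e-9):
    """COCO-style object keypoint similarity for one object.

    pred, gt: (K, 2) keypoint coordinates; area: object area (scale squared);
    sigmas: (K,) per-keypoint constants; visible: (K,) flags, > 0 means labeled.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    d2 = ((pred - gt) ** 2).sum(axis=1)          # squared distance per keypoint
    k2 = (2.0 * np.asarray(sigmas, dtype=np.float64)) ** 2
    e = d2 / (2.0 * k2 * (area + eps))
    vis = np.asarray(visible) > 0
    if not vis.any():
        return 0.0
    return float(np.exp(-e)[vis].mean())         # average over labeled keypoints only


rng = np.random.default_rng(0)
gt_pts = rng.uniform(0, 100, size=(68, 2))
sigmas = np.ones(68)                              # all sigmas set to 1, as in the note above
perfect = oks(gt_pts, gt_pts, area=100.0 * 100.0, sigmas=sigmas, visible=np.ones(68))
shifted = oks(gt_pts + 5.0, gt_pts, area=100.0 * 100.0, sigmas=sigmas, visible=np.ones(68))
```

With uniform sigmas every landmark is weighted equally, which is a simplification compared with the tuned per-keypoint values used for COCO body keypoints.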
train log
autoanchor: Analyzing anchors... anchors/target = 7.86, Best Possible Recall (BPR) = 1.0000
Image sizes 640 train, 640 test
Using 4 dataloader workers
Logging results to runs/train/exp10
Starting training for 300 epochs...
Epoch gpu_mem box obj cls kpt kptv total labels img_size
0/299 4.22G 0.07731 0.0573 0 0.3465 0.01299 0.4941 10 640: 100%| 787/787 [02:58<00:00, 4.41it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%| 87/87 [00:14<00:00, 6.05it/s]
all 689 689 0.0073 0.691 0.00784 0.00137
……
One epoch takes about 3 minutes, and there are 300 epochs in total. Looking forward to the results!
To be continued
The project finally passes debugging! More details will be released gradually.