Hands-on custom implementation of FacePose based on YOLOv5: YOLO structure walkthrough, YOLO data format conversion, and YOLO pipeline modifications
2022-08-05 03:26:00 【Burnt Bay】
Introduction: This post records how to implement a custom dataset and detection task on top of YOLOv5. It starts from the original project's data format, pays attention to every detail, and then redoes the custom task in the same format, independently migrating the project to a new task.
wandb: visualizing the training process
tensorboard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
hyperparameters: lr0=0.01, lrf=0.2, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, kpt=0.1, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0
loggers['wandb'] = wandb_logger.wandb  # in train.py: visualization with W&B (Weights & Biases); an account is required
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 1
wandb: You chose 'Create a W&B account'
wandb: Create an account here: https://wandb.ai/authorize?signup=true
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit:
You need to register on the wandb website. Signing up with a GitHub account is enough; afterwards, copy the API key.
Model parsing
This section covers the anchor settings and the output of the detection head.
def parse_model(d, ch):  # model_dict, input_channels(3)
    logger.info('\n%3s%18s%3s%10s  %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))
    anchors, nc, nkpt, gd, gw = d['anchors'], d['nc'], d['nkpt'], d['depth_multiple'], d['width_multiple']
    # Anchors per layer, where anchors = [[19, 27, 44, 40, 38, 94], [96, 68, 86, 152, 180, 137], [140, 301, 303, 264, 238, 542], [436, 615, 739, 380, 925, 792]]
    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors, na = 3
    # Keypoint extension from the paper: 3×(1+5+2×17) = 3×40
    no = na * (nc + 5 + 2*nkpt)  # number of outputs = anchors * (classes + 5 + 2 * keypoints)
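As a quick check of the arithmetic in the comment above, here is a minimal sketch that recomputes the output count for 17 and for 68 keypoints. The helper name outputs_per_layer is hypothetical and nc=1 is assumed from the 3×(1+5+2×17) comment; only the formula itself comes from parse_model.

```python
def outputs_per_layer(anchors, nc, nkpt):
    """Output channels of one detection layer, using the parse_model formula."""
    # Each anchor is a (width, height) pair, so 6 numbers per layer = 3 anchors.
    na = len(anchors[0]) // 2
    # Per anchor: 4 box values + 1 objectness + nc class scores + (x, y) per keypoint.
    return na * (nc + 5 + 2 * nkpt)


anchors = [[19, 27, 44, 40, 38, 94], [96, 68, 86, 152, 180, 137],
           [140, 301, 303, 264, 238, 542], [436, 615, 739, 380, 925, 792]]

coco_pose = outputs_per_layer(anchors, nc=1, nkpt=17)  # 3 * (1 + 5 + 34)
face_pose = outputs_per_layer(anchors, nc=1, nkpt=68)  # 3 * (1 + 5 + 136)
print(coco_pose, face_pose)
```

Going from 17 to 68 keypoints changes only the 2*nkpt term, which is why nkpt has to be updated consistently in the model config.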
Relationship between optimizer parameters and batch size
# Optimizer
nbs = 64  # nominal batch size
accumulate = max(round(nbs / total_batch_size), 1)  # accumulate loss before optimizing
# There is no need to change the batch size here: weight_decay is rescaled instead, and the optimizer steps once the gradients have been accumulated
hyp['weight_decay'] *= total_batch_size * accumulate / nbs  # scale weight_decay
logger.info(f"Scaled weight_decay = {hyp['weight_decay']}")
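To see what this scaling does in practice, the sketch below applies the same two formulas to a few batch sizes. The function name scaled_weight_decay is hypothetical; the default weight decay 0.0005 is taken from the hyperparameter log above.

```python
def scaled_weight_decay(weight_decay, total_batch_size, nbs=64):
    """Return (accumulate, scaled weight decay) using the formulas from train.py."""
    # Number of batches whose gradients are accumulated before one optimizer step.
    accumulate = max(round(nbs / total_batch_size), 1)
    # Effective batch size relative to the nominal batch size.
    scale = total_batch_size * accumulate / nbs
    return accumulate, weight_decay * scale


for bs in (8, 16, 64, 128):
    acc, wd = scaled_weight_decay(0.0005, bs)
    print(f"batch_size={bs:4d} accumulate={acc} weight_decay={wd:.6f}")
```

For batch sizes that divide 64, the effective batch size stays at 64 and weight_decay is unchanged; only a batch size above 64 scales it up.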
Image augmentation
# class LoadImagesAndLabels(Dataset): # for training/testing
...
# Mosaic augmentation
self.mosaic = self.augment and not self.rect # load 4 images at a time into a mosaic (only during training)
self.mosaic_border = [-img_size // 2, -img_size // 2]
self.stride = stride
self.path = path
self.kpt_label = kpt_label
# Keypoint-specific change: indices used to swap left/right keypoints when flipping
self.flip_index = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
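The role of flip_index can be illustrated with a small sketch of a horizontal flip on normalized keypoints. This is not the repository's implementation: the function name flip_keypoints_lr is hypothetical, and leaving unlabeled points (v = 0) at the origin is an assumption made here so that they stay recognizable as unlabeled.

```python
# Left/right swap table for the 17 COCO keypoints (nose, left eye, right eye, ...).
FLIP_INDEX = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]


def flip_keypoints_lr(keypoints, flip_index=FLIP_INDEX):
    """Mirror normalized (x, y, v) keypoints horizontally and swap left/right joints."""
    mirrored = []
    for x, y, v in keypoints:
        # Assumption: unlabeled points stay at (0, 0) instead of moving to x = 1.
        mirrored.append((1.0 - x, y, v) if v > 0 else (x, y, v))
    # After mirroring, the joint that was "left" is now the "right" one, so reorder.
    return [mirrored[i] for i in flip_index]


kpts = [(0.0, 0.0, 0.0)] * 17
kpts[1] = (0.75, 0.25, 2.0)  # left eye
kpts[2] = (0.5, 0.3, 2.0)    # right eye
flipped = flip_keypoints_lr(kpts)
print(flipped[1], flipped[2])
```

For 68 facial landmarks the table has to be replaced by a 68-entry mirror mapping, which is the dataset change listed later in the modification notes.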
Converting between COCO and YOLO formats
Original COCO format
${POSE_ROOT}
|-- data
`-- |-- coco
`-- |-- annotations
| |-- person_keypoints_train2017.json
| `-- person_keypoints_val2017.json
|-- person_detection_results
| |-- COCO_val2017_detections_AP_H_56_person.json
`-- images
|-- train2017
| |-- 000000000009.jpg
| |-- 000000000025.jpg
| |-- 000000000030.jpg
| |-- ...
`-- val2017
|-- 000000000139.jpg
|-- 000000000285.jpg
|-- 000000000632.jpg
|-- ...
In other words, the keypoint labels are stored in the JSON files. We can take one sample and analyze its JSON data.
The image entry in the JSON contains the image file name, width and height, id, and so on:
{
"license": 4,
"file_name": "000000252219.jpg",
"coco_url": "http://images.cocodataset.org/val2017/000000252219.jpg",
"height": 428,"width": 640,
"date_captured": "2013-11-14 22:32:02",
"flickr_url": "http://farm4.staticflickr.com/3446/3232237447_13d84bd0a1_z.jpg",
"id": 252219
}
(The image 000000252219.jpg was displayed here.)
Its manual annotation is as follows:
{
"num_keypoints": 17,
"area": 8511.1568, "iscrowd": 0,
"keypoints": [356,198,2,358,193,2,351,194,2,364,192,2,346,194,2,375,207,2,341,211,2,388,236,2,336,238,2,392,263,2,
343,242,2,373,271,2,347,272,2,372,316,2,348,318,2,372,353,2,355,354,2],
"image_id": 252219,
"bbox": [326.28,174.56,71.24,197.25],
"category_id": 1,"id": 481918
}
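We can see that the keypoints field in COCO format is a flat list of [x, y, v] triplets, one per keypoint (17 triplets, i.e. 51 values, for a person), while num_keypoints counts only the labeled keypoints (v > 0). Here v is the visibility flag:
- v=0: not labeled, in which case x=y=0;
- v=1: labeled but not visible;
- v=2: labeled and visible.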
{
"num_keypoints": 15,
"area": 8349.28485,"iscrowd": 0,
"keypoints": [100,190,2,0,0,0,96,185,2,0,0,0,86,188,2,84,208,2,71,208,2,84,245,2,59,240,2,115,263,2,66,271,2,
64,268,2,71,264,2,59,324,2,99,322,2,18,363,2,101,377,2],
"image_id": 252219,
"bbox": [9.79,167.06,121.94,226.45],
"category_id": 1,
"id": 489768
}
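The triplet layout can be checked against the annotation just above. The sketch below groups its flat keypoints list into (x, y, v) triplets and counts the labeled ones; the helper name split_keypoints is hypothetical.

```python
def split_keypoints(flat):
    """Group a flat COCO keypoints list into (x, y, v) triplets."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# "keypoints" of annotation id 489768 (image_id 252219), copied from above.
keypoints = [100, 190, 2, 0, 0, 0, 96, 185, 2, 0, 0, 0, 86, 188, 2, 84, 208, 2,
             71, 208, 2, 84, 245, 2, 59, 240, 2, 115, 263, 2, 66, 271, 2,
             64, 268, 2, 71, 264, 2, 59, 324, 2, 99, 322, 2, 18, 363, 2,
             101, 377, 2]

triplets = split_keypoints(keypoints)
labeled = [t for t in triplets if t[2] > 0]
print(len(triplets), len(labeled))
```

The list always holds 17 triplets, while the number of labeled ones agrees with "num_keypoints": 15 in the annotation.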
The bounding box follows the "xywh" format, i.e. top-left corner coordinates plus width and height.
YOLO format
${POSE_ROOT}
|-- data
`-- |-- coco
`-- |-- annotations
| |-- person_keypoints_train2017.json
| `-- person_keypoints_val2017.json
|-- person_detection_results
| |-- COCO_val2017_detections_AP_H_56_person.json
`-- images
| |-- train2017
| | |-- 000000000009.jpg
| | |-- 000000000025.jpg
| | |-- ...
| `-- val2017
| |-- 000000000139.jpg
| |-- 000000000285.jpg
| |-- ...
`-- labels
| |-- train2017
| | |-- 000000000009.txt
| | |-- 000000000025.txt # keypoint labels for this image, in YOLO format
| | |-- ...
| `-- val2017
| |-- 000000000139.txt
| |-- 000000000285.txt # keypoint labels for this image, in YOLO format
| |-- ...
`-- train2017.txt # content: relative path + image file name
`-- val2017.txt # content: relative path + image file name
Here are the YOLO-format labels for "image_id": 252219:
0 0.565469 0.638283 0.111312 0.460864 0.556250 0.462617 2.000000 0.559375 0.450935 2.000000 0.548438 0.453271 2.000000 0.568750
0.448598 2.000000 0.540625 0.453271 2.000000 0.585938 0.483645 2.000000 0.532813 0.492991 2.000000 0.606250 0.551402 2.000000
0.525000 0.556075 2.000000 0.612500 0.614486 2.000000 0.535937 0.565421 2.000000 0.582812 0.633178 2.000000 0.542188 0.635514
2.000000 0.581250 0.738318 2.000000 0.543750 0.742991 2.000000 0.581250 0.824766 2.000000 0.554688 0.827103 2.000000
0 0.110562 0.654871 0.190531 0.529089 0.156250 0.443925 2.000000 0.000000 0.000000 0.000000 0.150000 0.432243 2.000000 0.000000
0.000000 0.000000 0.134375 0.439252 2.000000 0.131250 0.485981 2.000000 0.110937 0.485981 2.000000 0.131250 0.572430 2.000000
0.092188 0.560748 2.000000 0.179688 0.614486 2.000000 0.103125 0.633178 2.000000 0.100000 0.626168 2.000000 0.110937 0.616822
2.000000 0.092188 0.757009 2.000000 0.154688 0.752336 2.000000 0.028125 0.848131 2.000000 0.157812 0.880841 2.000000
0 0.894172 0.652220 0.193219 0.504112 0.837500 0.448598 1.000000 0.840625 0.439252 2.000000 0.000000 0.000000 0.000000 0.862500
0.443925 2.000000 0.000000 0.000000 0.000000 0.887500 0.483645 2.000000 0.867188 0.485981 2.000000 0.873437 0.567757 2.000000
0.865625 0.574766 2.000000 0.846875 0.630841 2.000000 0.859375 0.647196 2.000000 0.895312 0.640187 2.000000 0.873437 0.640187
2.000000 0.920312 0.754673 2.000000 0.845313 0.752336 2.000000 0.964063 0.852804 2.000000 0.828125 0.843458 2.000000
The format conversion follows the JSON2YOLO project; its core logic is:
img = images['%g' % x['image_id']]
h, w, f = img['height'], img['width'], img['file_name']
# The COCO box format is [top left x, top left y, width, height]
box = np.array(x['bbox'], dtype=np.float64)
box[:2] += box[2:] / 2 # xy top-left corner to center
box[[0, 2]] /= w # normalize x
box[[1, 3]] /= h # normalize y
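This shows that YOLO boxes are stored with a normalized center point: the COCO box, given as top-left XYWH, has to be converted to $C_xC_yWH$ (note that all values are then normalized by the image width and height). Applying this to the COCO sample above, let's check whether we obtain the YOLO values: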
"height": 428,"width": 640,
"num_keypoints": 17,
"area": 8511.1568, "iscrowd": 0,
"keypoints": [356,198,2,358,193,2,351,194,2,364,192,2,346,194,2,375,207,2,341,211,2,388,236,2,336,238,2,392,263,2,
343,242,2,373,271,2,347,272,2,372,316,2,348,318,2,372,353,2,355,354,2],
"image_id": 252219,
"bbox": [326.28,174.56,71.24,197.25],
Applying the algorithm above gives:
bbox: (326.28+71.24/2)/640=0.5655, (174.56+197.25/2)/428=0.6383, 71.24/640=0.1113, 197.25/428=0.4609
keypoints[0]: 356/640=0.5562, 198/428=0.4626
This matches the converted YOLO-format result:
0 0.565469 0.638283 0.111312 0.460864 0.556250 0.462617 2.000000
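The hand calculation can be reproduced with a short sketch that converts one COCO annotation into a YOLO label row. It mirrors the JSON2YOLO steps quoted above for the box; dividing keypoint x by the width and y by the height while keeping v unchanged is inferred from the label values shown in this post, and the function name coco_to_yolo_row is hypothetical.

```python
def coco_to_yolo_row(cls, bbox, keypoints, img_w, img_h):
    """Convert a COCO box [x, y, w, h] plus flat keypoints to a YOLO label row."""
    x, y, w, h = bbox
    # Top-left corner to center, then normalize by the image size.
    row = [cls, (x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]
    for i in range(0, len(keypoints), 3):
        kx, ky, v = keypoints[i:i + 3]
        # Keypoints are normalized the same way; the visibility flag is kept.
        row += [kx / img_w, ky / img_h, float(v)]
    return row


# First two keypoints of annotation 481918 in image 252219 (640 x 428).
row = coco_to_yolo_row(0, [326.28, 174.56, 71.24, 197.25],
                       [356, 198, 2, 358, 193, 2], img_w=640, img_h=428)
print(row[0], " ".join(f"{v:.6f}" for v in row[1:]))
```

The printed values agree with the beginning of the first label row listed above to within rounding in the sixth decimal place.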
Converting 300-W to YOLO format
300-W is a popular face database annotated with 68 facial keypoints. Its faces come from several datasets, e.g. afw, ibug and others. Its file layout is as follows:
-- data
|-- data_300W
|-- afw
|-- helen
|-- ibug
|-- lfpw
|-- data
`-- |-- data_300W
`-- |-- annotations
|-- afw
|-- helen
|-- ibug
|-- lfpw
`-- images
| |-- train2017
| | |-- 000000000009.jpg
| | |-- 000000000025.jpg
| | |-- ...
| `-- val2017
| |-- 000000000139.jpg
| |-- 000000000285.jpg
| |-- ...
`-- labels
| |-- train2017
| | |-- 000000000009.txt
| | |-- 000000000025.txt # keypoint labels for this image, in YOLO format
| | |-- ...
| `-- val2017
| |-- 000000000139.txt
| |-- 000000000285.txt # keypoint labels for this image, in YOLO format
| |-- ...
`-- train2017.txt # content: relative path + image file name
`-- val2017.txt # content: relative path + image file name
300-W format
Take data_300W/afw/1051618982_1.jpg as an example.
The 68 facial landmarks for this image are stored in a *.pts file, which looks like this when opened:
version: 1
n_points: 68
{
482.866335 268.009351
484.241455 298.524244
487.963820 329.985842
491.613829 359.446370
503.992490 387.443021
523.666182 409.551102
543.708366 429.090358
566.283098 442.751692
……
591.348649 385.406662
580.068281 384.385348
563.609110 379.281936
552.917511 366.852392
580.508062 371.198816
592.309498 371.492218
604.011866 371.855814
634.952400 369.536292
604.011866 371.855814
592.309498 371.492218
580.508062 371.198816
}
There are 68 coordinate pairs $(x_i, y_i)$ in total; some pairs in the middle are omitted for display. The coco2yolo format, by contrast, looks like this:
0 xywh (x, y)
| |    |
| |    `-- keypoint coordinates normalized by the image width and height
| `-- normalized bounding box: center coordinates xy plus box width and height wh
`-- class index: 0 for the single class.
Converting 300-W format to YOLO format
In other words, the 68 facial keypoints above have to be converted into the coco2yolo format. Here we follow PIPNet's preprocessing and convert the whole 300W folder into a COCO-like file layout, including the target label format. This keeps the code changes inside YOLO to a minimum.
At this point, the format conversion is complete.
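A minimal sketch of such a conversion is shown below. It is not the PIPNet preprocessing: deriving the box from the extent of the landmarks and marking every landmark as visible (v = 2) are assumptions made here, the function names are hypothetical, and the three-point sample with its 400×800 image size is made up so the numbers are easy to verify.

```python
def parse_landmarks(text):
    """Parse a 300-W landmark file ('version', 'n_points', then '{ x y ... }')."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    start, end = lines.index("{") + 1, lines.index("}")
    return [tuple(float(v) for v in ln.split()) for ln in lines[start:end] if ln]


def landmarks_to_yolo_row(points, img_w, img_h, cls=0, visibility=2.0):
    """Build a YOLO keypoint row; the box is the landmark extent (an assumption)."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    row = [cls, (x0 + x1) / 2 / img_w, (y0 + y1) / 2 / img_h,
           (x1 - x0) / img_w, (y1 - y0) / img_h]
    for x, y in points:
        row += [x / img_w, y / img_h, visibility]
    return row


# Hypothetical 3-point file and image size, for illustration only.
sample = """version: 1
n_points: 3
{
100.0 200.0
300.0 200.0
200.0 400.0
}"""

points = parse_landmarks(sample)
row = landmarks_to_yolo_row(points, img_w=400, img_h=800)
print(row[0], " ".join(f"{v:.6f}" for v in row[1:]))
```

A real face box is usually somewhat larger than the landmark extent, so a margin may be needed depending on how the boxes are defined for the task.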
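Project modifications (pitfall notes)
YOLO needs quite a few modifications, mainly in these areas:
- dataset loading;
- detection head.
Concretely:
- update the relevant settings in the launch file;
- update the data path in data/coco_kepts.yaml;
- update the model config under models/hub/cfg, e.g. in yolo5s6_kpts.yaml change nkpt from 17 to 68;
- modify dataset line 497, which controls how the txt labels are read;
- modify dataset line 987, which controls how the data is transformed;
- modify dataset line 365, which controls how the data is flipped;
- modify lines 187 and 202 of the loss function, concerning loss_gain;
- in the loss function, line 119, the sigmas are hard-coded; simply set them all to 1;
- in the plots function, lines 76 and 84, plotting is not fixed yet, so drawing is skipped for now;
- modify line 90 of the yolo function, concerning self.inplace.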
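To illustrate the sigma change, here is a small sketch of a COCO-style object keypoint similarity (OKS) with uniform sigmas. It follows the OKS definition used in COCO keypoint evaluation and is not the repository's loss code; the function name oks and the sample coordinates are hypothetical.

```python
import math


def oks(pred, target, area, sigmas):
    """COCO-style object keypoint similarity, averaged over all keypoints."""
    total = 0.0
    for (px, py), (tx, ty), sigma in zip(pred, target, sigmas):
        d2 = (px - tx) ** 2 + (py - ty) ** 2
        # exp(-d^2 / (2 * area * k^2)) with k = 2 * sigma; epsilon avoids division by zero.
        total += math.exp(-d2 / (2.0 * area * (2.0 * sigma) ** 2 + 1e-9))
    return total / len(sigmas)


nkpt = 68
sigmas = [1.0] * nkpt  # uniform sigmas instead of the 17 hard-coded COCO values

target = [(float(i), float(i)) for i in range(nkpt)]
shifted = [(x + 5.0, y) for x, y in target]
print(oks(target, target, area=100.0, sigmas=sigmas))
print(oks(shifted, target, area=100.0, sigmas=sigmas))
```

With all sigmas equal to 1 every landmark is weighted equally, and the penalty for a given pixel error is much flatter than with the small per-joint COCO sigmas.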
train log
autoanchor: Analyzing anchors... anchors/target = 7.86, Best Possible Recall (BPR) = 1.0000
Image sizes 640 train, 640 test
Using 4 dataloader workers
Logging results to runs/train/exp10
Starting training for 300 epochs...
Epoch gpu_mem box obj cls kpt kptv total labels img_size
0/299 4.22G 0.07731 0.0573 0 0.3465 0.01299 0.4941 10 640: 100%| 787/787 [02:58<00:00, 4.41it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%| 87/87 [00:14<00:00, 6.05it/s]
all 689 689 0.0073 0.691 0.00784 0.00137
……
One epoch takes about 3 minutes, and there are 300 epochs in total. Looking forward to the results!
To be continued
The project finally runs after debugging! More details will be released gradually.