Custom FacePose implementation based on YOLOv5: YOLO structure walkthrough, YOLO data format conversion, and YOLO pipeline modifications
2022-08-05 03:26:00 【Burnt Bay】
Overview: this post records how to train and run detection on a custom dataset with YOLOv5. Starting from the data format of the original project, it follows every detail and then redoes the custom task in the same format, so that the project can be migrated independently to a new task.
wandb: visualizing the training process
tensorboard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
hyperparameters: lr0=0.01, lrf=0.2, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, kpt=0.1, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0
loggers['wandb'] = wandb_logger.wandb  # in train.py: visualization with Weights & Biases; an account must be created
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 1
wandb: You chose 'Create a W&B account'
wandb: Create an account here: https://wandb.ai/authorize?signup=true
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit:
You need to register on the wandb website; signing up with a GitHub account is sufficient, after which you obtain an API key.
Model parsing
This section covers the anchor settings and the output of the detection head.
def parse_model(d, ch):  # model_dict, input_channels(3)
    logger.info('\n%3s%18s%3s%10s %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))
    anchors, nc, nkpt, gd, gw = d['anchors'], d['nc'], d['nkpt'], d['depth_multiple'], d['width_multiple']
    # number of anchors per layer; here anchors: [[19, 27, 44, 40, 38, 94], [96, 68, 86, 152, 180, 137], [140, 301, 303, 264, 238, 542], [436, 615, 739, 380, 925, 792]]
    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors, na = 3
    # keypoint extension from the paper: 3 × (1 + 5 + 2 × 17) = 3 × 40
    no = na * (nc + 5 + 2*nkpt)  # number of outputs = anchors * (classes + 5 + 2 * keypoints)
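As a quick check of the formula above, the number of output channels per detection layer can be computed directly. This is a minimal sketch: the helper name `head_output_channels` is made up for illustration, and it assumes, as in the snippet above, two values (x, y) per keypoint and a single class.

```python
def head_output_channels(anchors, nc, nkpt):
    """Output channels of one detection layer: na * (nc + 5 + 2 * nkpt)."""
    na = len(anchors[0]) // 2  # each anchor is a (width, height) pair
    return na * (nc + 5 + 2 * nkpt)


# Anchor list copied from the comment in parse_model above.
anchors = [[19, 27, 44, 40, 38, 94], [96, 68, 86, 152, 180, 137],
           [140, 301, 303, 264, 238, 542], [436, 615, 739, 380, 925, 792]]

coco_pose = head_output_channels(anchors, nc=1, nkpt=17)  # 17 COCO body keypoints
face_pose = head_output_channels(anchors, nc=1, nkpt=68)  # 68 facial landmarks
print(coco_pose, face_pose)
```

Moving from 17 to 68 keypoints therefore changes only `nkpt`; the anchor count and the box/objectness/class terms stay the same.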
Relationship between the optimizer parameters and the batch size
# Optimizer
nbs = 64 # nominal batch size
accumulate = max(round(nbs / total_batch_size), 1) # accumulate loss before optimizing
# There is no need to change the batch size here; weight_decay is rescaled instead, and the loss is accumulated before each optimizer step
hyp['weight_decay'] *= total_batch_size * accumulate / nbs # scale weight_decay
logger.info(f"Scaled weight_decay = {hyp['weight_decay']}")
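To see what this scaling does in practice, the arithmetic can be replayed for a few batch sizes. This is a small sketch: the helper `scale_for_batch` is hypothetical, and the default `weight_decay=0.0005` is taken from the hyperparameter log above.

```python
def scale_for_batch(total_batch_size, weight_decay=0.0005, nbs=64):
    """Replay the accumulate / weight_decay arithmetic from the optimizer snippet."""
    accumulate = max(round(nbs / total_batch_size), 1)  # batches accumulated per optimizer step
    scaled = weight_decay * total_batch_size * accumulate / nbs
    return accumulate, scaled


for bs in (16, 24, 64, 128):
    acc, wd = scale_for_batch(bs)
    print(f"batch={bs:3d} accumulate={acc} effective_batch={bs * acc:3d} weight_decay={wd:.7f}")
```

When the batch size divides 64 evenly, the effective batch is 64 and the weight decay is unchanged; otherwise the decay is scaled by the ratio of the effective batch to the nominal one.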
Image augmentation
# class LoadImagesAndLabels(Dataset): # for training/testing
...
# mosaic augmentation
self.mosaic = self.augment and not self.rect # load 4 images at a time into a mosaic (only during training)
self.mosaic_border = [-img_size // 2, -img_size // 2]
self.stride = stride
self.path = path
self.kpt_label = kpt_label
# added for keypoints: index permutation that swaps left/right keypoints on a horizontal flip
self.flip_index = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
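The role of `flip_index` can be illustrated with a short sketch of a left-right flip on normalized keypoints. The function `flip_keypoints_lr` is hypothetical and is not the project's own code; it assumes keypoints stored as (x, y, v) triplets with x and y normalized to [0, 1] and unlabeled points kept at (0, 0, 0).

```python
# Permutation from the dataset class above: COCO keypoints come in left/right
# pairs after the nose (eyes, ears, shoulders, ...), so neighbours are swapped.
FLIP_INDEX = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]


def flip_keypoints_lr(kpts, flip_index=FLIP_INDEX):
    """Mirror x for labeled keypoints, then swap left/right slots."""
    mirrored = [((1.0 - x) if v > 0 else x, y, v) for x, y, v in kpts]
    return [mirrored[i] for i in flip_index]


# Toy annotation: nose, left eye, right eye labeled; the rest unlabeled.
kpts = [(0.5, 0.20, 2), (0.7, 0.18, 2), (0.4, 0.20, 2)] + [(0.0, 0.0, 0)] * 14
flipped = flip_keypoints_lr(kpts)
print(flipped[:3])
```

Mirroring the coordinates alone is not enough: after the flip the former right eye sits where a left eye is expected, which is why the slots are permuted as well.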
COCO and YOLO format conversion
Original COCO format
${POSE_ROOT}
|-- data
`-- |-- coco
    `-- |-- annotations
        |   |-- person_keypoints_train2017.json
        |   `-- person_keypoints_val2017.json
        |-- person_detection_results
        |   |-- COCO_val2017_detections_AP_H_56_person.json
        `-- images
            |-- train2017
            |   |-- 000000000009.jpg
            |   |-- 000000000025.jpg
            |   |-- 000000000030.jpg
            |   |-- ...
            `-- val2017
                |-- 000000000139.jpg
                |-- 000000000285.jpg
                |-- 000000000632.jpg
                |-- ...
In other words, the keypoint labels are stored in the JSON files. We can take one sample and analyze its JSON data.
The image entry in the JSON contains the file name, width, height, id, and other information:
{
"license": 4,
"file_name": "000000252219.jpg",
"coco_url": "http://images.cocodataset.org/val2017/000000252219.jpg",
"height": 428,"width": 640,
"date_captured": "2013-11-14 22:32:02",
"flickr_url": "http://farm4.staticflickr.com/3446/3232237447_13d84bd0a1_z.jpg",
"id": 252219
}
The image is shown below:
Its manual annotation is as follows:
{
"num_keypoints": 17,
"area": 8511.1568, "iscrowd": 0,
"keypoints": [356,198,2,358,193,2,351,194,2,364,192,2,346,194,2,375,207,2,341,211,2,388,236,2,336,238,2,392,263,2,
343,242,2,373,271,2,347,272,2,372,316,2,348,318,2,372,353,2,355,354,2],
"image_id": 252219,
"bbox": [326.28,174.56,71.24,197.25],
"category_id": 1,"id": 481918
}
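We can see that in COCO format the keypoint annotation is a flat list of 3×17 values, one triplet [x, y, v] per keypoint, where v is the visibility flag (num_keypoints counts only the labeled keypoints, i.e. those with v > 0):
- v=0: not labeled, in which case x=y=0;
- v=1: labeled but not visible;
- v=2: labeled and visible.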
{
"num_keypoints": 15,
"area": 8349.28485,"iscrowd": 0,
"keypoints": [100,190,2,0,0,0,96,185,2,0,0,0,86,188,2,84,208,2,71,208,2,84,245,2,59,240,2,115,263,2,66,271,2,
64,268,2,71,264,2,59,324,2,99,322,2,18,363,2,101,377,2],
"image_id": 252219,
"bbox": [9.79,167.06,121.94,226.45],
"category_id": 1,
"id": 489768
}
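The visibility rules above can be checked against this second annotation. The sketch below splits the flat list into triplets and counts the labeled points; the helper name `split_keypoints` is made up for illustration.

```python
def split_keypoints(flat):
    """Turn COCO's flat [x1, y1, v1, x2, y2, v2, ...] list into (x, y, v) triplets."""
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# "keypoints" field of annotation id 489768 shown above.
keypoints = [100, 190, 2, 0, 0, 0, 96, 185, 2, 0, 0, 0, 86, 188, 2, 84, 208, 2,
             71, 208, 2, 84, 245, 2, 59, 240, 2, 115, 263, 2, 66, 271, 2,
             64, 268, 2, 71, 264, 2, 59, 324, 2, 99, 322, 2, 18, 363, 2, 101, 377, 2]

triplets = split_keypoints(keypoints)
labeled = [t for t in triplets if t[2] > 0]      # v = 1 or v = 2
unlabeled = [t for t in triplets if t[2] == 0]   # stored as (0, 0, 0)
print(len(triplets), len(labeled), len(unlabeled))
```

The list always holds 17 triplets; the two unlabeled points account for the difference between 17 and the annotation's "num_keypoints": 15.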
The bounding box follows the **"xywh"** format: top-left corner coordinates plus width and height.
YOLO format
${POSE_ROOT}
|-- data
`-- |-- coco
    `-- |-- annotations
        |   |-- person_keypoints_train2017.json
        |   `-- person_keypoints_val2017.json
        |-- person_detection_results
        |   |-- COCO_val2017_detections_AP_H_56_person.json
        |-- images
        |   |-- train2017
        |   |   |-- 000000000009.jpg
        |   |   |-- 000000000025.jpg
        |   |   |-- ...
        |   `-- val2017
        |       |-- 000000000139.jpg
        |       |-- 000000000285.jpg
        |       |-- ...
        |-- labels
        |   |-- train2017
        |   |   |-- 000000000009.txt
        |   |   |-- 000000000025.txt  # keypoint annotations for this image, in YOLO format
        |   |   |-- ...
        |   `-- val2017
        |       |-- 000000000139.txt
        |       |-- 000000000285.txt  # keypoint annotations for this image, in YOLO format
        |       |-- ...
        |-- train2017.txt  # contents: relative path + image file name
        `-- val2017.txt    # contents: relative path + image file name
The YOLO-format labels for "image_id": 252219 are listed here (one row per person, wrapped for display):
0 0.565469 0.638283 0.111312 0.460864 0.556250 0.462617 2.000000 0.559375 0.450935 2.000000 0.548438 0.453271 2.000000 0.568750
0.448598 2.000000 0.540625 0.453271 2.000000 0.585938 0.483645 2.000000 0.532813 0.492991 2.000000 0.606250 0.551402 2.000000
0.525000 0.556075 2.000000 0.612500 0.614486 2.000000 0.535937 0.565421 2.000000 0.582812 0.633178 2.000000 0.542188 0.635514
2.000000 0.581250 0.738318 2.000000 0.543750 0.742991 2.000000 0.581250 0.824766 2.000000 0.554688 0.827103 2.000000
0 0.110562 0.654871 0.190531 0.529089 0.156250 0.443925 2.000000 0.000000 0.000000 0.000000 0.150000 0.432243 2.000000 0.000000
0.000000 0.000000 0.134375 0.439252 2.000000 0.131250 0.485981 2.000000 0.110937 0.485981 2.000000 0.131250 0.572430 2.000000
0.092188 0.560748 2.000000 0.179688 0.614486 2.000000 0.103125 0.633178 2.000000 0.100000 0.626168 2.000000 0.110937 0.616822
2.000000 0.092188 0.757009 2.000000 0.154688 0.752336 2.000000 0.028125 0.848131 2.000000 0.157812 0.880841 2.000000
0 0.894172 0.652220 0.193219 0.504112 0.837500 0.448598 1.000000 0.840625 0.439252 2.000000 0.000000 0.000000 0.000000 0.862500
0.443925 2.000000 0.000000 0.000000 0.000000 0.887500 0.483645 2.000000 0.867188 0.485981 2.000000 0.873437 0.567757 2.000000
0.865625 0.574766 2.000000 0.846875 0.630841 2.000000 0.859375 0.647196 2.000000 0.895312 0.640187 2.000000 0.873437 0.640187
2.000000 0.920312 0.754673 2.000000 0.845313 0.752336 2.000000 0.964063 0.852804 2.000000 0.828125 0.843458 2.000000
The format conversion follows the JSON2YOLO project; its algorithm is as follows:
img = images['%g' % x['image_id']]
h, w, f = img['height'], img['width'], img['file_name']
# The COCO box format is [top left x, top left y, width, height]
box = np.array(x['bbox'], dtype=np.float64)
box[:2] += box[2:] / 2 # xy top-left corner to center
box[[0, 2]] /= w # normalize x
box[[1, 3]] /= h # normalize y
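This shows that the YOLO format is center-based and normalized: the COCO box, XYWH with XY the top-left corner, must be converted to $C_xC_yWH$ (note that at this point all values are normalized by the image width and height). Let us check whether the original COCO annotation above yields the YOLO format: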
"height": 428,"width": 640,
"num_keypoints": 17,
"area": 8511.1568, "iscrowd": 0,
"keypoints": [356,198,2,358,193,2,351,194,2,364,192,2,346,194,2,375,207,2,341,211,2,388,236,2,336,238,2,392,263,2,
343,242,2,373,271,2,347,272,2,372,316,2,348,318,2,372,353,2,355,354,2],
"image_id": 252219,
"bbox": [326.28,174.56,71.24,197.25],
Using the algorithm above, we can roughly estimate:
bbox: (326+71/2)/640=0.5648, (174+197/2)/428=0.6367, 71/640=0.1109, 197/428=0.460
keypoints[0]: 356/640=0.5562, 198/428=0.4626
Up to the rounding of the inputs, this agrees with the converted YOLO-format result:
0 0.565469 0.638283 0.111312 0.460864 0.556250 0.462617 2.000000
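The same check can be done without rounding the inputs. The sketch below mirrors the box arithmetic shown earlier and assumes that keypoints are normalized by the image width and height in the same way, which is consistent with the label row above; `coco_to_yolo_row` is a hypothetical helper, not the JSON2YOLO function itself, and only the first two keypoints are passed to keep the example short.

```python
def coco_to_yolo_row(bbox, keypoints, width, height, cls=0):
    """Build one YOLO keypoint label row from a COCO bbox and flat keypoint list."""
    x, y, w, h = bbox  # COCO: top-left x, top-left y, width, height
    row = [cls, (x + w / 2) / width, (y + h / 2) / height, w / width, h / height]
    for i in range(0, len(keypoints), 3):
        kx, ky, v = keypoints[i:i + 3]
        row += [kx / width, ky / height, float(v)]  # visibility is kept unscaled
    return row


# Values from annotation id 481918 (image 640 x 428).
row = coco_to_yolo_row([326.28, 174.56, 71.24, 197.25],
                       [356, 198, 2, 358, 193, 2], width=640, height=428)
print(" ".join(str(v) if i == 0 else f"{v:.6f}" for i, v in enumerate(row)))
```

The first eight values can be compared directly with the start of the label row for this person.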
Converting 300-W to YOLO format
300-W is a popular face database annotated with 68 facial keypoints; its faces come from several datasets such as afw and ibug. Its file layout is as follows:
-- data
   |-- data_300W
       |-- afw
       |-- helen
       |-- ibug
       |-- lfpw

|-- data
`-- |-- data_300W
    `-- |-- annotations
        |-- afw
        |-- helen
        |-- ibug
        |-- lfpw
        |-- images
        |   |-- train2017
        |   |   |-- 000000000009.jpg
        |   |   |-- 000000000025.jpg
        |   |   |-- ...
        |   `-- val2017
        |       |-- 000000000139.jpg
        |       |-- 000000000285.jpg
        |       |-- ...
        |-- labels
        |   |-- train2017
        |   |   |-- 000000000009.txt
        |   |   |-- 000000000025.txt  # keypoint annotations for this image, in YOLO format
        |   |   |-- ...
        |   `-- val2017
        |       |-- 000000000139.txt
        |       |-- 000000000285.txt  # keypoint annotations for this image, in YOLO format
        |       |-- ...
        |-- train2017.txt  # contents: relative path + image file name
        `-- val2017.txt    # contents: relative path + image file name
300-W format
Take data_300W/afw/1051618982_1.jpg as an example.
The 68 facial landmarks for this image are stored in a *.pts file, which looks like this when opened:
version: 1
n_points: 68
{
482.866335 268.009351
484.241455 298.524244
487.963820 329.985842
491.613829 359.446370
503.992490 387.443021
523.666182 409.551102
543.708366 429.090358
566.283098 442.751692
……
591.348649 385.406662
580.068281 384.385348
563.609110 379.281936
552.917511 366.852392
580.508062 371.198816
592.309498 371.492218
604.011866 371.855814
634.952400 369.536292
604.011866 371.855814
592.309498 371.492218
580.508062 371.198816
}
There are 68 coordinate pairs $(x_i, y_i)$ in total; for brevity, some pairs in the middle are omitted. The coco2yolo format is laid out as follows:
0   xywh   (x, y)
|    |       |
|    |       ` - - keypoint coordinates, normalized by the image width and height
|    ` - - normalized bounding box: center coordinates xy and box width and height wh
` - - class index: 0 for the single class used here.
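The landmark file layout shown earlier (a short header, then coordinate pairs between braces) can be read with a few lines of Python. This is a minimal sketch under that assumption; `parse_pts` is a hypothetical helper, and the sample text is a three-point excerpt written in the same layout rather than a real 300-W file.

```python
def parse_pts(text):
    """Parse landmark text: 'key: value' header lines, then 'x y' pairs inside { }."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    start, end = lines.index("{"), lines.index("}")
    header = dict(ln.split(":", 1) for ln in lines[:start] if ":" in ln)
    points = [tuple(float(v) for v in ln.split()) for ln in lines[start + 1:end]]
    if len(points) != int(header["n_points"]):
        raise ValueError("n_points does not match the number of coordinate pairs")
    return points


sample = """version: 1
n_points: 3
{
482.866335 268.009351
484.241455 298.524244
487.963820 329.985842
}
"""
points = parse_pts(sample)
print(points)
```

For a real file, the text would be read from disk first and `n_points` would be 68.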
Converting the 300-W format to YOLO format
In other words, the 68 facial keypoints above must be converted into the coco2yolo format. Here we follow PIPNet's preprocessing script and convert the 300W folders entirely into a COCO-like file layout, including the target file format. This is done to keep code changes in YOLO to a minimum.
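One way to produce such a label row from a list of landmarks is sketched below. It is not the PIPNet preprocessing itself: the helper `landmarks_to_yolo_row` is hypothetical, the bounding box is assumed to be the tight extent of the landmarks (clipped to the image), and every landmark is assumed to be labeled and visible (v = 2).

```python
def landmarks_to_yolo_row(points, width, height, cls=0, visibility=2.0):
    """Build 'cls cx cy w h (x y v)*' from pixel-space landmarks.

    Assumption: the box is the landmark extent, clipped to the image.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, x1 = max(min(xs), 0.0), min(max(xs), float(width))
    y0, y1 = max(min(ys), 0.0), min(max(ys), float(height))
    row = [cls,
           (x0 + x1) / 2 / width, (y0 + y1) / 2 / height,  # normalized box center
           (x1 - x0) / width, (y1 - y0) / height]          # normalized box size
    for x, y in points:
        row += [x / width, y / height, visibility]
    return row


# Toy input: three landmarks in a 400 x 500 image.
row = landmarks_to_yolo_row([(100.0, 50.0), (300.0, 250.0), (200.0, 150.0)], 400, 500)
print(" ".join(str(v) if i == 0 else f"{v:.6f}" for i, v in enumerate(row)))
```

In practice a margin around the landmark extent may be preferable, since a tight box cuts off the forehead; whether the visibility column is kept depends on how the label reader in the dataset code is modified.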
At this point, the format conversion is complete.
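Project modifications (pitfall log)
Quite a few changes to YOLO are involved, mainly in the following areas:
- dataset loading;
- detection head modification.
The concrete changes are:
- edit the relevant settings in the launch file;
- edit the data path in data/coco_kepts.yaml;
- edit the config files under models/hub/cfg, e.g. in yolo5s6_kpts.yaml change nkpt from 17 to 68;
- edit line 497 of dataset, which controls how the txt labels are read;
- edit line 987 of dataset, which controls how the data is transformed;
- edit line 365 of dataset, which controls how the data is flipped;
- edit lines 187 and 202 of the loss function, concerning loss_gain;
- line 119 of the loss function: the sigmas are hard-coded, so simply set them all to 1;
- lines 76 and 84 of the plots function: plotting issues, not solved yet, so plotting is skipped for now;
- edit line 90 of the yolo function, concerning self.inplace.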
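For the flip-related change, the 17-point `flip_index` has to be replaced by a 68-point one. The sketch below builds such a permutation; it assumes the common 68-landmark ordering used by 300-W (jaw 0-16, eyebrows 17-26, nose 27-35, eyes 36-47, mouth 48-67) and is not taken from the project's code, so the pairs should be verified against the actual annotation order before use.

```python
def build_flip_index_68():
    """Left/right swap permutation for the assumed 68-landmark ordering."""
    idx = list(range(68))

    def swap(a, b):
        idx[a], idx[b] = b, a

    for i in range(8):                       # jaw line: 0<->16, 1<->15, ..., 7<->9
        swap(i, 16 - i)
    for i in range(5):                       # eyebrows: 17<->26, ..., 21<->22
        swap(17 + i, 26 - i)
    for a, b in [(31, 35), (32, 34)]:        # lower nose
        swap(a, b)
    for a, b in [(36, 45), (37, 44), (38, 43), (39, 42), (40, 47), (41, 46)]:  # eyes
        swap(a, b)
    for a, b in [(48, 54), (49, 53), (50, 52), (59, 55), (58, 56)]:            # outer lips
        swap(a, b)
    for a, b in [(60, 64), (61, 63), (67, 65)]:                                # inner lips
        swap(a, b)
    return idx


flip_index_68 = build_flip_index_68()
print(flip_index_68)
```

A valid flip index is a permutation that is its own inverse, which is a cheap sanity check to run before training.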
Training log
autoanchor: Analyzing anchors... anchors/target = 7.86, Best Possible Recall (BPR) = 1.0000
Image sizes 640 train, 640 test
Using 4 dataloader workers
Logging results to runs/train/exp10
Starting training for 300 epochs...
Epoch gpu_mem box obj cls kpt kptv total labels img_size
0/299 4.22G 0.07731 0.0573 0 0.3465 0.01299 0.4941 10 640: 100%| 787/787 [02:58<00:00, 4.41it/s]
 Class Images Labels P R mAP@.5 mAP@.5:.95: 100%| 87/87 [00:14<00:00, 6.05it/s]
all 689 689 0.0073 0.691 0.00784 0.00137
……
One epoch takes about 3 minutes, with 300 epochs in total; looking forward to the results!
To be continued
The project has finally passed debugging! More details will be released gradually.