Baidu PaddlePaddle BMN temporal action localization framework | Data preparation and training guide (Part 1)
2022-07-07 01:39:00 【Xinxu】
1. Introduction
The BMN model was developed by Baidu and was the winning solution at ActivityNet 2019. It provides an efficient way to generate proposals for temporal action localization in video.
In short, temporal action localization takes a video and determines which action occurs from second xxx to second xxx. Unlike action recognition, it also has to infer the start and end time of each action. Two metrics matter most: (1) classification accuracy and (2) IoU with the ground truth (GT).
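The IoU metric mentioned above can be illustrated for one-dimensional time intervals. This is a minimal sketch, assuming intervals are given as (start, end) pairs in seconds; the function name temporal_iou is hypothetical and not part of PaddleVideo.

```python
def temporal_iou(pred, gt):
    """Return the IoU of two (start, end) intervals given in seconds."""
    pred_start, pred_end = pred
    gt_start, gt_end = gt
    # Overlap is zero when the intervals do not intersect.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0


# A prediction of 10-20 s compared with a ground-truth action at 15-25 s.
print(temporal_iou((10.0, 20.0), (15.0, 25.0)))
```

A prediction counts as better localized the closer this value is to 1.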
Project address:
The algorithm has three main stages:
(1) Video understanding
PP-TSM for image features, VGGish for audio features
(2) Temporal proposal generation
BMN
(3) Action classification and localization
AttentionLSTM
Each stage includes data preparation, training, validation, and export of the inference model.
The environment is set up mainly by installing what is listed in requirements.txt, which generally works without problems. For paddlepaddle-gpu, it is best to install the latest version.
2. PP-TSM
The dataset is FootballAction, PaddlePaddle's open-source football action dataset.
It is built from match videos of four tournaments: EuroCup2012, EuroCup2016, WorldCup2014, and WorldCup2018. It contains 272 items for training and 25 for testing, and supports localization and recognition of 15 football highlight actions. The action categories are: shot, goal, goal with celebration, corner kick, free kick, yellow card, red card, penalty kick, substitution, throw-in, goal kick, kick-off, offside flag, replay of an aerial duel, and replay of a goal.
PaddlePaddle has not open-sourced all of this data in the project; only 49 videos are available.
(1) Download the dataset
The dataset is downloaded with a bash script located at PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016/download_dataset.sh. After making the file executable, you can run it directly. When the download completes, the folder PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016/mp4 contains 49 MP4 videos, 78.1 GB in total. The annotation data is provided directly in the project files:
datasets/EuroCup2016/label.json List of class labels
datasets/EuroCup2016/label_cls8_train.json Training data labels
datasets/EuroCup2016/label_cls8_val.json Validation data labels
datasets/EuroCup2016/url.list File list for the training data
datasets/EuroCup2016/url_val.list File list for the validation data
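Before moving on, it can be worth confirming that the download finished. The following is a minimal sketch, assuming the mp4 folder path given above relative to the current directory; the helper summarize_videos is hypothetical and not part of the project.

```python
from pathlib import Path


def summarize_videos(mp4_dir):
    """Return (number of .mp4 files, total size in GB) for a directory."""
    files = sorted(Path(mp4_dir).glob("*.mp4"))
    total_bytes = sum(f.stat().st_size for f in files)
    return len(files), total_bytes / 1024 ** 3


if __name__ == "__main__":
    mp4_dir = Path(
        "PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016/mp4"
    )
    # Only report when the folder exists, so the sketch is safe to run anywhere.
    if mp4_dir.is_dir():
        count, size_gb = summarize_videos(mp4_dir)
        print(f"{count} videos, {size_gb:.1f} GB")
```

If the numbers are far from the 49 videos and 78.1 GB reported above, the download was probably interrupted.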
(2) Prepare the data
The first stage requires the PP-TSM training data. This step depends on ffmpeg, which can be installed with sudo apt install ffmpeg.
Then run the following commands:
cd PaddleVideo-develop/applications/FootballAction/datasets/script
python get_frames_pcm.py
This step samples the original video files: images at 5 frames per second and audio at 16000 Hz. Processing takes a long time. When it finishes, two new folders are generated:
|-- datasets                # training datasets and processing scripts
    |-- EuroCup2016         # dataset
        |-- mp4             # original .mp4 videos
        |-- frames          # image frames (new)
        |-- pcm             # audio pcm (new)
        |-- url.list        # video list
        |-- label.json      # ground-truth annotations of the original videos
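The sampling described above can be sketched as two ffmpeg invocations. This is an illustration under stated assumptions, not the contents of get_frames_pcm.py: the function name, the example file names, the %08d.jpg frame naming, and the mono 16-bit PCM format are all assumptions. The code only builds the argument lists and does not run ffmpeg.

```python
import shlex


def build_ffmpeg_commands(video_path, frames_dir, pcm_path, fps=5, sample_rate=16000):
    """Build (frame_cmd, audio_cmd) argument lists; nothing is executed here."""
    # -r sets the output frame rate, so 5 images are written per second of video.
    frame_cmd = ["ffmpeg", "-i", video_path, "-r", str(fps), f"{frames_dir}/%08d.jpg"]
    # -ar sets the audio sample rate; s16le writes raw 16-bit PCM; -ac 1 is mono (assumed).
    audio_cmd = [
        "ffmpeg", "-y", "-i", video_path,
        "-f", "s16le", "-acodec", "pcm_s16le",
        "-ac", "1", "-ar", str(sample_rate), pcm_path,
    ]
    return frame_cmd, audio_cmd


frame_cmd, audio_cmd = build_ffmpeg_commands(
    "mp4/example.mp4", "frames/example", "pcm/example.pcm"
)
print(shlex.join(frame_cmd))
print(shlex.join(audio_cmd))
```

The printed commands can be passed to subprocess.run once the paths point at real files.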
(3) Process the samples
Convert the sampled data above into the PP-TSM training dataset:
cd PaddleVideo-develop/applications/FootballAction/datasets/script
python get_instance_for_pptsm.py
Based on the annotations, this step takes each action interval as a positive sample and stores all frames of the interval in one pkl file. Non-action intervals serve as negative samples: N intervals are sampled at random, producing N pkl files.
After this step:
|-- datasets                    # training datasets and processing scripts
    |-- EuroCup2016             # dataset
        |-- input_for_pptsm     # PP-TSM training data (new)
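The positive/negative split described above can be sketched as follows. This is a hypothetical illustration of the idea, not the logic of get_instance_for_pptsm.py: it assumes integer-second intervals, a fixed negative length, and the made-up function name sample_negative_intervals.

```python
import random


def sample_negative_intervals(actions, duration, n, length, seed=0):
    """Sample n intervals of `length` seconds that overlap no action.

    actions: list of (start, end) action intervals in integer seconds.
    duration: total video length in seconds.
    Returns an empty list when no gap is long enough.
    """
    # Collect the gaps between consecutive action intervals.
    gaps, cursor = [], 0
    for start, end in sorted(actions):
        if start - cursor >= length:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if duration - cursor >= length:
        gaps.append((cursor, duration))

    rng = random.Random(seed)
    negatives = []
    for _ in range(n if gaps else 0):
        gap_start, gap_end = rng.choice(gaps)
        start = rng.randint(gap_start, gap_end - length)
        negatives.append((start, start + length))
    return negatives


# Two annotated actions in a 100-second video; draw 5 negatives of 5 seconds each.
print(sample_negative_intervals([(10, 20), (40, 55)], 100, 5, 5))
```

Negatives drawn this way never touch an annotated action, although two negatives may overlap each other.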
(4) Train PP-TSM
First, download the pretrained weights:
cd PaddleVideo-develop/applications/FootballAction
wget https://videotag.bj.bcebos.com/PaddleVideo/PretrainModel/ResNet50_vd_ssld_v2_pretrained.pdparams
mkdir pretrain
mv ResNet50_vd_ssld_v2_pretrained.pdparams pretrain/ResNet50_vd_ssld_v2_pretrained.pdparams
Open the training configuration file:
PaddleVideo-develop/applications/FootballAction/train_proposal/configs/pptsm_football_v2.0.yaml
Line 5: set the location of the pretrained model you just downloaded. Use an absolute path.
Lines 17 and 18: the batch size. My GPU is a 2080Ti-8G, so I could only set 4/4.
Line 19: change it to 1.
Line 23: locate PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016/input_for_pptsm/train.list and enter its absolute path.
Line 28: locate PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016/input_for_pptsm/val.list and enter its absolute path. This is the index file generated in step (3) above.
Line 33: same as line 28; enter the same path.
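Since the configuration expects absolute paths, a small helper can print them for copying into the YAML file. This is a minimal sketch, assuming the repository was cloned as PaddleVideo-develop in the current directory; the function name and dictionary keys are hypothetical.

```python
from pathlib import Path


def absolute_config_paths(repo_root):
    """Return the absolute paths to paste into the config, keyed by purpose."""
    app = Path(repo_root).resolve() / "applications" / "FootballAction"
    data = app / "datasets" / "EuroCup2016" / "input_for_pptsm"
    return {
        "pretrained": app / "pretrain" / "ResNet50_vd_ssld_v2_pretrained.pdparams",
        "train_list": data / "train.list",
        "val_list": data / "val.list",
    }


for name, path in absolute_config_paths("PaddleVideo-develop").items():
    # Flag files that are not there yet, e.g. if step (3) has not been run.
    status = "found" if path.exists() else "missing"
    print(f"{name}: {path} ({status})")
```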
On a single GPU, start training with the following command:
python -B -m paddle.distributed.launch --gpus="0" --log_dir=./football/logs_pptsm main.py --validate -c applications/FootballAction/train_proposal/configs/pptsm_football_v2.0.yaml -o output_dir=./football/pptsm
Training takes roughly 3 days and 3 nights. Next, change the code to inference mode.
Before switching to inference mode, modify the file
PaddleVideo/paddlevideo/modeling/framework/recognizers/recognizer2d.py
and update the __init__ and infer_step functions to the following code:
    def __init__(self, backbone=None, head=None):
        super().__init__(backbone=backbone, head=head)
        self.avgpool2d = paddle.nn.AdaptiveAvgPool2D((1, 1), data_format='NCHW')

    def infer_step(self, data_batch):
        """Define how the model is going to test, from input to output."""
        imgs = data_batch[0]
        imgs = paddle.reshape_(imgs, [-1] + list(imgs.shape[2:]))
        feature = self.backbone(imgs)
        feat = self.avgpool2d(feature)
        return feat
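The shape handling in infer_step can be followed with a NumPy analogue. This is only a sketch: the batch layout (clips, frames, channels, height, width), the sizes, and the 2048x7x7 feature map standing in for the backbone output are assumptions for illustration, and NumPy replaces Paddle so the example runs without a GPU.

```python
import numpy as np

# Hypothetical batch: 2 clips, 8 frames each, 3x224x224 images.
imgs = np.zeros((2, 8, 3, 224, 224), dtype=np.float32)

# Same idea as paddle.reshape_(imgs, [-1] + list(imgs.shape[2:])):
# merge the clip and frame axes so the backbone sees a flat batch of frames.
frames = imgs.reshape([-1] + list(imgs.shape[2:]))

# Stand-in for the backbone output (assumed 2048 channels on a 7x7 grid).
feature = np.zeros((frames.shape[0], 2048, 7, 7), dtype=np.float32)

# AdaptiveAvgPool2D((1, 1)) on NCHW data averages over the H and W axes.
feat = feature.mean(axis=(2, 3), keepdims=True)

print(frames.shape, feat.shape)
```

The modified infer_step therefore returns one pooled feature vector per frame instead of class scores, which is what the later feature-extraction stage consumes.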
Then run the following from the PaddleVideo root directory:
python tools/export_model.py -c applications/FootballAction/train_proposal/configs/pptsm_football_v2.0.yaml \
-p ./football/pptsm/ppTSM_best.pdparams \
-o ./football/inference_model
This exports the inference model.
(5) Configure PP-TSM
In the file
PaddleVideo/applications/FootballAction/predict/action_detect/models/pptsm_infer.py
replace line 41
self.output_tensor = self.predictor.get_output_handle(output_names[1])
with
self.output_tensor = self.predictor.get_output_handle(output_names[0])
Next, extract the image and audio features. Because the features are extracted with the weights we just trained, the configuration file has to be modified.
In PaddleVideo-develop/applications/FootballAction/extractor/configs/configs.yaml:
Line 4: set the path of index_label_football_8.json to the absolute path of PaddleVideo-develop/applications/FootballAction/extractor/configs/index_label_football_8.json
Line 13: change the default weight path to the absolute path of PaddleVideo-develop/football/inference_model/ppTSM.pdmodel
Line 14: change the default parameter file to the absolute path of PaddleVideo-develop/football/inference_model/ppTSM.pdiparams
Line 29: change the audio model weight path to the absolute path of PaddleVideo-develop/applications/FootballAction/checkpoints/AUDIO/__model__
Line 30: change the audio model parameter file path to the absolute path of PaddleVideo-develop/applications/FootballAction/checkpoints/AUDIO/__param__
Line 38: change the BMN model weight path to the absolute path of PaddleVideo-develop/applications/FootballAction/checkpoints/BMN/__model__
Line 39: change the BMN model parameter file path to the absolute path of PaddleVideo-develop/applications/FootballAction/checkpoints/BMN/__param__
Line 51: change the LSTM model weight path to the absolute path of PaddleVideo-develop/applications/FootballAction/checkpoints/LSTM/__model__
Line 52: change the LSTM model parameter file path to the absolute path of PaddleVideo-develop/applications/FootballAction/checkpoints/LSTM/__param__
Then open PaddleVideo-develop/applications/FootballAction/extractor/extract_feat.py:
Line 83: change the path to that of the EuroCup2016 folder: PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016
With the configuration done, go to the PaddleVideo-develop/applications/FootballAction directory and run:
python extract_feat.py
After this step, the data is stored as follows:
|-- datasets                # training datasets and processing scripts
    |-- EuroCup2016         # dataset
        |-- features        # video image + audio features
Next, use the extracted features to train BMN.