Baidu PaddlePaddle BMN temporal action localization framework | Data preparation and training guide (Part 1)
2022-07-07 01:39:00 【Xinxu】
1. Introduction
The BMN model was developed by Baidu and was the winning solution of ActivityNet 2019. It provides an efficient way to generate proposals for the video action localization problem.
In short, temporal action localization takes a video and determines which action occurs from second xxx to second xxx. Unlike action recognition, it must also infer the start time and end time of each action. Two metrics matter most: (1) classification accuracy and (2) IoU with the ground truth (GT).
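To make the second metric concrete, here is a minimal sketch of temporal IoU between a predicted segment and a ground-truth segment. The function name and the (start, end) tuple layout are assumptions for illustration, not part of the PaddleVideo code.

```python
def temporal_iou(pred, gt):
    """Return the IoU of two time segments given as (start, end) in seconds."""
    pred_start, pred_end = pred
    gt_start, gt_end = gt
    # Overlap is zero when the segments do not intersect.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0


# 5 s of overlap over a 15 s union gives one third.
print(temporal_iou((10.0, 20.0), (15.0, 25.0)))
```

A prediction usually counts as correct only when this value exceeds a chosen threshold and the predicted class matches.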
Project address:
The algorithm has three main stages:
(1) Video understanding
PP-TSM for image features; VGGish for audio features
(2) Temporal proposal generation
BMN
(3) Action classification and localization
AttentionLSTM
Each stage includes data preparation, training, validation, and export of the inference model.
To prepare the environment, install the packages listed in requirements.txt; this generally works without problems. It is best to install the latest version of paddlepaddle-gpu.
2. PP-TSM
The dataset is FootballAction, PaddlePaddle's open-source football action dataset.
The dataset is built from match videos of four tournaments: EuroCup2012, EuroCup2016, WorldCup2014, and WorldCup2018. It contains 272 items in the training set and 25 in the test set, and supports localization and recognition of 15 football highlight actions: shot, goal, goal with cheering, corner kick, free kick, yellow card, red card, penalty kick, substitution, throw-in, goal kick, kick-off, offside flag, replay of aerial duel, and replay of goal.
PaddlePaddle has not open-sourced all of this data; the project releases 49 videos in total.
(1) Download the dataset
Download the data with the provided bash script, located at PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016/download_dataset.sh. Give the file execute permission and run it directly. When the download completes, the folder PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016/mp4 will contain 49 MP4 videos, 78.1 GB in total. The annotation files are already included in the project:
datasets/EuroCup2016/label.json Class label list
datasets/EuroCup2016/label_cls8_train.json Training data labels
datasets/EuroCup2016/label_cls8_val.json Validation data labels
datasets/EuroCup2016/url.list Training data file list
datasets/EuroCup2016/url_val.list Validation data file list
(2) Prepare the data
The first stage needs training data for PP-TSM. ffmpeg must be installed beforehand (sudo apt install ffmpeg). Then run the following commands:
cd PaddleVideo-develop/applications/FootballAction/datasets/script
python get_frames_pcm.py
This step samples the original video files: images at 5 frames per second and audio at 16000 Hz. Processing takes a long time. Afterwards, two new folders are generated:
|-- datasets               # Training datasets and processing scripts
    |-- EuroCup2016        # Dataset
        |-- mp4            # Original videos (.mp4)
        |-- frames         # Image frames (new)
        |-- pcm            # Audio pcm (new)
        |-- url.list       # Video list
        |-- label.json     # Original ground-truth annotations of the videos
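The sampling described above can be sketched as two ffmpeg invocations per video. This is only an illustration under stated assumptions, not the contents of get_frames_pcm.py: the function name, the frame naming pattern, and the choice of mono 16-bit PCM are hypothetical, while the 5 fps and 16000 Hz values come from the text.

```python
def build_ffmpeg_commands(video_path, frames_dir, pcm_path, fps=5, sample_rate=16000):
    """Build ffmpeg argument lists for frame and audio extraction (not executed here)."""
    frame_cmd = [
        "ffmpeg", "-i", video_path,
        "-r", str(fps),              # output frame rate: 5 frames per second
        f"{frames_dir}/%08d.jpg",    # hypothetical frame naming pattern
    ]
    audio_cmd = [
        "ffmpeg", "-i", video_path,
        "-vn",                       # drop the video stream
        "-acodec", "pcm_s16le",      # raw 16-bit little-endian PCM (assumption)
        "-f", "s16le",
        "-ac", "1",                  # mono (assumption)
        "-ar", str(sample_rate),     # audio sampling rate: 16000 Hz
        pcm_path,
    ]
    return frame_cmd, audio_cmd


frame_cmd, audio_cmd = build_ffmpeg_commands("mp4/example.mp4", "frames/example", "pcm/example.pcm")
print(" ".join(frame_cmd))
print(" ".join(audio_cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) once the output folders exist.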
(3) Process the samples
Convert the sampled data above into a training dataset for PP-TSM:
cd PaddleVideo-develop/applications/FootballAction/datasets/script
python get_instance_for_pptsm.py
This step builds samples from the annotations. Each action interval is a positive sample, and all frames in the interval are written to one pkl file. Non-action intervals are negative samples: N intervals are sampled at random, producing N pkl files.
After this step:
|-- datasets                   # Training datasets and processing scripts
    |-- EuroCup2016            # Dataset
        |-- input_for_pptsm    # PP-TSM training data (new)
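The positive and negative sampling idea can be sketched as follows. This is a simplified illustration, not the logic of get_instance_for_pptsm.py: the function names, the fixed negative interval length, and the rejection-sampling strategy are all assumptions.

```python
import random


def action_frame_range(start_sec, end_sec, fps=5):
    """Map an annotated interval in seconds to frame indices [first, last)."""
    return int(start_sec * fps), int(end_sec * fps)


def sample_negative_intervals(actions, video_len_sec, n, length_sec, seed=0):
    """Randomly pick up to n intervals of length_sec that overlap no action."""
    if video_len_sec <= length_sec:
        return []
    rng = random.Random(seed)
    negatives = []
    attempts = 0
    # Rejection sampling: draw a start time, keep it only if it avoids every action.
    while len(negatives) < n and attempts < 1000 * n:
        attempts += 1
        start = rng.uniform(0, video_len_sec - length_sec)
        end = start + length_sec
        if all(end <= a_start or start >= a_end for a_start, a_end in actions):
            negatives.append((start, end))
    return negatives


actions = [(10, 20), (40, 50)]                      # annotated action intervals, in seconds
print([action_frame_range(s, e) for s, e in actions])
print(sample_negative_intervals(actions, video_len_sec=100, n=5, length_sec=4))
```

In this sketch the negative intervals may overlap each other; they are only guaranteed not to overlap an annotated action.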
(4) Train PP-TSM
First, download the pretrained weights:
cd PaddleVideo-develop/applications/FootballAction
wget https://videotag.bj.bcebos.com/PaddleVideo/PretrainModel/ResNet50_vd_ssld_v2_pretrained.pdparams
mkdir pretrain
mv ResNet50_vd_ssld_v2_pretrained.pdparams pretrain/ResNet50_vd_ssld_v2_pretrained.pdparams
Open the training configuration file:
PaddleVideo-develop/applications/FootballAction/train_proposal/configs/pptsm_football_v2.0.yaml
Line 5: the location of the pretrained model just downloaded. Use the absolute path.
Lines 17 and 18: the batch size. I use a 2080Ti-8G, which can only handle 4/4.
Line 19: change it to 1.
Line 23: locate PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016/input_for_pptsm/train.list and write its absolute path.
Line 28: locate PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016/input_for_pptsm/val.list and write its absolute path. This is the index file generated in step (3).
Line 33: same as line 28; write the same path.
For a single GPU, start training with the following command:
python -B -m paddle.distributed.launch --gpus="0" --log_dir=./football/logs_pptsm main.py --validate -c applications/FootballAction/train_proposal/configs/pptsm_football_v2.0.yaml -o output_dir=./football/pptsm
Training takes roughly 3 days and 3 nights. Next, switch the code to inference mode.
Before switching to inference mode, modify the file
PaddleVideo/paddlevideo/modeling/framework/recognizers/recognizer2d.py
and update the __init__ and infer_step functions to the following code:
    def __init__(self, backbone=None, head=None):
        super().__init__(backbone=backbone, head=head)
        self.avgpool2d = paddle.nn.AdaptiveAvgPool2D((1, 1), data_format='NCHW')

    def infer_step(self, data_batch):
        """Define how the model is going to test, from input to output."""
        imgs = data_batch[0]
        imgs = paddle.reshape_(imgs, [-1] + list(imgs.shape[2:]))
        feature = self.backbone(imgs)
        feat = self.avgpool2d(feature)
        return feat
Run the following from the PaddleVideo root directory:
python tools/export_model.py -c applications/FootballAction/train_proposal/configs/pptsm_football_v2.0.yaml \
-p ./football/pptsm/ppTSM_best.pdparams \
-o ./football/inference_model
This exports the inference model.
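The reshape in infer_step above merges the first two axes of the input so that every frame passes through the backbone as an independent image. The sketch below only reproduces the shape arithmetic in plain Python; the example layout [batch, segments, channels, height, width] is an assumption about the input, and the helper name is hypothetical.

```python
def flattened_shape(shape):
    """Shape produced by reshaping with [-1] + shape[2:], i.e. merging the first two axes."""
    merged = shape[0] * shape[1]
    return [merged] + list(shape[2:])


# 2 clips of 8 frames each become 16 independent images.
print(flattened_shape([2, 8, 3, 224, 224]))  # [16, 3, 224, 224]
```

The adaptive average pooling that follows reduces each feature map to 1x1, so the exported model returns one feature vector per frame instead of class scores.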
(5) Configure PP-TSM
In the file
PaddleVideo/applications/FootballAction/predict/action_detect/models/pptsm_infer.py
replace line 41
self.output_tensor = self.predictor.get_output_handle(output_names[1])
with
self.output_tensor = self.predictor.get_output_handle(output_names[0])
Next, extract the image and audio features. Because the features are extracted with the weights we just trained, the configuration file needs to be modified.
In PaddleVideo-develop/applications/FootballAction/extractor/configs/configs.yaml:
Line 4: set the path of index_label_football_8.json to the absolute path of PaddleVideo-develop/applications/FootballAction/extractor/configs/index_label_football_8.json
Line 13: change the default model file path to the absolute path of PaddleVideo-develop/football/inference_model/ppTSM.pdmodel
Line 14: change the default parameter file path to the absolute path of PaddleVideo-develop/football/inference_model/ppTSM.pdiparams
Line 29: change the audio model file path to the absolute path of PaddleVideo-develop/applications/FootballAction/checkpoints/AUDIO/__model__
Line 30: change the audio model parameter file path to the absolute path of PaddleVideo-develop/applications/FootballAction/checkpoints/AUDIO/__param__
Line 38: change the BMN model file path to the absolute path of PaddleVideo-develop/applications/FootballAction/checkpoints/BMN/__model__
Line 39: change the BMN model parameter file path to the absolute path of PaddleVideo-develop/applications/FootballAction/checkpoints/BMN/__param__
Line 51: change the LSTM model file path to the absolute path of PaddleVideo-develop/applications/FootballAction/checkpoints/LSTM/__model__
Line 52: change the LSTM model parameter file path to the absolute path of PaddleVideo-develop/applications/FootballAction/checkpoints/LSTM/__param__
Then open PaddleVideo-develop/applications/FootballAction/extractor/extract_feat.py
Line 83: change the path to the EuroCup2016 folder: PaddleVideo-develop/applications/FootballAction/datasets/EuroCup2016
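With this many paths to edit by hand, a quick existence check before running the extractor can catch typos early. The sketch below is a hypothetical helper, not part of PaddleVideo; the relative paths are the ones named in the configuration steps above, and the repository root passed in is an assumption about where the code was unpacked.

```python
from pathlib import Path

# Relative paths taken from the configuration steps above.
REQUIRED_FILES = [
    "applications/FootballAction/extractor/configs/index_label_football_8.json",
    "football/inference_model/ppTSM.pdmodel",
    "football/inference_model/ppTSM.pdiparams",
    "applications/FootballAction/checkpoints/AUDIO/__model__",
    "applications/FootballAction/checkpoints/AUDIO/__param__",
    "applications/FootballAction/checkpoints/BMN/__model__",
    "applications/FootballAction/checkpoints/BMN/__param__",
    "applications/FootballAction/checkpoints/LSTM/__model__",
    "applications/FootballAction/checkpoints/LSTM/__param__",
]


def missing_files(repo_root, relative_paths=REQUIRED_FILES):
    """Return absolute paths of the required files that do not exist under repo_root."""
    root = Path(repo_root).resolve()
    return [str(root / rel) for rel in relative_paths if not (root / rel).is_file()]


# "PaddleVideo-develop" is assumed to be the unpacked repository root.
for path in missing_files("PaddleVideo-develop"):
    print("missing:", path)
```

The absolute paths it prints are in the same form that configs.yaml expects, so they can also be copied into the configuration directly.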
After the above configuration, go to the PaddleVideo-develop/applications/FootballAction directory and run:
python extract_feat.py
After this step, the data is stored as follows:
|-- datasets               # Training datasets and processing scripts
    |-- EuroCup2016        # Dataset
        |-- features       # Video image + audio features
Next, use the extracted features to train BMN.