
Reference frame generation based on deep learning

2022-07-06 20:56:00 Dillon2015

This article is based on proposals JVET-T0058 and JVET-U0087, which generate a virtual reference frame for inter prediction by frame interpolation. The overall model consists of several sub-models that perform optical flow estimation, compensation, and detail enhancement, respectively.

Overall architecture

The overall architecture is shown in Fig. 1. During video coding, the DPB holds reference frames used for motion estimation; depending on the GOP structure, the current frame has one or more forward and backward reference frames. By default, the proposal uses the two reference frames whose POCs are closest to the current frame to generate the virtual reference frame. For example, in Fig. 1 the current frame has POC 5, so the frames with POC 4 and 6 are used to generate it. The generated virtual reference frame is placed into the DPB for reference, and its POC is set equal to that of the current frame. To avoid disturbing the temporal MVP process, which scales MVs according to POC distance, all MVs of the virtual reference frame are set to 0 and it is marked as a long-term reference frame. In the proposal, the virtual reference frame is removed from the DPB once the current frame has been decoded.
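As a rough illustration of this DPB handling, the following Python sketch picks the two nearest-POC references, synthesizes the virtual frame, and inserts it as a long-term reference. The Frame class, the dpb list, and the generate callable are hypothetical names introduced here for illustration; they are not structures from the proposal or from the reference software.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    poc: int                      # picture order count
    is_long_term: bool = False
    data: object = None           # decoded samples (placeholder)


def insert_virtual_reference(dpb, current_poc, generate):
    """Pick the two nearest-POC references, synthesize a virtual frame, add it to the DPB."""
    past = max((f for f in dpb if f.poc < current_poc), key=lambda f: f.poc, default=None)
    future = min((f for f in dpb if f.poc > current_poc), key=lambda f: f.poc, default=None)
    if past is None or future is None:
        return None               # need one forward and one backward reference
    virtual = Frame(
        poc=current_poc,          # same POC as the current frame
        is_long_term=True,        # long-term, so its MVs are not scaled by POC distance
        data=generate(past.data, future.data),
    )
    # All MVs of the virtual frame are treated as zero at prediction time (not modeled here).
    dpb.append(virtual)
    return virtual                # it is dropped from the DPB once the current frame is decoded
```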

For high-resolution sequences (4K or 8K), the whole frame cannot be processed by the neural network at once due to resource constraints. In this case the virtual reference frame is divided into multiple regions, each region is generated separately by the network, and the regions are then stitched together into one reference frame.
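A minimal sketch of this region-wise processing, assuming PyTorch tensors of shape (N, C, H, W) and a model that interpolates two co-located regions; the tile sizes and the model interface are assumptions, and padding/overlap handling is omitted:

```python
import torch


def generate_by_regions(ref0, ref1, model, tile_h=512, tile_w=512):
    """Region-wise generation for high-resolution frames (tile sizes are assumptions)."""
    _, _, h, w = ref0.shape
    out = torch.zeros_like(ref0)
    for y in range(0, h, tile_h):
        for x in range(0, w, tile_w):
            # Co-located regions from both reference frames.
            r0 = ref0[:, :, y:y + tile_h, x:x + tile_w]
            r1 = ref1[:, :, y:y + tile_h, x:x + tile_w]
            # Each region pair is interpolated independently by the network.
            out[:, :, y:y + tile_h, x:x + tile_w] = model(r0, r1)
    return out
```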

Network model


Optical flow estimation and compensation are widely used in general video frame interpolation. Usually a bidirectional optical flow method is used, and the two flows are then combined into one through a linear model. The proposal uses only a single optical flow model.
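For context, one widely used linear model in bidirectional interpolation (the Super SloMo-style approximation, shown here only for comparison and not part of the proposal) estimates the flows from the intermediate time $t$ as

$$\hat{F}_{t\to 0} = -(1-t)\,t\,F_{0\to 1} + t^{2}\,F_{1\to 0}, \qquad \hat{F}_{t\to 1} = (1-t)^{2}\,F_{0\to 1} - t(1-t)\,F_{1\to 0},$$

where $F_{0\to 1}$ and $F_{1\to 0}$ are the flows estimated between the two reference frames and $t$ is the temporal position of the interpolated frame ($t=0.5$ for the mid-point). The proposal avoids this combination step by using a single flow model.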

As shown in Fig. 2, an optical flow is first produced by the optical flow estimation model, whose inputs are the two reference frames closest in POC. The flow is then processed by a backward warping step, and the warped results together with the two reference frames are synthesized into an intermediate frame by a fusion step. The quality of the intermediate frame is then improved by the detail enhancement model, which consists of two parts: a PCD (Pyramid, Cascading and Deformable) module for spatio-temporal optimization and a TSA (Temporal and Spatial Attention) module that applies attention to emphasize important features.
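The following PyTorch sketch outlines such a pipeline under simplifying assumptions: flow_net, fusion_net, and enhance_net are hypothetical placeholders for the proposal's optical flow estimation, fusion, and detail enhancement (PCD + TSA) models, and a single flow is split symmetrically for warping, which is a simplification rather than the actual method.

```python
import torch
import torch.nn.functional as F


def backward_warp(img, flow):
    """Backward-warp `img` (N, C, H, W) with a dense flow field (N, 2, H, W)."""
    _, _, h, w = img.shape
    # Base sampling grid of pixel coordinates.
    grid_y, grid_x = torch.meshgrid(
        torch.arange(h, device=img.device, dtype=img.dtype),
        torch.arange(w, device=img.device, dtype=img.dtype),
        indexing="ij",
    )
    grid = torch.stack((grid_x, grid_y), dim=0).unsqueeze(0)   # (1, 2, H, W)
    coords = grid + flow                                       # displaced coordinates
    # Normalize to [-1, 1] as expected by grid_sample.
    coords_x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    norm_grid = torch.stack((coords_x, coords_y), dim=3)       # (N, H, W, 2)
    return F.grid_sample(img, norm_grid, align_corners=True)


def generate_virtual_frame(ref0, ref1, flow_net, fusion_net, enhance_net):
    """Sketch of the interpolation pipeline; all three nets are placeholders."""
    # Single optical flow between the two nearest-POC reference frames.
    flow = flow_net(ref0, ref1)                 # (N, 2, H, W), assumed ref0 -> ref1
    # Under a linear-motion assumption, the mid-point frame sees roughly
    # -flow/2 towards ref0 and +flow/2 towards ref1.
    warped0 = backward_warp(ref0, -0.5 * flow)
    warped1 = backward_warp(ref1, 0.5 * flow)
    # Fuse warped and original references into an intermediate frame.
    inter = fusion_net(torch.cat((warped0, warped1, ref0, ref1), dim=1))
    # Detail enhancement (PCD alignment + TSA attention in the proposal).
    return enhance_net(inter)
```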

Experimental results

Interested readers can follow the WeChat official account "Video Coding" for more details.


Copyright notice
This article was written by [Dillon2015]. Please include a link to the original when reposting.
https://yzsam.com/2022/187/202207061245460662.html