DAP: a video super-resolution model that outputs 720p video online at 26 FPS
2022-07-02 20:53:00 【I love computer vision】

Affiliations: ETH Zurich, KU Leuven, University of Würzburg
Paper: https://arxiv.org/pdf/2202.01731v1.pdf
Editor's note: Unlike the two currently popular VSR research directions (real-world/blind VSR, and VSR with transmission), this paper targets real-time online super-resolution, a starting point similar to that of IPRRN. DAP's results are comparable to EDVR, but it runs about three times faster, and it can process 180p video online at 26 FPS!
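To put these numbers in perspective, the short sketch below works out the per-frame time budget at 26 FPS and the output size for a 180p input. The 4x scale factor is inferred from the 180p input and 720p output mentioned in this article, and the 320-pixel width assumes a 16:9 frame; neither value is stated explicitly in the source.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def output_size(width: int, height: int, scale: int) -> tuple:
    """Output resolution after super-resolving by an integer scale factor."""
    return width * scale, height * scale


# Assumption: 16:9 input at 180p (320x180) and a 4x upscaling factor.
budget = frame_budget_ms(26)          # roughly 38.5 ms per frame
hr_size = output_size(320, 180, 4)    # (1280, 720), i.e. 720p
```

In other words, an online model running at 26 FPS has to finish alignment, fusion, and reconstruction for each frame in under about 38.5 ms.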
01
Highlights
VSR applications are subject to strict constraints such as causality and real-time operation. This raises two challenges: information from future frames is unavailable, and an efficient yet effective frame alignment and fusion module must be designed. This paper proposes a recurrent VSR architecture based on a deformable attention pyramid (DAP).
DAP aligns and fuses information from the recurrent state into the current frame prediction. To avoid the computational cost of traditional attention-based methods, DAP attends only to a limited number of spatial locations, which it predicts dynamically. It outperforms the EDVR-M method on two benchmarks while running more than 3 times faster.

02
Method
Overview
According to the Nyquist-Shannon sampling theorem, the frequency band of a discrete signal is limited; the task of a VSR algorithm is to recover high-frequency content above that limit from low-resolution video. The recurrent algorithm in this paper focuses on fast runtime, combined with updating and extracting information in the hidden state to handle the alignment between frames.
First, the encoder network encodes the input frame into multi-level feature maps from fine to coarse. The deformable attention module then iteratively refines the computed offsets from coarse to fine, and the fusion module aggregates hidden-state features according to the final offsets. Finally, the main processing unit, composed of multiple residual information distillation blocks, estimates the high-resolution frame and the next hidden state. The framework is shown in the following figure.
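The recurrent, causal structure described above can be sketched as a simple loop. This is only an illustration of the data flow, not the authors' network: the `step` function is a hypothetical stand-in in which a 50/50 blend replaces the learned alignment and fusion, and nearest-neighbor upscaling replaces the learned reconstruction.

```python
from typing import List, Optional, Tuple

Frame = List[List[float]]  # one grayscale frame as a 2D list [y][x]


def upscale_nearest(frame: Frame, scale: int) -> Frame:
    """Stand-in for the learned reconstruction: nearest-neighbor upscaling."""
    return [[v for v in row for _ in range(scale)]
            for row in frame for _ in range(scale)]


def step(frame: Frame, hidden: Optional[Frame], scale: int) -> Tuple[Frame, Frame]:
    """Hypothetical stand-in for encode -> align -> fuse -> reconstruct.

    A real model would align the hidden state to the current frame first;
    here the two are simply averaged to keep the sketch short.
    """
    if hidden is None:          # first frame: no history available yet
        hidden = frame
    fused = [[0.5 * a + 0.5 * b for a, b in zip(row_f, row_h)]
             for row_f, row_h in zip(frame, hidden)]
    return upscale_nearest(fused, scale), fused


def run_online(frames: List[Frame], scale: int = 4) -> List[Frame]:
    """Process frames strictly in order; no future frame is ever accessed."""
    hidden: Optional[Frame] = None
    outputs = []
    for frame in frames:
        high_res, hidden = step(frame, hidden, scale)
        outputs.append(high_res)
    return outputs
```

The point of the sketch is the loop in `run_online`: each output depends only on the current frame and the hidden state carried over from earlier frames, which is what makes the method usable online.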

DAP
First, a U-Net-style encoder computes multi-level features from the inputs at times t-1 and t. At the second level of the pyramid, k sampling positions are computed to serve as the key positions for the deformable attention module of the level above. A convolution block computes the residual offsets from features obtained by cross-attention fusion between t-1 and t. The offsets are refined repeatedly until the pyramid level reaches 0, as shown in the figure below, where ⊗ denotes channel-wise concatenation and ⊕ denotes pixel-wise addition.

Multistage encoder
Videos contain fast motion, so the paper designs a multi-level encoder to obtain multi-resolution features. Because frames at different resolutions cover different spatial fields of view, this captures different ranges of motion. The hierarchy has L levels; in this study L = 3. Separate processing chains are used for the inputs at different time steps, and the features are calculated as follows:
[Equation missing in the source]
Here, the convolution block consists of 4 convolutions, and downsampling is bilinear.
Deformable attention
To reduce the complexity of the attention module, the search for salient features is restricted to dynamically selected positions in the feature map, instead of exhaustively computing correlations over a large neighborhood or even the whole frame. Because the per-pixel correlations are computed only at these positions, the computational workload is greatly reduced. One input to the attention is the feature representation of the current frame; the other is computed at dynamically predicted spatial positions. The calculation is as follows:
[Equation missing in the source]
Here, upsampling is bilinear.
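A minimal sketch of the idea follows, for a single pixel: features are sampled from the hidden state at k predicted positions, compared with the current-frame feature, and combined with softmax weights. Several details are assumptions made for illustration and are not taken from the paper: the sampled features serve as both keys and values, the scores are scaled dot products, out-of-range positions are clamped to the border, and the offsets are given as input rather than predicted by a network.

```python
import math
from typing import List, Sequence, Tuple

FeatureMap = List[List[List[float]]]  # indexed as [y][x][channel]


def bilinear_sample(feat: FeatureMap, y: float, x: float) -> List[float]:
    """Bilinearly interpolate a feature vector at a fractional position.

    Positions outside the map are clamped to the border (an assumption).
    """
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    return [
        (1 - wy) * ((1 - wx) * feat[y0][x0][c] + wx * feat[y0][x1][c])
        + wy * ((1 - wx) * feat[y1][x0][c] + wx * feat[y1][x1][c])
        for c in range(len(feat[0][0]))
    ]


def deformable_attention_at(
    query: Sequence[float],
    hidden: FeatureMap,
    y: int,
    x: int,
    offsets: Sequence[Tuple[float, float]],
) -> List[float]:
    """Attend from one query pixel to k sampled hidden-state locations."""
    # Sample k key/value vectors at (y + dy, x + dx) instead of a full window.
    keys = [bilinear_sample(hidden, y + dy, x + dx) for dy, dx in offsets]
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    # Numerically stable softmax over the k scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the sampled vectors (keys double as values here).
    return [sum(wt * key[c] for wt, key in zip(weights, keys))
            for c in range(len(query))]
```

With k = 4 offsets, each pixel needs only 4 correlations, whereas an exhaustive search over a 9x9 neighborhood would need 81. That difference is the source of the savings described above.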
Iterative refinement
At each pyramid level, the dense offsets are iteratively refined: a convolution block predicts residual offsets, which are added to the offsets of the previous level. The offset prediction network uses 7×7 kernels to ensure dense computation with a large receptive field. The calculation is as follows:
[Equation missing in the source]
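The coarse-to-fine refinement can be sketched as follows. This is an illustration under stated assumptions rather than the paper's exact formulation: `predict_residual` is a hypothetical callable standing in for the convolutional offset predictor, nearest-neighbor upsampling replaces the bilinear upsampling mentioned in the text, and doubling the offset values when moving to a level with twice the resolution is an assumption.

```python
from typing import Callable, List, Tuple

Offset = Tuple[float, float]           # (dy, dx) in pixels of the current level
OffsetField = List[List[Offset]]       # indexed as [y][x]


def upsample_offsets(field: OffsetField) -> OffsetField:
    """Double the resolution of an offset field (nearest-neighbor stand-in).

    Offsets are multiplied by 2 because one coarse pixel spans two fine pixels.
    """
    out = []
    for row in field:
        new_row = [(2.0 * dy, 2.0 * dx) for dy, dx in row for _ in range(2)]
        out.append(new_row)
        out.append(list(new_row))
    return out


def refine_coarse_to_fine(
    coarse_shape: Tuple[int, int],
    num_levels: int,
    predict_residual: Callable[[int, OffsetField], OffsetField],
) -> OffsetField:
    """Start with zero offsets at the coarsest level and refine down to level 0."""
    h, w = coarse_shape
    field = [[(0.0, 0.0)] * w for _ in range(h)]
    for level in range(num_levels - 1, -1, -1):
        if level < num_levels - 1:
            field = upsample_offsets(field)       # carry offsets to the finer level
        residual = predict_residual(level, field)
        # New offset = upsampled offset from the coarser level + predicted residual.
        field = [[(dy + ry, dx + rx) for (dy, dx), (ry, rx) in zip(f_row, r_row)]
                 for f_row, r_row in zip(field, residual)]
    return field
```

Because coarse levels handle large displacements, each residual only has to correct a small remaining error at its own resolution.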
Hidden state fusion
Finally, the top-level offsets are used to fuse the salient hidden-state features at time t, which is computed by another deformable attention block, as shown below:
In addition, the internal tensors are split into groups and sampled per group at all stages; the number of groups is chosen according to the number of sampled keys/values, k = 4.
03
Experiments
Ablation experiments
Ablation experiments with different components and channel counts:

One of the core features of state-of-the-art bidirectional methods is the ability to fuse information offline across the whole video. This naturally includes aggregation in reverse chronological order. The paper therefore studies the difference between forward and backward evaluation. Surprisingly, aggregation in reverse chronological order significantly improves performance.
The authors attribute this gain to the fact that forward camera motion is more common in video. If an object moves toward the camera, or vice versa, then in reverse order it first appears at high resolution, which simplifies super-resolving it. Having the opportunity to process video in reverse may therefore improve VSR performance, which gives non-causal methods an advantage over online algorithms.

Quantitative assessment
Quantitative evaluation on REDS4, UDM10, and Vimeo-90K:

Qualitative assessment
Qualitative evaluation on REDS:

