26 FPS video super-resolution model DAP! Output 720p Video Online
2022-07-02 20:53:00 【I love computer vision】

Affiliations: ETH Zurich, KU Leuven, University of Würzburg
Paper: https://arxiv.org/pdf/2202.01731v1.pdf
Editor's note: The two currently popular VSR research directions are real-world/blind VSR and VSR combined with transmission. The authors of this paper instead make progress on real-time online super-resolution, a starting point similar to that of IPRRN. DAP achieves results comparable to EDVR while running about three times faster, and can super-resolve 180p video online at 26 FPS.
01
Highlights
Applications of VSR are subject to strict causality, real-time, and related constraints. This poses two challenges: information from future frames is not available, and the frame alignment and fusion module must be both efficient and effective. This paper proposes a recurrent VSR architecture based on a deformable attention pyramid (DAP).
DAP aligns and fuses information from the recurrent state into the current frame prediction. To avoid the computational cost of traditional attention-based methods, DAP attends only to a limited number of spatial locations, which it predicts dynamically. It outperforms the EDVR-M method on two benchmarks while running more than 3 times faster.

02
Method
Overview
According to the Nyquist-Shannon sampling theorem, the frequency band of a discretely sampled signal is limited. The task of a VSR algorithm is to recover, from the low-resolution video, high-frequency content above that limit. The recurrent algorithm in this paper focuses on fast runtime, and handles alignment between frames by updating and extracting information in the hidden state.
First, the encoder network encodes the input frame into multi-level feature maps from fine to coarse. The deformable attention module then iteratively refines the computed offsets from coarse to fine, and the fusion module aggregates the hidden-state features according to the final offsets. Finally, the main processing unit, composed of multiple residual information distillation blocks, estimates the high-resolution frame and the next hidden state. The framework is shown in the following figure.
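The online, causal data flow described above can be sketched as a simple recurrent loop. This is only an illustration under stated assumptions: `step`, `run_online`, and `upsample_nearest` are hypothetical names, the 50/50 blend stands in for DAP's learned alignment and fusion, and nearest-neighbor upsampling stands in for the learned reconstruction network.

```python
from typing import List, Tuple

Frame = List[List[float]]  # a tiny grayscale image as rows of floats


def upsample_nearest(frame: Frame, scale: int) -> Frame:
    """Stand-in for the learned reconstruction: repeat each pixel `scale` times."""
    out: Frame = []
    for row in frame:
        wide = [v for v in row for _ in range(scale)]
        out.extend([list(wide) for _ in range(scale)])
    return out


def step(frame: Frame, hidden: Frame, scale: int = 4) -> Tuple[Frame, Frame]:
    """One recurrent step: fuse the hidden state into the current frame, then upscale.

    The 50/50 blend is a placeholder (assumption) for alignment and fusion;
    it is NOT the method of the paper.
    """
    fused = [[0.5 * f + 0.5 * h for f, h in zip(frow, hrow)]
             for frow, hrow in zip(frame, hidden)]
    return upsample_nearest(fused, scale), fused  # (HR frame, next hidden state)


def run_online(frames: List[Frame], scale: int = 4) -> List[Frame]:
    """Process frames strictly in arrival order; frame t never sees frame t+1."""
    hidden = [[0.0] * len(frames[0][0]) for _ in frames[0]]  # empty initial state
    outputs = []
    for frame in frames:
        hr, hidden = step(frame, hidden, scale)
        outputs.append(hr)
    return outputs
```

Because each output depends only on the current frame and the hidden state, changing a later frame cannot change an earlier output, which is the causality constraint an online method has to respect.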

DAP
First, a U-Net-style encoder computes multi-level features from the current frame and the previous hidden state. On the second level of the pyramid, k sampling positions are computed to serve as the key positions for the deformable attention module of the level above. The features that the convolution block uses to compute the residual offsets come from the cross-attention fusion from t-1 to t. The offsets are refined repeatedly until the finest level (level 0) is reached, as shown in the figure below, where ⊗ denotes channel concatenation and ⊕ denotes pixel-wise addition.

Multi-level encoder
Videos contain fast motion, so the paper designs a multi-level encoder to obtain multi-resolution features. Because feature maps at different resolutions cover different spatial fields of view, they can capture different ranges of motion. The pyramid has L levels, with L=3 in this study. Separate processing chains are used for the inputs at different time steps, and the features are computed as follows:
[Equation missing in source]
Here the convolution block consists of 4 convolutions, and downsampling is bilinear.
Deformable attention
To reduce the complexity of the attention module, the paper restricts the search for salient features to dynamically selected positions in the feature map, instead of computing correlations exhaustively over a large neighborhood or even the whole frame. Because correlations are computed only at these few positions rather than densely over all pixels, the computational workload is greatly reduced. The attention operates on the feature representation of the current frame and on features sampled at the dynamically predicted spatial positions. The calculation is as follows: [equation missing in source], where upsampling is bilinear.
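The idea of attending only to a few dynamically chosen positions can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names `bilinear_sample` and `sparse_attention` are hypothetical, the sampled features serve as both keys and values without learned projections, and the scaled dot-product softmax is an assumed scoring rule.

```python
import math
from typing import List, Sequence, Tuple

FeatureMap = List[List[List[float]]]  # indexed as [y][x][channel]


def bilinear_sample(fmap: FeatureMap, y: float, x: float) -> List[float]:
    """Sample a feature vector at a fractional position, clamping to the border."""
    h, w = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    return [
        (1 - wy) * (1 - wx) * a + (1 - wy) * wx * b
        + wy * (1 - wx) * c + wy * wx * d
        for a, b, c, d in zip(fmap[y0][x0], fmap[y0][x1],
                              fmap[y1][x0], fmap[y1][x1])
    ]


def sparse_attention(query: Sequence[float], hidden: FeatureMap,
                     y: int, x: int,
                     offsets: Sequence[Tuple[float, float]]) -> List[float]:
    """Attend from one query pixel to k sampled positions in the hidden features."""
    # Keys and values are the same sampled vectors here (simplifying assumption).
    samples = [bilinear_sample(hidden, y + dy, x + dx) for dy, dx in offsets]
    scale = math.sqrt(len(query))
    scores = [sum(q * s for q, s in zip(query, smp)) / scale for smp in samples]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(wt * smp[c] for wt, smp in zip(weights, samples))
            for c in range(len(query))]
```

With k offsets per pixel, each query needs k dot products instead of one per pixel of the frame (or of a large neighborhood), which is where the savings come from.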
Iterative refinement
At each pyramid level, the dense offsets are refined iteratively: a convolution block predicts a residual offset, which is added to the offset from the previous level. The offset prediction network uses 7×7 kernels to ensure dense computation over a large receptive field. The calculation is as follows:
[Equation missing in source]
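A coarse-to-fine refinement loop of this kind can be sketched as below. It rests on several assumptions that the source does not confirm: `refine` and `upsample_offsets` are hypothetical names, `predict_residual` is a caller-supplied stand-in for the learned convolution block, upsampling is nearest-neighbor rather than bilinear, and displacements are doubled when the resolution doubles (a common convention in coarse-to-fine motion estimation).

```python
from typing import Callable, List, Tuple

OffsetField = List[List[List[float]]]  # [y][x] -> [dy, dx]


def upsample_offsets(offsets: OffsetField) -> OffsetField:
    """Double the resolution (nearest neighbor) and double the displacements."""
    out: OffsetField = []
    for row in offsets:
        wide = [[2.0 * dy, 2.0 * dx] for dy, dx in row for _ in range(2)]
        out.append([list(v) for v in wide])
        out.append([list(v) for v in wide])
    return out


def refine(coarse_shape: Tuple[int, int], levels: int,
           predict_residual: Callable[[int, OffsetField], OffsetField]
           ) -> OffsetField:
    """Refine offsets from the coarsest level (levels - 1) down to level 0."""
    h, w = coarse_shape
    offsets = [[[0.0, 0.0] for _ in range(w)] for _ in range(h)]
    for level in range(levels - 1, -1, -1):
        # The predictor sees the current estimate and returns a correction.
        residual = predict_residual(level, offsets)
        offsets = [[[o[0] + r[0], o[1] + r[1]] for o, r in zip(orow, rrow)]
                   for orow, rrow in zip(offsets, residual)]
        if level > 0:
            offsets = upsample_offsets(offsets)  # move to the next finer level
    return offsets
```

Because the coarse levels absorb most of a large displacement, each level only has to predict a small correction, which is one reason a pyramid helps with fast motion.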
Hidden state fusion
Finally, the top-level offsets are used to fuse the salient hidden-state features at time t, computed by another deformable attention block, as shown below:
In addition, to reduce runtime, the internal tensors are split into groups that are sampled separately at all stages. The number of groups is chosen according to the number of sampled keys/values, k=4.
03
Experiments
Ablation experiments
Ablation experiments with different components and channel counts:

One of the core features of state-of-the-art bidirectional methods is the ability to fuse information offline across the whole video, which naturally includes aggregation in reverse temporal order. The paper therefore studies the difference between forward and backward evaluation. Surprisingly, aggregation in reverse temporal order significantly improves performance.
The authors attribute this gain to the fact that forward camera motion is more common in video. If objects move toward the camera, or the camera toward them, then in reverse processing they first appear in high resolution, which simplifies the super-resolution of these objects. Having the opportunity to process the video in reverse may therefore improve VSR performance, which gives non-causal methods an advantage over online algorithms.

Quantitative evaluation
Quantitative evaluation on REDS4, UDM10, and Vimeo-90K:

Qualitative evaluation
Qualitative evaluation on REDS:

