DAP, a 26 FPS video super-resolution model! Outputs 720p video online
2022-07-02 20:53:00 【I love computer vision】
Affiliations: ETH Zurich, KU Leuven, University of Würzburg
Paper: https://arxiv.org/pdf/2202.01731v1.pdf
Editor's note: Unlike the two research directions that are currently popular in VSR (real-world/blind VSR, and VSR combined with transmission), the authors of this paper make progress on real-time online super-resolution, a starting point similar to that of the IPRRN article. DAP performs comparably to EDVR but is three times faster, and it can super-resolve 180p video online at 26 FPS!
01
Highlights
VSR applications are subject to strict causality and real-time constraints. This poses two challenges: information from future frames is unavailable, and the frame alignment and fusion module must be both efficient and effective. This paper proposes a recurrent VSR architecture built on a deformable attention pyramid (DAP).
DAP aligns and fuses information from the recurrent state into the current frame prediction. To avoid the computational cost of traditional attention-based methods, DAP attends only to a limited number of spatial locations, which it predicts dynamically. It outperforms the EDVR-M method on two benchmarks while running more than 3 times faster.
02
Method
Overview
According to the Nyquist-Shannon sampling theorem, a discrete signal is band-limited; the task of a VSR algorithm is to recover high-frequency content above that limit from low-resolution video. The recurrent algorithm in this paper targets fast runtime, and handles the alignment between frames by updating and extracting information in the hidden state.
First, the encoder network encodes the input frame into multi-level feature maps from fine to coarse. Next, the deformable attention module iteratively refines the computed offsets from coarse to fine. The fusion module then aggregates the hidden-state features according to the final offsets. Finally, the main processing unit, composed of multiple residual information distillation blocks, estimates the high-resolution frame and the next hidden state. The framework is shown in the figure below.
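The causal, recurrent structure described above can be sketched roughly as follows. This is only an illustration: `vsr_step`, `run_online`, and `upsample_nearest` are hypothetical names, and the averaging and nearest-neighbour upsampling are placeholders standing in for the learned encoder, deformable attention pyramid, fusion module, and main processing unit.

```python
import numpy as np

def upsample_nearest(frame, scale):
    """Stand-in for the learned network: repeat each pixel scale x scale times."""
    return np.repeat(np.repeat(frame, scale, axis=0), scale, axis=1)

def vsr_step(frame, hidden, scale=4):
    """One causal step that sees only the current LR frame and the hidden state.

    Hypothetical placeholder: the real model would run the encoder, the
    deformable attention pyramid, the fusion module and the main processing
    unit here.
    """
    fused = 0.5 * frame + 0.5 * hidden      # placeholder for align + fuse
    hr_frame = upsample_nearest(fused, scale)
    next_hidden = fused                     # placeholder hidden-state update
    return hr_frame, next_hidden

def run_online(frames, scale=4):
    """Process frames strictly in arrival order; no future frame is accessed."""
    hidden = np.zeros_like(frames[0])
    outputs = []
    for frame in frames:
        hr_frame, hidden = vsr_step(frame, hidden, scale)
        outputs.append(hr_frame)
    return outputs

frames = [np.random.rand(180, 320) for _ in range(3)]
outputs = run_online(frames)
print(outputs[0].shape)  # (720, 1280)
```

The point of the sketch is the data flow: each output frame depends only on the current input and a hidden state carried over from earlier frames, which is what makes the method usable online.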

DAP
First, a U-Net-style encoder computes multi-level features from the current frame and the previous hidden state. At the second level of the pyramid, k sampling positions are computed to serve as key positions for the deformable attention module of the level above. A convolution block computes residual offsets from the fused features produced by the cross-attention from t-1 to t. The offsets are refined repeatedly until the level index reaches 0, as shown in the figure below, where ⊗ denotes channel concatenation and ⊕ denotes pixel-wise addition.
Multi-level encoder
Videos contain fast motion, so the paper designs a multi-level encoder to obtain multi-resolution features. Because frames at different resolutions offer different spatial fields of view, this captures different ranges of motion. The pyramid has L levels, with L=3 in this study, and separate processing chains are used for the inputs at different time steps. The features at each level are computed by a convolution block composed of 4 convolutions, with bilinear downsampling between levels.
Deformable attention
To reduce the complexity of the attention module, the paper restricts the search for salient features to dynamically selected positions in the feature map, instead of exhaustively computing correlations over a large neighborhood or even the whole frame. Computing correlations only at these selected positions greatly reduces the computational workload. The queries are the feature representation of the current frame, while the keys and values are obtained by sampling at the dynamically predicted spatial positions. Bilinear upsampling is used in this computation.
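A minimal sketch of attention restricted to k sampled positions is given below. It rests on several assumptions that are not taken from the paper: keys and values are the same bilinearly sampled hidden-state features (no learned projections), the scores are scaled dot products, and the function names `bilinear_sample` and `deformable_attention` are made up for illustration.

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Sample feat (H, W, C) at a fractional position (y, x), clamped to the border."""
    h, w, _ = feat.shape
    y = float(np.clip(y, 0, h - 1))
    x = float(np.clip(x, 0, w - 1))
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = (1 - wx) * feat[y0, x0] + wx * feat[y0, x1]
    bottom = (1 - wx) * feat[y1, x0] + wx * feat[y1, x1]
    return (1 - wy) * top + wy * bottom

def deformable_attention(query_feat, hidden_feat, offsets):
    """Attend from each query pixel to k sampled positions in hidden_feat.

    query_feat:  (H, W, C) features of the current frame (queries)
    hidden_feat: (H, W, C) features of the hidden state (keys and values)
    offsets:     (H, W, k, 2) predicted (dy, dx) offsets per pixel
    """
    h, w, c = query_feat.shape
    k = offsets.shape[2]
    out = np.zeros_like(query_feat)
    for i in range(h):
        for j in range(w):
            samples = np.stack([
                bilinear_sample(hidden_feat,
                                i + offsets[i, j, n, 0],
                                j + offsets[i, j, n, 1])
                for n in range(k)
            ])                                                # (k, C)
            scores = samples @ query_feat[i, j] / np.sqrt(c)  # (k,)
            weights = np.exp(scores - scores.max())
            weights /= weights.sum()                          # softmax over k positions
            out[i, j] = weights @ samples                     # weighted sum of values
    return out
```

Each pixel computes only k correlations instead of one per position in a neighborhood or in the whole frame, which is where the saving comes from. The explicit Python loops are for readability; a practical implementation would vectorize the sampling.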
Iterative refinement
At each pyramid level, the dense offsets are iteratively refined: a convolution block predicts a residual offset, which is added to the offsets from the previous level. The offset prediction network uses 7×7 kernels so that the dense prediction has a large receptive field.
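The coarse-to-fine refinement can be illustrated with the sketch below. The names `upsample2x`, `refine_offsets`, and `residual_fn` are hypothetical, nearest-neighbour upsampling replaces the bilinear one for brevity, and doubling the offsets when moving to a finer level is an assumed convention (offsets measured in pixels of the current level), not something the article states.

```python
import numpy as np

def upsample2x(offsets):
    """Upsample an (H, W, 2) offset field to (2H, 2W, 2).

    Nearest-neighbour is used for brevity; offsets are doubled because they
    are assumed to be measured in pixels of the finer level.
    """
    up = np.repeat(np.repeat(offsets, 2, axis=0), 2, axis=1)
    return 2.0 * up

def refine_offsets(residual_fn, base_shape, levels=3):
    """Refine offsets from the coarsest level (levels - 1) down to level 0.

    residual_fn(level, offsets) is a hypothetical stand-in for the
    convolution block that predicts the residual offset at each level.
    """
    h, w = base_shape
    coarse = (h // 2 ** (levels - 1), w // 2 ** (levels - 1))
    offsets = np.zeros(coarse + (2,))
    for level in range(levels - 1, -1, -1):
        if level < levels - 1:
            offsets = upsample2x(offsets)
        offsets = offsets + residual_fn(level, offsets)
    return offsets

def toy_residual(level, offsets):
    """Toy predictor: every level adds one pixel of motion in x."""
    return np.broadcast_to([0.0, 1.0], offsets.shape)

final = refine_offsets(toy_residual, base_shape=(8, 8))
print(final.shape, final[0, 0])
```

In the toy run, the one-pixel residual predicted at the coarsest level is worth four pixels at level 0, so the final x-offset is 4 + 2 + 1 = 7 pixels. This is how a pyramid covers large motion with small per-level corrections.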
Hidden state fusion
Finally, the final offsets are used to fuse the salient hidden-state features at time t; this is computed by another deformable attention block, as shown below:
In addition, to reduce runtime, the internal tensors are split into groups and sampled per group at all stages. The number of groups is chosen according to the number of sampled keys/values, k=4.
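One possible reading of the grouped sampling is sketched below: the channels are split into groups and each group is gathered at its own offset. This interpretation, the function name `grouped_sample`, and the use of integer offsets (instead of fractional offsets with bilinear interpolation) are all assumptions made to keep the example short.

```python
import numpy as np

def grouped_sample(hidden, offsets):
    """Sample each channel group of hidden at its own offset position.

    hidden:  (H, W, C) hidden-state features
    offsets: (H, W, G, 2) integer (dy, dx) offsets, one per channel group
             (assumption: integer offsets; a model would use fractional
             offsets with bilinear interpolation)
    """
    h, w, c = hidden.shape
    groups = offsets.shape[2]
    assert c % groups == 0, "channels must divide evenly into groups"
    size = c // groups
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    out = np.empty_like(hidden)
    for g in range(groups):
        # Clamp sampling positions to the feature-map border.
        sy = np.clip(ys + offsets[:, :, g, 0], 0, h - 1)
        sx = np.clip(xs + offsets[:, :, g, 1], 0, w - 1)
        chans = slice(g * size, (g + 1) * size)
        out[:, :, chans] = hidden[sy, sx, chans]
    return out
```

With G groups, each gather touches only C/G channels, so sampling k=4 positions with 4 groups costs about as much as sampling the full tensor once.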
03
Experiments
Ablation experiments
Ablation over different components and channel counts:
A core feature of state-of-the-art bidirectional methods is the ability to fuse information across the whole video offline, which naturally includes aggregation in reverse chronological order. The paper therefore studies the difference between forward and backward evaluation. Surprisingly, aggregating in reverse chronological order significantly improves performance.
The authors attribute this gain to forward camera motion being more common in video. If objects move toward the camera, or vice versa, then in reversed processing they first appear at high resolution, which simplifies their super-resolution. Having the opportunity to process video in reverse can therefore improve VSR performance, which gives non-causal methods an advantage over online algorithms.
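The forward/backward comparison amounts to running the same recurrent model over the frames in two different orders. A minimal sketch, with the hypothetical helper `evaluate_order` and a toy per-frame model `running_max_step` that are not from the paper:

```python
def evaluate_order(step_fn, frames, init_state, reverse=False):
    """Run a recurrent model over frames in forward or reverse chronological order.

    step_fn(frame, state) -> (output, next_state) is a hypothetical per-frame
    model. Outputs are returned in the original frame order either way.
    """
    order = list(reversed(frames)) if reverse else list(frames)
    state, outputs = init_state, []
    for frame in order:
        out, state = step_fn(frame, state)
        outputs.append(out)
    return outputs[::-1] if reverse else outputs

def running_max_step(frame, state):
    """Toy model whose output depends on everything seen so far."""
    best = max(frame, state)
    return best, best

frames = [1, 3, 2]
forward = evaluate_order(running_max_step, frames, init_state=0)
backward = evaluate_order(running_max_step, frames, init_state=0, reverse=True)
print(forward, backward)  # [1, 3, 3] [3, 3, 2]
```

Reverse evaluation needs the whole clip before the first output is produced, so it is only available offline; the toy example merely shows that the two orders expose different context to each frame.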
Quantitative assessment
Quantitative evaluation on REDS4, UDM10, and Vimeo-90K:
Qualitative assessment
Qualitative evaluation on REDS:
END