AnimeSR: learnable degradation operators and a new real-world animation VSR dataset
2022-07-01 13:41:00 【I love computer vision】
Another strong paper from Xintao's team. "AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos" proposes a new dataset for real-world animation video super-resolution (VSR), extends the basic operators of real-world degradation models into learnable operators, and reaches state-of-the-art (SOTA) results on evaluation metrics such as NIQE.

Affiliation: Tencent PCG ARC Lab
Paper link: https://arxiv.org/pdf/2206.07038
01
Highlights
The paper presents three key improvements for animation VSR:
Recent real-world VSR methods mostly build their degradation from basic operators with no learning ability, such as blur, noise and compression. The paper proposes learning such basic operators from real LQ animation and adding the learned operators to the degradation process. These neural-network-based basic operators help capture the distribution of real degradation better.
A large-scale HQ animation dataset, AVC, is built to support training and evaluation of animation VSR.
An efficient multi-scale network structure, AnimeSR, is studied. It combines the efficiency of a unidirectional recurrent network with the effectiveness of the sliding-window approach and performs better than previous state-of-the-art methods.

02
Method
AVC dataset
The training set AVC-Train contains 553 high-quality clips, 55,300 frames in total. The test set AVC-Test contains 30 clips, 3,000 frames in total. To evaluate methods in real scenarios, the paper also builds a real-world test set, AVC-RealLQ, consisting of 44 low-quality clips. The figure below shows some examples from the dataset.

Learnable basic operators in degradation synthesis
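Because paired LR-HR training data is lacking, recent work designs degradation models that are as close to the real world as possible and then uses them to synthesize LR from HR. This degradation can be written as n steps:
$x = \mathcal{D}^n(y) = (\mathcal{D}_n \circ \cdots \circ \mathcal{D}_2 \circ \mathcal{D}_1)(y)$
where y is the HR input, x is the synthesized LR output, and each $\mathcal{D}_i$ is a basic operator.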
The basic operators in the classical degradation model include blur, noise, resizing and JPEG/FFMPEG compression. These operators have no learning ability, which fundamentally limits how well they can synthesize real-world degradation, as shown in panel (a) of the figure below. Another line of work uses large neural networks and adversarial learning to synthesize LR samples.
However, learning the whole degradation process and its distribution with one large neural network is challenging. These methods are effective only for a limited range of images and often produce unpleasant artifacts, as shown in panel (b) of the figure below.
The paper proposes learning the basic operators used in degradation synthesis instead. Rather than one large network, it trains tiny neural networks with two or three convolution layers to capture the main characteristics of real degradation, and these networks are then incorporated into the degradation synthesis process. The neural operators are learnable and can synthesize real degradations that the classical operators cannot simulate. The learnable basic operators greatly expand the degradation space so that it covers more real degradations.
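The n-step degradation described above is a composition of basic operators. The sketch below illustrates that composition only: a "frame" is a flat list of pixel values, and the `blur`, `add_noise` and `downscale` functions are hypothetical toy stand-ins, not the operators used in the paper. A learned operator would be slotted into the same list as one more callable.

```python
import random
from functools import reduce
from typing import Callable, List, Sequence

Frame = List[float]  # toy stand-in for an image: pixel values in [0, 1]
Operator = Callable[[Frame], Frame]


def blur(frame: Frame) -> Frame:
    """Toy 3-tap box blur with edge replication (hypothetical stand-in)."""
    padded = [frame[0]] + frame + [frame[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3 for i in range(len(frame))]


def add_noise(frame: Frame, sigma: float = 0.02, seed: int = 0) -> Frame:
    """Add Gaussian noise and clamp back to [0, 1] (hypothetical stand-in)."""
    rng = random.Random(seed)
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in frame]


def downscale(frame: Frame, factor: int = 2) -> Frame:
    """Keep every `factor`-th sample (hypothetical stand-in for resizing)."""
    return frame[::factor]


def compose(operators: Sequence[Operator]) -> Operator:
    """Build D_n ∘ ... ∘ D_1; operators[0] is applied first."""
    return lambda frame: reduce(lambda x, op: op(x), operators, frame)


hr = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0]
degrade = compose([blur, add_noise, downscale])
lr = degrade(hr)
print(len(hr), len(lr))  # 8 4
```

Keeping the pipeline as a plain list of callables means a classical operator and a small neural operator are interchangeable from the pipeline's point of view.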

Input rescaling strategy
The learnable basic operators are trained in a supervised way on LR-HR pairs. However, obtaining LR-HR pairs for real-world LQ videos is challenging. The paper applies a VSR model trained with the classical basic-operator degradation model to real LQ animation and obtains preliminary results, shown in the figure below. As expected, the output is unsatisfactory. The input is then resized with different scaling factors (×1 to ×0.3).
It can be observed that artifacts gradually decrease as the input resolution decreases, but too small a scaling factor causes loss of detail and information. For these video samples, rescaling the input by ×0.5 gives a good balance between artifact removal and detail loss. A satisfactory output can therefore be selected manually as the pseudo-HR; this is called the "input rescaling strategy".
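A minimal sketch of the candidate-generation step, under stated assumptions: the 0.1 step between factors and the 1920×1080 frame size are hypothetical, chosen only for illustration (the text gives only the ×1 to ×0.3 range). The final choice of pseudo-HR is made by a person, so the code deliberately stops at listing the candidates.

```python
from typing import List, Tuple


def candidate_sizes(width: int, height: int,
                    factors: List[float]) -> List[Tuple[float, int, int]]:
    """Return (factor, resized_width, resized_height) for each scaling factor."""
    return [(f, max(1, round(width * f)), max(1, round(height * f))) for f in factors]


# Hypothetical grid of factors covering the x1 to x0.3 range from the text.
factors = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
candidates = candidate_sizes(1920, 1080, factors)  # frame size is an assumption

for factor, w, h in candidates:
    # Each resized input would be passed through the VSR model; a person then
    # inspects the outputs and keeps the most satisfactory one as pseudo-HR.
    print(f"x{factor}: {w}x{h}")
```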

Learnable basic operators
The paper selects several representative real-world LQ animations to train the learnable basic operators. It first screens for LQ videos on which the VSR model performs poorly at the original scale but produces better results at a suitable scaling factor, and determines the best scaling factor for each video. About 2,000 frames are taken from each LQ video and fed into the VSR network to obtain pseudo-HR samples. The LR/pseudo-HR pairs are then used to train the learnable basic operators.
Each neural operator consists of three 3 × 3 convolution layers with 64 hidden channels, with LeakyReLU activations between the convolution layers. The paper trains three learnable basic operators from different LQ videos and puts them into a pool. At each training iteration, one operator is randomly selected from the pool and incorporated into the degradation process.
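To give a sense of how small such an operator is, the sketch below counts its weights and mimics the per-iteration random pick from the pool. The channel layout 3 → 64 → 64 → 3 (RGB in and out) and the presence of bias terms are assumptions; the text states only three 3 × 3 layers with 64 hidden channels. The pool entries are hypothetical placeholder names, not real model files.

```python
import random
from typing import List, Tuple


def conv_params(in_ch: int, out_ch: int, kernel: int = 3, bias: bool = True) -> int:
    """Number of weights in one 2D convolution layer."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)


def operator_params(io_ch: int = 3, hidden: int = 64) -> int:
    """Total weights of three 3x3 conv layers; LeakyReLU adds none."""
    layers: List[Tuple[int, int]] = [(io_ch, hidden), (hidden, hidden), (hidden, io_ch)]
    return sum(conv_params(i, o) for i, o in layers)


print(operator_params())  # 40451

# Hypothetical pool of three learned operators, one per source LQ video.
pool = ["lbo_video_a", "lbo_video_b", "lbo_video_c"]
rng = random.Random(0)
picks = [rng.choice(pool) for _ in range(5)]  # one pick per training iteration
print(picks)
```

Under these assumptions each operator has roughly 40 thousand weights, which is consistent with the description of the networks as tiny.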
Network architecture
A practical animation VSR network needs a good balance between performance and efficiency. Current practical models such as Real-ESRGAN and RealBasicVSR are usually very large networks that are slow and resource-hungry. This drawback becomes more serious when the super-resolved video reaches 4K/8K resolution. Practical VSR therefore usually uses a unidirectional recurrent structure, but the lack of subsequent frames limits the use of temporal information. The paper therefore adds a sliding-window structure on top of the efficient unidirectional structure: the recurrent block receives a sequence of frames.

As shown in panel (b) of the figure above, the recurrent block uses a multi-scale design with 10 residual blocks. Three scales are used, ×1, ×0.5 and ×0.25, which are assigned 5, 3 and 2 blocks respectively. AnimeSR does not use optical flow, because the authors found empirically that optical flow brings no significant visual improvement. In addition, computing optical flow slows down both training and inference.
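A rough back-of-the-envelope estimate of why the multi-scale allocation helps efficiency, assuming (this is not stated in the source) that the cost of a residual block is proportional to the number of pixels it processes, so a block at scale s costs s² of a full-resolution block.

```python
from fractions import Fraction
from typing import Dict


def relative_cost(allocation: Dict[Fraction, int]) -> Fraction:
    """Cost in units of one full-resolution block, assuming cost scales with area."""
    return sum(n * scale * scale for scale, n in allocation.items())


# Allocation from the text: 5, 3 and 2 blocks at x1, x0.5 and x0.25.
blocks_per_scale = {Fraction(1): 5, Fraction(1, 2): 3, Fraction(1, 4): 2}

multi = relative_cost(blocks_per_scale)      # 5 + 3/4 + 2/16
single = relative_cost({Fraction(1): 10})    # all 10 blocks at full resolution
print(float(multi), float(single))  # 5.875 10.0
```

Under this assumption the 10 multi-scale blocks cost about as much as 5.9 full-resolution blocks; actual speedups depend on the implementation and on the resizing layers between scales.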
03
Experiments
Ablation study
Ablation experiments on the dataset, the degradation model, the multi-scale structure and the learnable basic operators (LBO)

Quantitative assessment
The authors argue that NR-IQA metrics are not always consistent with visual quality, especially at finer scales, and that MANIQA agrees with perceived visual quality better than NIQE.

Qualitative assessment

04
Summary
This paper comes from Xintao's team. The main contributions of AnimeSR are: learning degradation operators from real LQ animation to better capture the distribution of real degradation; building a large-scale HQ animation video dataset, AVC, for training and evaluating animation VSR; an effective "input rescaling strategy" that makes it possible to learn these neural operators; and an efficient multi-scale network structure with which AnimeSR reaches SOTA. On the paper as a whole, the author of this post has the following comments:
The input rescaling strategy is somewhat subjective; a more objective screening scheme could be studied in future work. Second, compared with the inputs of earlier unidirectional VSR methods, presenting the addition of the output to the sliding window as an innovation seems somewhat forced; and with it also used as input, is the network still strictly "unidirectional"?
Optical flow is used to filter out static scenes when selecting the training set, yet the paper says that optical flow brings little benefit in VSR, without experimental or theoretical support for that claim. Is the poor result caused by optical flow itself, in which case would other alignment methods work? Or is it caused by the limited range of motion in the dataset? A more detailed demonstration seems necessary.
The paper relies mainly on MANIQA as its quantitative metric; could more metrics such as NRQM, PI and BRISQUE be added? In addition, does synthetic data also count as real-world?
