AnimeSR: learnable degradation operators and a new real-world animation VSR dataset
2022-07-01 13:41:00 【I love computer vision】
Another strong paper from Xintao's team. 『AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos』 proposes a new animation dataset for real-world animation video super-resolution (VSR), extends the basic operators of real-world degradation models into learnable operators, and reaches state of the art (SOTA) on NIQE and other evaluation metrics.

Affiliation: Tencent PCG ARC Lab
Paper link: https://arxiv.org/pdf/2206.07038
01
Highlights
The paper presents three key improvements for real-world animation VSR:
Recent real-world VSR methods mostly build their degradation from basic operators that cannot learn, such as blur, noise, and compression. This paper proposes learning such basic operators from real LQ animation and adding the learned operators to the degradation process. These neural-network-based basic operators help capture the distribution of real degradation more faithfully.
A large-scale HQ animation dataset, AVC, is built for training and evaluating animation VSR.
An efficient multi-scale network, AnimeSR, is designed. It combines the efficiency of a unidirectional recurrent network with the effectiveness of the sliding-window approach and outperforms previous state-of-the-art methods.

02
Method
AVC dataset
The training set AVC-Train contains 553 high-quality clips, 55,300 frames in total. The test set AVC-Test contains 30 clips, 3,000 frames in total. To evaluate methods in realistic scenarios, the paper also builds a real-world test set, AVC-RealLQ, consisting of 44 low-quality clips. The figure below shows some examples from the dataset.

Learnable basic operators in degradation synthesis
Because paired LR-HR training data is lacking, recent work designs degradation models that match the real world as closely as possible and then uses them to synthesize LR from HR. This degradation can be described as an n-step process:
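$$x = \mathcal{D}^n(y) = (\mathcal{D}_n \circ \cdots \circ \mathcal{D}_2 \circ \mathcal{D}_1)(y)$$
Here $y$ is the HR input, $x$ is the synthesized LR output, and each $\mathcal{D}_i$ is one basic degradation operator.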
The basic operators in classical degradation models include blur, noise, resizing, and JPEG/FFMPEG compression. These operators have no learning ability, which fundamentally limits how well they can synthesize real-world degradation, as in panel (a) of the figure below. Another line of work uses large neural networks with adversarial learning to synthesize LR samples.
However, learning the entire degradation process and its distribution with one large network is difficult. These methods work only for a limited range of images and often produce unpleasant artifacts, as in panel (b) of the figure below.
This paper instead proposes learning the basic operators used in degradation synthesis. Rather than one large network, it trains tiny neural networks with two or three convolution layers to capture the main characteristics of real degradation, and then incorporates them into the degradation synthesis pipeline. These neural operators are learnable and can synthesize real degradations that classical operators cannot simulate. Learnable basic operators greatly enlarge the degradation space so that it covers more real degradations.
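The n-step composition described above can be sketched on a toy 1-D signal. This is only an illustration, not the paper's pipeline: the operators `blur`, `add_noise`, `downscale2`, and `learned_op_stub` are hypothetical stand-ins, and inserting the learned operator at a random position in the chain is an assumption made here for demonstration.

```python
import random
from functools import reduce

def blur(x):
    """3-tap box blur with edge replication."""
    padded = [x[0]] + x + [x[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3 for i in range(len(x))]

def add_noise(x, rng, sigma=0.02):
    """Additive Gaussian noise, clipped to [0, 1]."""
    return [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in x]

def downscale2(x):
    """Downscale by 2 by averaging neighbouring pairs."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]

def learned_op_stub(x):
    """Hypothetical stand-in for a trained neural operator: a fixed 3-tap filter."""
    padded = [x[0]] + x + [x[-1]]
    out = [-0.1 * padded[i] + 1.2 * padded[i + 1] - 0.1 * padded[i + 2] for i in range(len(x))]
    return [min(1.0, max(0.0, v)) for v in out]

def compose(ops):
    """Return D_n o ... o D_1, where ops[0] is applied first."""
    return lambda x: reduce(lambda acc, op: op(acc), ops, x)

def synthesize_lr(hr, rng):
    """Build one random degradation chain and apply it to an HR signal."""
    ops = [blur, lambda x: add_noise(x, rng), downscale2]
    # Assumed placement: the learned operator goes at a random position in the chain.
    ops.insert(rng.randrange(len(ops) + 1), learned_op_stub)
    return compose(ops)(hr)

rng = random.Random(0)
hr = [0.0] * 8 + [1.0] * 8   # a toy step edge
lr = synthesize_lr(hr, rng)
print(len(hr), "->", len(lr))
```

Keeping each operator as a plain function makes the chain easy to extend: a learned operator only has to expose the same "signal in, signal out" interface as the classical ones.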

Input-rescaling strategy
The learnable basic operators are trained in a supervised manner on LR-HR pairs. However, obtaining LR-HR pairs for real-world LQ videos is challenging. For real LQ animation, a VSR model trained with a degradation model built only from the classical basic operators gives the preliminary results shown in the figure below. As expected, the output is unsatisfactory. The input is then resized with different scaling factors (×1 to ×0.3).
It can be observed that artifacts gradually decrease as the input resolution decreases, but downscaling too aggressively loses detail and information. For these video samples, rescaling the input by ×0.5 gives a good balance between artifact removal and detail loss. A satisfactory output can therefore be selected manually as the pseudo HR; this is called the "input-rescaling strategy".
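The sweep over scaling factors can be laid out as a small table of candidate sizes. This is a rough sketch under assumptions: the 1280×720 source size, the particular factors in `scales`, and the ×4 super-resolution factor are illustrative choices, not values stated in this article. The final choice of pseudo HR stays manual, as described above.

```python
def rescaled_size(width, height, scale):
    """Input size after rescaling by `scale`, rounded to the nearest pixel."""
    return max(1, round(width * scale)), max(1, round(height * scale))

def candidate_outputs(width, height, scales, sr_factor=4):
    """List, for each scale, the rescaled input size and the resulting VSR output size."""
    rows = []
    for s in scales:
        w, h = rescaled_size(width, height, s)
        rows.append({"scale": s, "input": (w, h), "output": (w * sr_factor, h * sr_factor)})
    return rows

scales = [1.0, 0.7, 0.5, 0.3]   # hypothetical sweep within the x1 to x0.3 range
rows = candidate_outputs(1280, 720, scales)
for row in rows:
    print(row)
```

Each row corresponds to one candidate that a person would inspect before picking the pseudo HR for that video.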

Learnable basic operators
The paper selects several representative real-world LQ animation videos to train the learnable basic operators. It first picks LQ videos on which the VSR model performs poorly at the original scale but gives better results at a suitable scaling factor, and determines the best scaling factor for each video. About 2,000 frames are taken from each LQ video and fed to the VSR network to obtain pseudo-HR samples. The LR and pseudo-HR pairs are then used to train the learnable basic operators.
Each neural operator consists of three 3 × 3 convolution layers with 64 hidden channels, with LeakyReLU activations between the convolution layers. Three learnable basic operators are trained from different LQ videos and placed in a pool. At each training iteration, one is randomly selected from the pool and incorporated into the degradation process.
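To give a feel for how small such an operator is, the sketch below counts its parameters and mimics the per-iteration draw from the pool. It assumes 3-channel (RGB) input and output and a bias term in every convolution, neither of which is stated in this article; the pool entries are hypothetical names.

```python
import random

def conv_params(c_in, c_out, k=3, bias=True):
    """Number of weights (plus biases) in a k x k convolution layer."""
    return c_in * c_out * k * k + (c_out if bias else 0)

def operator_params(io_channels=3, hidden=64, k=3):
    """Three stacked convolutions: io -> hidden -> hidden -> io.

    LeakyReLU activations add no parameters.
    """
    shapes = [(io_channels, hidden), (hidden, hidden), (hidden, io_channels)]
    return sum(conv_params(c_in, c_out, k) for c_in, c_out in shapes)

total = operator_params()
print(total)  # 40451 under the RGB and with-bias assumptions

# Per-iteration selection from the pool of three trained operators.
pool = ["lbo_a", "lbo_b", "lbo_c"]   # hypothetical names
rng = random.Random(0)
picks = [rng.choice(pool) for _ in range(5)]
print(picks)
```

At roughly forty thousand parameters under these assumptions, each operator is tiny next to a full VSR model, which is consistent with the "tiny network" framing above.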
Network architecture
A practical animation VSR network needs a good balance between performance and efficiency. Current practical models such as Real-ESRGAN and RealBasicVSR are usually very large networks that are slow and resource-hungry. This drawback becomes more serious when video super-resolution targets 4K/8K output. Practical VSR usually adopts a unidirectional recurrent structure. However, the lack of subsequent frames limits the use of temporal information. The paper therefore adds a sliding-window structure on top of the efficient unidirectional design: the recurrent block receives a sequence of frames.
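The combination of a forward-only recurrence with a sliding window can be sketched as a scheduling loop. This is a minimal illustration under assumptions: a window radius of 1 (previous, current, and next frame), boundary clamping, and a `block` that also receives the previous output and a hidden state are choices made here, and `toy_block` merely records its inputs instead of performing super-resolution.

```python
def window_indices(t, num_frames, radius=1):
    """Frame indices in the sliding window around t, clamped at the clip boundaries."""
    return [min(max(i, 0), num_frames - 1) for i in range(t - radius, t + radius + 1)]

def run_unidirectional(frames, block, radius=1):
    """Process frames front to back; recurrent state only flows forward in time."""
    prev_out, hidden = None, None
    outputs = []
    for t in range(len(frames)):
        window = [frames[i] for i in window_indices(t, len(frames), radius)]
        prev_out, hidden = block(window, prev_out, hidden)
        outputs.append(prev_out)
    return outputs

def toy_block(window, prev_out, hidden):
    """Toy recurrent block: returns what it saw and a step counter as hidden state."""
    step = 0 if hidden is None else hidden + 1
    return (tuple(window), prev_out is not None), step

outs = run_unidirectional(["f0", "f1", "f2", "f3"], toy_block)
for o in outs:
    print(o)
```

The window gives each step a short look-ahead of `radius` frames, while the recurrent state itself never depends on a backward pass over the clip.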

As shown in panel (b) of the figure above, the recurrent block uses a multi-scale design with 10 residual blocks. Three scales are used, ×1, ×0.5, and ×0.25, which are assigned 5, 3, and 2 blocks respectively. AnimeSR does not use optical flow, because the authors found empirically that it brings no significant visual improvement. Computing optical flow also slows down training and inference.
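A back-of-the-envelope estimate shows why spreading the blocks over three scales saves computation. The sketch assumes that a residual block's cost is proportional to the number of pixels it processes (scale squared) and that all blocks have the same channel width; it ignores the cost of the down- and upsampling layers, so the figure is indicative only.

```python
def relative_cost(blocks_per_scale):
    """Cost in units of one full-resolution block, assuming cost is proportional to scale**2."""
    return sum(n * scale ** 2 for scale, n in blocks_per_scale.items())

allocation = {1.0: 5, 0.5: 3, 0.25: 2}   # blocks per scale, as described above
multi = relative_cost(allocation)        # 5*1 + 3*0.25 + 2*0.0625
single = relative_cost({1.0: 10})        # all 10 blocks at full resolution
ratio = multi / single
print(multi, single, ratio)
```

Under these assumptions the 5/3/2 split costs about 59% of running all ten residual blocks at full resolution.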
03
Experiments
Ablation study
Ablation experiments on the dataset, degradation model, multi-scale structure, and learnable basic operators (LBO)

Quantitative assessment
The authors argue that NR-IQA metrics do not always agree with visual quality, especially at finer scales, and that MANIQA matches perceived visual quality better than NIQE.

Qualitative assessment

04
Summary
This paper comes from Xintao's team. The main contributions of AnimeSR are: learning degradation operators from real LQ animation to better capture the distribution of real degradation; building a large-scale HQ animation video dataset, AVC, for training and evaluating animation VSR; an effective "input-rescaling strategy" that makes it possible to learn these neural operators; and an efficient multi-scale network structure with which AnimeSR achieves SOTA. On the paper as a whole, the writer has the following thoughts:
The input-rescaling strategy is somewhat subjective; a more objective selection scheme could be studied in future work. Second, compared with the inputs of earlier unidirectional VSR, adding the output to the sliding window feels somewhat forced as an innovation; and with the output also fed back as input, is the model still strictly 'unidirectional'?
When building the training set, optical flow is used to filter out static scenes, yet for the VSR network the paper says that optical flow brings little benefit, without experimental or theoretical support in that part. Is the poor result due to optical flow itself, in which case would other alignment methods help? Or is it caused by the narrow range of motion in the dataset? The writer believes a more detailed justification is needed.
The paper relies mainly on MANIQA as its quantitative metric; could a few more metrics such as NRQM, PI, and BRISQUE be added? In addition, does synthetic data also count as real-world data?

