
SimpleSR: Best-Buddy GANs for Highly Detailed Image Super-Resolution

2022-06-13 04:59:00 Come on, Dangdang

What problem does it solve?

  • Problem addressed: how to generate realistic super-resolved images.
  • More specifically:
  • Many existing methods achieve good scores on metrics such as PSNR and SSIM, but most of them tend to produce blurry visual results.
  • To improve the perceptual quality of restored images, some methods use adversarial learning and perceptual losses, but the over-smoothing caused by one-to-one MSE/MAE losses has still not been fully resolved.
  • Moreover, adversarial training can itself introduce visual artifacts.

  • Beby-GAN, a method for high-quality image super-resolution, is proposed.
  • The network treats smooth and well-textured regions differently, and only the latter receive adversarial training. This separation encourages the network to focus on detail-rich regions and avoids generating unnecessary textures on flat areas (such as sky and buildings).
  • A one-to-many best-buddy loss is proposed, which helps produce richer and more plausible textures.

The framework is built on a generative adversarial network (GAN): the generator reconstructs high-resolution images, while the discriminator is trained to distinguish restored results from real natural images.

A pre-trained RRDB model is used as the generator because of its strong learning capacity.

RRDB comes from ESRGAN:

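For readers who want a concrete picture, the RRDB generator block can be sketched as follows. This is a minimal PyTorch sketch following ESRGAN's published design; the layer widths `nf`/`gc` and the 0.2 residual scaling are ESRGAN defaults, not values stated in this article.

```python
# Minimal sketch of the RRDB (Residual-in-Residual Dense Block) from ESRGAN,
# assuming PyTorch. nf = feature width, gc = dense growth channels.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Five conv layers with dense connections (ESRGAN's residual dense block)."""
    def __init__(self, nf=64, gc=32):
        super().__init__()
        # first four convs emit gc channels; the last fuses back to nf channels
        self.convs = nn.ModuleList(
            nn.Conv2d(nf + i * gc, gc if i < 4 else nf, 3, 1, 1)
            for i in range(5)
        )
        self.lrelu = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        feats = [x]
        for i, conv in enumerate(self.convs):
            out = conv(torch.cat(feats, dim=1))   # dense connection
            if i < 4:
                feats.append(self.lrelu(out))
        return x + 0.2 * out                      # residual scaling

class RRDB(nn.Module):
    """Residual-in-residual: three dense blocks plus an outer skip connection."""
    def __init__(self, nf=64, gc=32):
        super().__init__()
        self.blocks = nn.Sequential(*(DenseBlock(nf, gc) for _ in range(3)))

    def forward(self, x):
        return x + 0.2 * self.blocks(x)
```

In ESRGAN the full generator stacks 23 such blocks between a shallow feature extractor and the upsampling layers; the block shown here only covers the repeating unit.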

Why propose the Best-Buddy Loss?

  • Reason one: in super-resolution tasks, a single LR patch is inherently associated with multiple natural HR solutions:

  • Reason two: existing methods usually rely on MSE/MAE losses during training to learn a fixed single-LR-to-single-HR mapping. This ignores the inherent uncertainty of SISR, so the reconstructed HR image may lack certain high-frequency structures.
  • High-frequency structures here can be understood as the fine details of the image.

  • Hence the Best-Buddy Loss is proposed:
  • a one-to-many best-buddy loss, which provides reliable and more flexible supervision;
  • The key idea: allow an estimated HR patch to be supervised by different targets in different training iterations.

How are candidate patches found?

  • First, the ground-truth (GT) HR image I^HR is downsampled:
  • where S(I, r) is a bicubic downsampling operator, yielding a 3-level image pyramid (including the original GT HR image);
  • Then the estimated HR image and the corresponding GT image pyramid are unfolded into patches; the GT patches form the supervision candidate database G.
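The pyramid-and-unfold procedure above can be sketched as follows, assuming NumPy. The paper uses bicubic downsampling for S(I, r); here a simple 2× average pooling stands in for it to keep the example dependency-free, and the patch size `k` is illustrative.

```python
# Sketch: build the supervision candidate database G from a 3-level pyramid
# of the GT HR image. Average pooling is a stand-in for bicubic S(I, 2).
import numpy as np

def downsample2x(img):
    """Stand-in for bicubic S(I, 2): non-overlapping 2x2 average pooling."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def to_patches(img, k):
    """Unfold a 2-D image into non-overlapping k x k patches."""
    h, w = img.shape[0] // k * k, img.shape[1] // k * k
    img = img[:h, :w]
    return (img.reshape(h // k, k, w // k, k)
               .swapaxes(1, 2)
               .reshape(-1, k, k))

def build_candidate_db(gt_hr, k=4, levels=3):
    """3-level pyramid (including the original GT), unfolded into patches."""
    pyramid, img = [], gt_hr
    for _ in range(levels):
        pyramid.append(img)
        img = downsample2x(img)
    return np.concatenate([to_patches(p, k) for p in pyramid], axis=0)

gt = np.random.rand(32, 32)
G = build_candidate_db(gt, k=4)   # candidate patches from all pyramid levels
```

For a 32×32 GT image with k = 4, the database holds 64 + 16 + 4 = 84 candidate patches across the three levels.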

How is the best-buddy patch found?

  • For each estimated HR patch, the corresponding supervision patch (the best buddy) in the current iteration must satisfy two constraints:
  • The best buddy needs to be close to the predefined ground-truth patch g_i in HR space (the first term in the equation). Thanks to the multi-scale self-similarity common in natural images, HR patches close to the ground truth g_i can be found.
  • To make optimization easier, the best buddy should also be close to the estimated patch (the second term in the equation). The estimate can be regarded as a reasonable prediction because the generator is well initialized.

  • where α ≥ 0 and β ≥ 0 are scaling parameters.
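The original equation was lost in extraction; based on the two constraints just described (closeness to the GT patch g_i, weighted by α, and closeness to the estimate ĝ_i, weighted by β), the selection rule can be reconstructed roughly as follows — treat this as a sketch and check the original paper for the exact form:

```latex
\tilde{g}_i \;=\; \underset{g \in G}{\arg\min}\;\;
  \alpha \,\lVert g - g_i \rVert_2^2
  \;+\; \beta \,\lVert g - \hat{g}_i \rVert_2^2
```

Here G is the supervision candidate database, g_i the GT patch, ĝ_i the estimated patch, and g̃_i the selected best buddy.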

Best-Buddy Loss

  • The best-buddy loss for a patch is:
  • When β is much smaller than α, the selected buddy is simply the ground-truth patch itself, so the best-buddy loss degenerates to the MAE loss.
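The selection-plus-supervision step can be sketched end to end, assuming NumPy. For each estimated patch, the candidate minimising α‖g − g_i‖² + β‖g − ĝ_i‖² is chosen as the buddy, and the loss is the MAE to that buddy; function and variable names are illustrative.

```python
# Sketch of the one-to-many best-buddy loss over a batch of patches.
import numpy as np

def best_buddy_loss(g_hat, g_gt, candidates, alpha=1.0, beta=1.0):
    """g_hat, g_gt: (N, k, k) estimated / GT patches.
    candidates: (M, k, k) supervision candidate database G."""
    flat_c = candidates.reshape(len(candidates), -1)
    loss = 0.0
    for est, gt in zip(g_hat, g_gt):
        e, t = est.ravel(), gt.ravel()
        # score each candidate against the two constraints
        score = (alpha * ((flat_c - t) ** 2).sum(axis=1)
                 + beta * ((flat_c - e) ** 2).sum(axis=1))
        buddy = flat_c[score.argmin()]        # best buddy for this patch
        loss += np.abs(e - buddy).mean()      # MAE supervision against buddy
    return loss / len(g_hat)
```

With β = 0 and the GT patches contained in G, the selected buddy is exactly the GT patch, recovering plain MAE, which matches the degeneracy noted above.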

 

Back-projection constraint

  • A back-projection constraint is applied to the generated estimate:
  • The downscaled super-resolved image must remain faithful at the lower resolution. An HR-to-LR operation (bicubic downsampling in this paper) is introduced to ensure that the projection of the estimated HR image into LR space still agrees with the original LR input.

  • where s is the reduction factor.
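A minimal sketch of this constraint, assuming NumPy: the estimated HR image is projected back to LR space and compared with the original LR input under an L1 distance. Average pooling stands in for the paper's bicubic downsampling; in a real training loop a differentiable bicubic resize would be used instead.

```python
# Sketch of the back-projection loss: downscale the estimate and compare
# it with the LR input.
import numpy as np

def downsample(img, s):
    """Stand-in for bicubic S(I, s): non-overlapping s x s average pooling."""
    h, w = img.shape[0] // s * s, img.shape[1] // s * s
    img = img[:h, :w]
    return img.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def backprojection_loss(hr_est, lr_input, s=4):
    """L1 distance between the re-projected estimate and the original LR."""
    return np.abs(downsample(hr_est, s) - lr_input).mean()
```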

Comparison with (w/) and without (w/o) the back-projection (BP) loss: visualization of the L2 error heat maps between the estimated results and the ground truth.

Note that this back-projection loss plays a crucial role in maintaining content and color consistency.

Region-aware adversarial learning

Previous GAN-based methods sometimes produce poor textures, especially in flat regions.

Therefore, texture-rich regions and smooth regions are distinguished based on local pixel statistics, and only the textured content is fed to the discriminator, because the smooth regions of an image can be recovered well without a GAN.

The strategy is to first unfold the real HR image I^HR into blocks of size k × k, then compute the standard deviation (std) of each block, giving a binary mask:

where δ is a predefined threshold and (i, j) is the patch location.

Highly textured regions are marked 1; flat regions are marked 0.

Then both the estimated result and the ground truth I^HR are multiplied by the same mask M before being processed by the discriminator.
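The masking step can be sketched as follows, assuming NumPy. The HR image is unfolded into k × k blocks; blocks whose std exceeds the threshold δ are marked 1 (textured), the rest 0 (flat), and the mask is broadcast back to pixel resolution so it can gate both inputs to the discriminator. The values of `k` and `delta` are illustrative, not the paper's.

```python
# Sketch of the region-aware binary mask from local pixel statistics.
import numpy as np

def region_mask(hr, k=4, delta=0.05):
    """Binary mask: 1 for textured k x k blocks (std > delta), 0 for flat."""
    h, w = hr.shape[0] // k * k, hr.shape[1] // k * k
    blocks = (hr[:h, :w].reshape(h // k, k, w // k, k)
                        .swapaxes(1, 2))        # (H/k, W/k, k, k)
    std = blocks.std(axis=(2, 3))               # per-block standard deviation
    m = (std > delta).astype(hr.dtype)          # patch-level binary mask
    # expand the mask back to pixel resolution so it can multiply the images
    return np.kron(m, np.ones((k, k)))

# gate both inputs before the discriminator:
#   d_fake, d_real = hr_est * M, hr_gt * M   where M = region_mask(hr_gt)
```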

Although more elaborate or computationally heavier strategies could be used, the authors demonstrate the effectiveness of region-aware adversarial learning through ablation experiments.

 

Without region-aware learning, there are unpleasant artifacts near the people and railings in the results (see "w/o RA"). After separating texture-rich regions from flat regions, the problem is alleviated, as shown in the third column (see "Ours").

This separation lets the network know "where" to apply adversarial training, which brings two main advantages. On the one hand, training is easier because the network only needs to attend to high-frequency details. On the other hand, since smooth regions do not pass through the GAN, the network produces fewer unnatural textures.

This module guides the model to generate realistic fine details in textured regions.

Loss function design:

Overall loss function:

Perceptual loss:

Adversarial loss:
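The overall equation was lost in extraction; structurally, the generator objective combines the four terms introduced above. Treat the following as a sketch of that structure (the λ weights are hyper-parameters whose values are in the paper), not the paper's exact equation:

```latex
\mathcal{L}_G \;=\; \mathcal{L}_{bb}
  \;+\; \lambda_{bp}\,\mathcal{L}_{bp}
  \;+\; \lambda_{per}\,\mathcal{L}_{per}
  \;+\; \lambda_{adv}\,\mathcal{L}_{adv}
```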

Experiments and results

×4 results on BSDS100; see for yourself:

 

The formulas and figures (except the last two) are from the original paper.

Original site

Copyright notice
This article was written by [Come on, Dangdang]; please include the original link when reposting. Thanks!
https://yzsam.com/2022/164/202206130454436182.html