Paper reproduction: the super-resolution network FSRCNN
2022-07-28 01:38:00 【GIS and climate】
SRCNN performs image super-resolution with three convolutional layers, which carry out feature extraction, feature mapping, and image reconstruction.
However, its input must first be interpolated up to the target size, which limits its speed and makes it hard to apply in real-time settings. Dong et al. (the first author of SRCNN) therefore improved SRCNN further, producing FSRCNN, where the F stands for Fast.
Abstract of the original paper
As a successful deep model applied in image super-resolution (SR), the Super-Resolution Convolutional Neural Network (SRCNN) has demonstrated superior performance to the previous hand-crafted models either in speed and restoration quality. However, the high computational cost still hinders it from practical usage that demands real-time performance (24 fps). In this paper, we aim at accelerating the current SRCNN, and propose a compact hourglass-shape CNN structure for faster and better SR. We re-design the SRCNN structure mainly in three aspects. First, we introduce a deconvolution layer at the end of the network, then the mapping is learned directly from the original low-resolution image (without interpolation) to the high-resolution one. Second, we reformulate the mapping layer by shrinking the input feature dimension before mapping and expanding back afterwards. Third, we adopt smaller filter sizes but more mapping layers. The proposed model achieves a speed up of more than 40 times with even superior restoration quality. Further, we present the parameter settings that can achieve real-time performance on a generic CPU while still maintaining good performance. A corresponding transfer strategy is also proposed for fast training and testing across different upscaling factors.
Its new network architecture is:

Compared with the original SRCNN, its improvements mainly include:
- A deconvolution layer is added at the end of the network for upsampling, so the input image no longer needs to be interpolated first: the LR image goes in directly and the HR image comes out (a rough sketch of this difference follows after this list).
- In the middle feature-mapping stage, the feature dimension is first shrunk and then expanded back afterwards.
- Smaller convolution kernels and more feature-mapping layers are used.
- The activation function is PReLU instead of ReLU.
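The sketch below (my own illustration, not code from the paper or its release) contrasts the two ways of handling the input: SRCNN-style bicubic pre-interpolation versus FSRCNN-style learned upsampling with a single deconvolution at the end. The sizes, scale factor, and layer settings here are arbitrary choices for the illustration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    lr = torch.randn(1, 1, 32, 32)   # a random single-channel "LR" patch
    scale = 3

    # SRCNN-style: interpolate to the target size first, then convolve the large image
    srcnn_input = F.interpolate(lr, scale_factor=scale, mode='bicubic', align_corners=False)

    # FSRCNN-style: keep the small image and learn the upsampling with one deconvolution
    deconv = nn.ConvTranspose2d(1, 1, kernel_size=9, stride=scale, padding=3)
    fsrcnn_output = deconv(lr)

    print(srcnn_input.shape, fsrcnn_output.shape)   # both are (1, 1, 96, 96)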
The model code
As mentioned before, writing code with PyTorch is like stacking building blocks: look at the network structure and write it out piece by piece. The following is clearer (from reference link 【2】):

Its structure is like an hourglass: symmetrical overall, thick at both ends, and thin in the middle. As the paper puts it:
Interestingly, the new structure looks like an hourglass, which is symmetrical on the whole, thick at the ends, and thin in the middle.
import torch.nn as nn

class FSRCNN(nn.Module):
    def __init__(self, inchannels):
        super(FSRCNN, self).__init__()
        # Feature extraction: a 5x5 conv applied directly to the LR image
        self.features = nn.Sequential(
            nn.Conv2d(in_channels=inchannels, out_channels=64, kernel_size=5, stride=1, padding=2, padding_mode='replicate'),
            nn.PReLU()
        )
        # Shrinking: a 1x1 conv compresses the feature dimension from 64 to 32
        self.shrinking = nn.Sequential(
            nn.Conv2d(in_channels=64, out_channels=32, kernel_size=1, stride=1, padding=0, padding_mode='replicate'),
            nn.PReLU()
        )
        # Non-linear mapping: four 3x3 convs followed by a single PReLU
        self.mapping = nn.Sequential(
            nn.Conv2d(in_channels=32, out_channels=32, kernel_size=3, stride=1, padding=1, padding_mode='replicate'),
            nn.Conv2d(in_channels=32, out_channels=32, kernel_size=3, stride=1, padding=1, padding_mode='replicate'),
            nn.Conv2d(in_channels=32, out_channels=32, kernel_size=3, stride=1, padding=1, padding_mode='replicate'),
            nn.Conv2d(in_channels=32, out_channels=32, kernel_size=3, stride=1, padding=1, padding_mode='replicate'),
            nn.PReLU()
        )
        # Expanding: a 1x1 conv restores the feature dimension from 32 back to 64
        self.expanding = nn.Sequential(
            nn.Conv2d(in_channels=32, out_channels=64, kernel_size=1, stride=1, padding=0),
            nn.PReLU()
        )
        # Deconvolution: upsamples the features back to an HR image.
        # ConvTranspose2d only supports zero padding, so no padding_mode is passed here.
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(in_channels=64, out_channels=inchannels, kernel_size=9, stride=3, padding=4)
        )

    def forward(self, x):
        x = self.features(x)
        x = self.shrinking(x)
        x = self.mapping(x)
        x = self.expanding(x)
        x = self.deconv(x)
        return x
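A quick sanity check of the class above (my own test, not part of the original post; it assumes the FSRCNN class just defined, and the 32x32 input size is an arbitrary choice):

    import torch

    model = FSRCNN(inchannels=1)
    x = torch.randn(1, 1, 32, 32)   # one single-channel 32x32 LR patch
    y = model(x)
    print(y.shape)   # torch.Size([1, 1, 94, 94]): (32 - 1) * 3 - 2 * 4 + 9 = 94, so not exactly 3x here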
Code interpretation
Overall, FSRCNN consists of the five parts above:
- features extracts features from the original LR image; for images this is generally done with convolutions. Note the sensitive parameter here (the paper calls it a sensitive variable; I understand it as a parameter you can change to suit your own needs), namely the number of filters, i.e. the number of output channels of this convolution layer.
- shrinking, according to the original authors, exists to reduce the amount of computation. Personally I see it as compressing the feature dimension: in my code above, the 64-dimensional features are compressed to 32 dimensions, which naturally reduces the computation (see the rough parameter count after this list).
- mapping is the non-linear mapping proper: it applies further convolutions to transform the features non-linearly. According to the authors, this part matters most for the model's results, as the paper discusses later.
- expanding restores the compressed feature dimension.
- deconv (deconvolution) turns the features processed above back into the HR image.
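A back-of-the-envelope count (my own numbers, not a figure from the paper) of why shrinking reduces computation: compare the parameter count of the four mapping convolutions with and without the 64 -> 32 shrink.

    def conv_params(cin, cout, k):
        return cin * cout * k * k + cout      # weights + biases of one conv layer

    with_shrink = 4 * conv_params(32, 32, 3)      # four 3x3 convs on 32 channels
    without_shrink = 4 * conv_params(64, 64, 3)   # four 3x3 convs on 64 channels
    print(with_shrink, without_shrink)            # 36992 vs 147712, roughly 4x fewer parameters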
A small summary
Several parameters in the model can be changed to suit your own research, including:
- Number of feature maps: the number of output channels of the first convolution layer.
- Compressed feature dimension: how many dimensions the features above are compressed down to.
- Number of non-linear mapping layers: how many convolution layers the mapping module contains (note that this does not change the channel count).
- Deconvolution factor: how many times you want to enlarge the image.
A configurable version of the model with these settings exposed is sketched after this list.
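The following is a parameterized sketch of the same network (my own refactor, not the original code or the authors' implementation), exposing the settings above as constructor arguments: d is the number of feature maps, s the compressed dimension, m the number of mapping layers, and scale the upscaling factor. The output_padding value is chosen so the output is exactly scale times the input.

    import torch.nn as nn

    class FSRCNNConfigurable(nn.Module):
        def __init__(self, inchannels=1, d=64, s=32, m=4, scale=3):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(inchannels, d, kernel_size=5, padding=2, padding_mode='replicate'),
                nn.PReLU())
            self.shrinking = nn.Sequential(nn.Conv2d(d, s, kernel_size=1), nn.PReLU())
            layers = [nn.Conv2d(s, s, kernel_size=3, padding=1, padding_mode='replicate')
                      for _ in range(m)]
            layers.append(nn.PReLU())            # a single PReLU after the m conv layers
            self.mapping = nn.Sequential(*layers)
            self.expanding = nn.Sequential(nn.Conv2d(s, d, kernel_size=1), nn.PReLU())
            # output_padding = scale - 1 makes the output exactly `scale` times the input size
            self.deconv = nn.ConvTranspose2d(d, inchannels, kernel_size=9, stride=scale,
                                             padding=4, output_padding=scale - 1)

        def forward(self, x):
            x = self.features(x)
            x = self.shrinking(x)
            x = self.mapping(x)
            x = self.expanding(x)
            return self.deconv(x)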
The first four modules of the whole model all serve to extract features from the low-resolution (LR) image, and the last module upsamples the image. You can modify that last module for whatever upscaling factor you need, which is why the authors say the model lends itself well to transfer learning. The original authors state that the padding scheme has a negligible effect on the final result, so they simply zero-pad according to the kernel size; I think this step can still be adjusted to fit your own research, especially if you care about exact pixel values. To train faster, the original authors also used two-stage training: first train on the 91-image dataset, and once it converges (saturates?) add the General-100 dataset for fine-tuning. The learning rate is 1e-3 for the convolution layers and 1e-4 for the final deconvolution layer. Several of the parameters (sensitive variables) used in the code above differ from the original paper. A minimal sketch of that per-layer learning-rate setup follows below.
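A sketch of the two-learning-rate setup using optimizer parameter groups (my own code; it assumes the FSRCNN class from the code section above, and the choice of SGD with momentum is mine rather than taken from the paper):

    import torch.optim as optim

    model = FSRCNN(inchannels=1)
    deconv_group = [p for name, p in model.named_parameters() if name.startswith('deconv')]
    conv_group = [p for name, p in model.named_parameters() if not name.startswith('deconv')]
    optimizer = optim.SGD([
        {'params': conv_group, 'lr': 1e-3},    # convolution layers
        {'params': deconv_group, 'lr': 1e-4},  # final deconvolution layer
    ], momentum=0.9)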
Reflection
What exactly does the so-called feature extraction extract? And what does the so-called non-linear mapping actually do?
Doubts and notes
1. In the structure described in the paper, the non-linear mapping layer should be m conv layers followed by a single PReLU activation, but many implementations online put a PReLU after every conv layer. This difference needs to be tested...
2. The last parameters of the deconvolution layer should be calculated carefully according to the desired output scale (see the worked check below).
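For reference, here is how that calculation works out for the settings used in the code above (a worked check based on the standard ConvTranspose2d output-size formula with dilation = 1; the input size 32 is just an example):

    # out = (in - 1) * stride - 2 * padding + kernel_size + output_padding
    in_size = 32
    for output_padding in (0, 2):
        out = (in_size - 1) * 3 - 2 * 4 + 9 + output_padding
        print(output_padding, out)   # 0 -> 94, 2 -> 96 (exactly 3 * 32)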

References
【1】Dong C, Loy C C, Tang X. Accelerating the Super-Resolution Convolutional Neural Network[C]//Computer Vision – ECCV 2016. Springer International Publishing, 2016: 391-407. DOI: 10.1007/978-3-319-46475-6_25.
【2】https://towardsdatascience.com/review-fsrcnn-super-resolution-80ca2ee14da4
【3】https://github.com/Lornatang/FSRCNN-PyTorch