Image Super-Resolution Using Deep Convolutional Networks (SRCNN): Interpretation and Implementation

2022-07-06 03:27:00 leon. shadow

Image Super-Resolution Using Deep Convolutional Networks (SRCNN)

1. Overview

Network structure

[Figure: SRCNN three-layer network structure]
The SRCNN architecture is relatively simple: a three-layer convolutional network with ReLU as the activation function.

  • First convolution: extracts image features (64 kernels of size 9×9).
  • Second convolution: applies a nonlinear mapping to the features extracted by the first layer (32 kernels; kernel size 1 in the original paper, while the code below uses 5×5).
  • Third convolution: reconstructs the mapped features to generate the high-resolution image (1 kernel of size 5×5).
Evaluation metric

PSNR (peak signal-to-noise ratio): the higher the PSNR, the better the reconstruction.

import math

import numpy as np


def psnr(img1, img2):
    """Peak signal-to-noise ratio between two 8-bit images, in dB."""
    # Cast to float first: subtracting uint8 arrays directly would overflow.
    mse = np.mean((img1.astype(np.float64) - img2.astype(np.float64)) ** 2)
    if mse == 0:
        return 100  # identical images; a finite sentinel instead of infinity
    PIXEL_MAX = 255.0
    return 20 * math.log10(PIXEL_MAX / math.sqrt(mse))
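A quick sanity check of the scale of the values (illustrative arrays, not from the original post):

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
# Add small uniform noise in [-5, 5] and clip back to the valid range.
noisy = np.clip(img.astype(np.int16) + np.random.randint(-5, 6, img.shape), 0, 255).astype(np.uint8)

print(psnr(img, img))    # 100 (identical images)
print(psnr(img, noisy))  # roughly 38 dB for this noise level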
Why train only on the Y channel of YCbCr?

The image is converted to the YCbCr color space, but the network is trained only on the luminance (Y) channel. At inference time, the network's output is merged with the interpolated Cb and Cr channels to produce the final color image. This choice is made because we are not interested in the color information (stored in the Cb and Cr channels) but only in the luminance (Y channel); the underlying reason is that human vision is far more sensitive to changes in luminance than to changes in chrominance.
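A minimal sketch of this inference pipeline using Pillow; `model` is a trained SRCNN instance and `scale` the upscaling factor (both names are assumptions, not from the original post):

import torch
import numpy as np
from PIL import Image

def super_resolve(model, lr_image, scale=3):
    # Upscale with bicubic first: SRCNN refines an interpolated image.
    hr_size = (lr_image.width * scale, lr_image.height * scale)
    upscaled = lr_image.resize(hr_size, Image.BICUBIC).convert('YCbCr')
    y, cb, cr = upscaled.split()

    # Run only the Y channel through the network, normalized to [0, 1].
    y_in = torch.from_numpy(np.asarray(y, dtype=np.float32) / 255.0)[None, None]
    with torch.no_grad():
        y_out = model(y_in).clamp(0.0, 1.0)
    y_sr = Image.fromarray((y_out[0, 0].numpy() * 255.0).astype(np.uint8))

    # Merge the restored Y with the interpolated Cb/Cr channels.
    return Image.merge('YCbCr', (y_sr, cb, cr)).convert('RGB')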

Loss function

The loss function is the mean squared error (MSE) between the network output and the ground-truth high-resolution image.
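In PyTorch this is simply nn.MSELoss. A minimal training step might look like the sketch below; `model` and `dataloader` are assumed to exist, and Adam is used here only for brevity (the original paper trains with SGD):

import torch
from torch import nn

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for lr_batch, hr_batch in dataloader:  # bicubic-upscaled inputs and HR targets
    loss = criterion(model(lr_batch), hr_batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()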

What is the role of a 1×1 convolution? (see the sketch below)
  1. Changes the channel dimension (increasing or decreasing it)
  2. Enables cross-channel interaction and information integration
  3. Reduces computation
  4. Acts as the equivalent of a fully connected layer at each spatial position
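A small demonstration of points 1 and 4, with shapes chosen to match SRCNN's intermediate layers (this sketch is not part of the original post):

import torch
from torch import nn

x = torch.randn(1, 64, 32, 32)  # feature maps: 64 channels, 32x32 spatial size

# Point 1: a 1x1 convolution changes the channel dimension, here 64 -> 32.
conv1x1 = nn.Conv2d(64, 32, kernel_size=1)
print(conv1x1(x).shape)  # torch.Size([1, 32, 32, 32])

# Point 4: per pixel it is equivalent to a fully connected layer over channels.
fc = nn.Linear(64, 32)
fc.weight.data = conv1x1.weight.data.view(32, 64)
fc.bias.data = conv1x1.bias.data
same = fc(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
print(torch.allclose(conv1x1(x), same, atol=1e-6))  # True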

2. Code

model.py
from torch import nn


class SRCNN(nn.Module):
    def __init__(self, num_channels=1):
        super(SRCNN, self).__init__()
        # Feature extraction: 64 filters of size 9x9.
        self.conv1 = nn.Conv2d(num_channels, 64, kernel_size=9, padding=9 // 2)
        # Nonlinear mapping: 32 filters of size 5x5 (the paper's base model
        # uses 1x1 here; 9-5-5 is the larger variant it also evaluates).
        self.conv2 = nn.Conv2d(64, 32, kernel_size=5, padding=5 // 2)
        # Reconstruction: back to the input channel count with 5x5 filters.
        self.conv3 = nn.Conv2d(32, num_channels, kernel_size=5, padding=5 // 2)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.conv1(x))
        x = self.relu(self.conv2(x))
        # No activation on the last layer: the output is the restored image.
        x = self.conv3(x)
        return x

To avoid boundary effects, the original paper does not use padding; instead, the loss is computed only over the pixels in the central region of the output. Without padding, the three SRCNN convolutions shrink the image: with the 9-5-5 kernels above, each spatial dimension loses (9-1) + (5-1) + (5-1) = 16 pixels.
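A quick check of that shrinkage, using the same three layers without padding (sketch, not from the original post):

import torch
from torch import nn

no_pad = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=9), nn.ReLU(inplace=True),
    nn.Conv2d(64, 32, kernel_size=5), nn.ReLU(inplace=True),
    nn.Conv2d(32, 1, kernel_size=5),
)
x = torch.randn(1, 1, 33, 33)  # 33x33 sub-image, the patch size used in the paper
print(no_pad(x).shape)         # torch.Size([1, 1, 17, 17]): 16 pixels lost per dimension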

3. Experiments

Experiment comparing the effect of filter size and the number of filters on restoration quality

Conclusion: the more filters, i.e., the higher the dimensionality of the feature vector, the better the results, but speed suffers, so a trade-off is needed; similarly, larger kernels in the three convolutional layers give slightly better results, again at the cost of speed.

Experiment comparing the effect of network depth (number of layers) on restoration quality

Conclusion: a deeper network does not give better results here; the opposite is true. The authors offer an explanation: because SRCNN has no pooling or fully connected layers, it is very sensitive to the initial parameters and the learning rate. Training therefore converges with difficulty; even when it converges, it may stop at a bad local minimum, and even after sufficient training time the learned filters do not show enough diversity.

Experiment on the effect of color channels on restoration quality

Conclusion: joint training on the RGB channels works best; in the YCbCr space, the Cb and Cr channels contribute essentially nothing to performance, and training on the Y channel alone works better.

4. Conclusion

SRCNN proposes a lightweight end-to-end network for the super-resolution problem, and at the time it delivered both better quality and faster speed than traditional methods. In addition, the authors show that sparse-coding-based (SC) super-resolution can be interpreted as a form of convolutional neural network; both points make the paper worth reading.

5. Paper Link

Paper: https://arxiv.org/abs/1501.00092

Copyright notice: this article was written by [leon. shadow]; please include a link to the original when reposting.
Original source: https://yzsam.com/2022/187/202207060318353601.html