Paper reproduction: the super-resolution network FSRCNN
2022-07-28 01:38:00 【GIS and climate】
SRCNN performs image super-resolution with three convolution layers: feature extraction, feature mapping, and image reconstruction.
However, its input must first be interpolated up to the target size, which makes it slow and hard to apply in real-time settings. Dong et al. (the first author of SRCNN) therefore improved SRCNN again, producing FSRCNN, where the F stands for Fast.
Abstract of the original paper
As a successful deep model applied in image super-resolution (SR), the Super-Resolution Convolutional Neural Network (SRCNN) has demonstrated superior performance to the previous hand-crafted models either in speed and restoration quality. However, the high computational cost still hinders it from practical usage that demands real-time performance (24 fps). In this paper, we aim at accelerating the current SRCNN, and propose a compact hourglass-shape CNN structure for faster and better SR. We re-design the SRCNN structure mainly in three aspects. First, we introduce a deconvolution layer at the end of the network, then the mapping is learned directly from the original low-resolution image (without interpolation) to the high-resolution one. Second, we reformulate the mapping layer by shrinking the input feature dimension before mapping and expanding back afterwards. Third, we adopt smaller filter sizes but more mapping layers. The proposed model achieves a speed up of more than 40 times with even superior restoration quality. Further, we present the parameter settings that can achieve real-time performance on a generic CPU while still maintaining good performance. A corresponding transfer strategy is also proposed for fast training and testing across different upscaling factors.
Its new network architecture is:

Compared with the original SRCNN, the main improvements are:
- A deconvolution layer is added at the end of the network for upsampling, so the input no longer needs to be interpolated first: the network takes the LR image directly and outputs the HR image (see the small contrast sketch after this list).
- The middle feature-mapping layer first shrinks the feature dimension, then expands it back afterwards.
- Smaller convolution kernels and more feature-mapping layers are used.
- The activation function is PReLU instead of ReLU.
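To make the first point concrete, here is a small contrast (my own illustration, not code from the paper): SRCNN-style processing must first interpolate the LR image up to the target size, so every convolution runs on the large image, while FSRCNN consumes the small LR image directly and only the final deconvolution enlarges it.

import torch
import torch.nn.functional as F

lr = torch.randn(1, 1, 32, 32)   # a 32x32 single-channel LR patch

# SRCNN-style: bicubic-interpolate to the target size first, so all
# subsequent convolutions run on the 96x96 image
srcnn_input = F.interpolate(lr, scale_factor=3, mode='bicubic', align_corners=False)
print(srcnn_input.shape)         # torch.Size([1, 1, 96, 96])

# FSRCNN-style: the network takes the 32x32 LR image as-is, so its
# convolutions touch roughly 9x fewer pixels; the 3x enlargement happens
# only in the final deconvolution layer
fsrcnn_input = lr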
The model code
As mentioned before, writing code with PyTorch is like stacking building blocks: just look at the network structure and write it out. The following makes this clearer (from reference link 【2】):

Its structure is like an hourglass: symmetrical on the whole, thick at both ends, and thin in the middle. As the paper puts it:
Interestingly, the new structure looks like an hourglass, which is symmetrical on the whole, thick at the ends, and thin in the middle.
import torch
import torch.nn as nn


class FSRCNN(nn.Module):
    def __init__(self, inchannels):
        super(FSRCNN, self).__init__()
        # Feature extraction: a 5x5 convolution applied directly to the LR input
        self.features = nn.Sequential(
            nn.Conv2d(in_channels=inchannels, out_channels=64, kernel_size=5, stride=1, padding=2, padding_mode='replicate'),
            nn.PReLU()
        )
        # Shrinking: a 1x1 convolution compresses 64 feature channels to 32
        self.shrinking = nn.Sequential(
            nn.Conv2d(in_channels=64, out_channels=32, kernel_size=1, stride=1, padding=0, padding_mode='replicate'),
            nn.PReLU()
        )
        # Non-linear mapping: m = 4 stacked 3x3 convolutions in the shrunken
        # 32-dimensional space, followed by a single PReLU (see the doubts below)
        self.mapping = nn.Sequential(
            nn.Conv2d(in_channels=32, out_channels=32, kernel_size=3, stride=1, padding=1, padding_mode='replicate'),
            nn.Conv2d(in_channels=32, out_channels=32, kernel_size=3, stride=1, padding=1, padding_mode='replicate'),
            nn.Conv2d(in_channels=32, out_channels=32, kernel_size=3, stride=1, padding=1, padding_mode='replicate'),
            nn.Conv2d(in_channels=32, out_channels=32, kernel_size=3, stride=1, padding=1, padding_mode='replicate'),
            nn.PReLU()
        )
        # Expanding: a 1x1 convolution restores the channel count from 32 to 64
        self.expanding = nn.Sequential(
            nn.Conv2d(in_channels=32, out_channels=64, kernel_size=1, stride=1, padding=0),
            nn.PReLU()
        )
        # Deconvolution: a stride-3 transposed convolution performs the 3x upsampling.
        # Note: ConvTranspose2d only supports padding_mode='zeros', so the
        # 'replicate' argument in the original snippet had to be dropped;
        # output_padding=2 is my addition so an HxW input maps to exactly 3Hx3W
        # (without it the output would be (3H-2)x(3W-2))
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(in_channels=64, out_channels=inchannels, kernel_size=9, stride=3, padding=4, output_padding=2)
        )

    def forward(self, x):
        x = self.features(x)
        x = self.shrinking(x)
        x = self.mapping(x)
        x = self.expanding(x)
        x = self.deconv(x)
        return x
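A quick sanity check of the shapes, continuing from the class above (a hedged example with a random tensor; inchannels=1 assumes a single-channel luminance input, as the paper works on the Y channel):

model = FSRCNN(inchannels=1)
lr = torch.randn(1, 1, 32, 32)   # one single-channel 32x32 LR patch
hr = model(lr)
print(hr.shape)                  # torch.Size([1, 1, 96, 96]): exactly 3x larger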
Code interpretation
Overall, FSRCNN is divided into the five parts above:
- features: extracts features from the original LR image. For images, features are generally extracted by convolution. Note one sensitive parameter here (the paper calls it a sensitive variable; I understand this as a parameter you can change to suit your own needs): the number of filters, i.e. the number of output channels of this convolution layer.
- shrinking: according to the original authors, this part reduces the amount of computation. Personally I see it as compressing the feature dimension: in my code above, the 64-dimensional features are compressed to 32 dimensions, which naturally reduces computation (a rough count follows this list).
- mapping: this layer is the non-linear mapping proper (Non-linear Mapping); through convolutions it applies a non-linear transformation to the features. According to the authors, this part matters most for the model's results; more on this below.
- expanding: this layer restores the compressed feature dimension.
- deconv: the deconvolution layer restores the features processed above to the HR image.
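To put a rough number on the shrinking point (my own back-of-the-envelope count, not from the paper's text): compare the weights of a 3x3 mapping convolution in the shrunken 32-dimensional space versus the full 64-dimensional space.

def conv_weights(c_in, c_out, k):
    # weight count of a k x k convolution layer, ignoring biases
    return c_in * c_out * k * k

print(conv_weights(32, 32, 3))   # 9216 weights per mapping layer after shrinking
print(conv_weights(64, 64, 3))   # 36864 weights without shrinking: 4x more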
A small summary
There are several parameters in the model that can be changed to suit your own research (a parameterized sketch follows this list), including:
- Number of feature maps: the output channel count of the first convolution layer.
- Shrunken feature dimension: how many dimensions the features above are compressed down to.
- Number of non-linear mapping layers: how many convolution layers the mapping module contains (note this does not change the channel count).
- Deconvolution multiple: the factor by which you want to enlarge the image.
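As a sketch of how these four knobs could be exposed in code (my own parameterization following the paper's d, s, m notation, not the original authors' implementation):

import torch.nn as nn


class FSRCNNConfigurable(nn.Module):
    # d: number of feature maps, s: shrunken feature dimension,
    # m: number of mapping conv layers, scale: upsampling multiple
    def __init__(self, inchannels, d=64, s=32, m=4, scale=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(inchannels, d, kernel_size=5, padding=2), nn.PReLU())
        self.shrinking = nn.Sequential(
            nn.Conv2d(d, s, kernel_size=1), nn.PReLU())
        # m stacked 3x3 convs with a single trailing PReLU, as in the code above
        mapping = [nn.Conv2d(s, s, kernel_size=3, padding=1) for _ in range(m)]
        mapping.append(nn.PReLU())
        self.mapping = nn.Sequential(*mapping)
        self.expanding = nn.Sequential(
            nn.Conv2d(s, d, kernel_size=1), nn.PReLU())
        # output_padding = scale - 1 keeps the output at exactly scale * input
        self.deconv = nn.ConvTranspose2d(
            d, inchannels, kernel_size=9, stride=scale,
            padding=4, output_padding=scale - 1)

    def forward(self, x):
        x = self.features(x)
        x = self.shrinking(x)
        x = self.mapping(x)
        x = self.expanding(x)
        return self.deconv(x)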
The first four modules of the model extract features from the low-resolution (LR) image; the last module then upsamples the image. You can modify the last module for different upsampling factors, which is why the authors say the model transfers well across scales. The original authors say the padding method has a negligible effect on the final result, so they simply zero-pad according to the kernel size; but I think this step can be adjusted to your own research, especially if you care about pixel values near the borders. To train faster, the original authors also used two-stage training: first train on the 91-image dataset, and once training saturates, add the General-100 dataset for fine-tuning. The learning rate is 1e-3 for the convolution layers and 1e-4 for the final deconvolution layer (a sketch of these per-layer rates follows below). Several of the parameters (sensitive variables) used in my code above differ from the original paper.
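The different learning rates can be expressed with optimizer parameter groups; a minimal sketch (the two rates come from the paper, while the use of SGD with momentum 0.9 is my assumption about the training setup):

import itertools
import torch.optim as optim

model = FSRCNN(inchannels=1)
conv_params = itertools.chain(
    model.features.parameters(), model.shrinking.parameters(),
    model.mapping.parameters(), model.expanding.parameters())
optimizer = optim.SGD([
    {'params': conv_params},                            # convolution layers: default lr
    {'params': model.deconv.parameters(), 'lr': 1e-4},  # deconvolution layer: smaller lr
], lr=1e-3, momentum=0.9)                               # momentum is my assumption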
Reflection
What exactly does the so-called feature extraction extract? What does the so-called non-linear mapping actually do?
Doubts and caveats
1. In the paper's structure, the non-linear mapping layer should be m conv layers followed by a single PReLU, but many implementations I have seen online put a PReLU after every conv layer. Whether this difference matters would need to be tested...
2. The last parameters of the deconvolution layer have to be calculated carefully according to the desired output multiple; see the check below.
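For reference, PyTorch's ConvTranspose2d output size follows the standard formula, and a quick check (my own verification, using the kernel_size=9, stride=3, padding=4, output_padding=2 settings from the code above) confirms the exact 3x enlargement:

def deconv_out(h_in, kernel_size=9, stride=3, padding=4, output_padding=2):
    # ConvTranspose2d output size, assuming dilation = 1
    return (h_in - 1) * stride - 2 * padding + kernel_size + output_padding

print(deconv_out(32))                    # 96 == 32 * 3: exact 3x upscaling
print(deconv_out(32, output_padding=0))  # 94: without output_padding, 3H - 2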

References
【1】Dong C, Loy C C, Tang X. Accelerating the Super-Resolution Convolutional Neural Network[C]// Computer Vision – ECCV 2016. Springer International Publishing, 2016: 391-407. doi:10.1007/978-3-319-46475-6_25
【2】https://towardsdatascience.com/review-fsrcnn-super-resolution-80ca2ee14da4
【3】https://github.com/Lornatang/FSRCNN-PyTorch