Lightweight Pyramid Networks for Image Deraining
2022-07-04 01:19:00 【Programmer base camp】
Authors: SmartDSP Laboratory, Department of Communication Engineering, Xiamen University (project website linked in the original post)
Subject: a lightweight pyramid network for single-image deraining
Abstract
- Existing deraining networks carry large numbers of training parameters, which hinders deployment on mobile devices.
- Domain-specific knowledge is exploited to simplify the learning process:
- A Gaussian-Laplacian image pyramid decomposition is introduced, so the learning task at each pyramid level reduces to a shallow network with few parameters.
- With recursive and residual network modules, the model achieves state-of-the-art deraining with fewer than 8K parameters.
Introduction
Rain is a common weather phenomenon that affects not only human vision but also computer vision systems such as autonomous driving and surveillance. Because of the refraction and scattering of light, objects in an image are easily blurred and occluded by rain, and the dense streaks of heavy downpours make this even worse. Since current computer vision systems assume clean, clear input images, model accuracy degrades easily in rainy conditions. Designing efficient and effective deraining algorithms is therefore important for deploying many applications.
Related work
Video deraining: the spatio-temporal information of adjacent frames can be exploited, for example averaging intensities across adjacent frames to remove rain from a static background. Approaches include:
- working in the Fourier domain, using Gaussian mixture models, low-rank approximation, and matrix completion
- dividing rain into sparse and dense regions, then applying matrix-decomposition-based algorithms
- patch-based Gaussian mixture models
Single-image deraining: a single frame cannot use adjacent-frame information, so it is harder than video deraining. Approaches include:
- kernel methods, low-rank approximation, and dictionary learning
- kernel regression and non-local mean filtering
- dividing the image into high-frequency and low-frequency parts, then separating and removing rain from the high-frequency part with sparse-coding-based dictionary learning
- self-learning on the high-frequency part to remove rain
- discriminative sparse coding: by forcing the codes of the rain layer to be sparse, the objective function can separate the background from the rain
- hybrid models and local gradient descent
- Gaussian mixture models (GMMs): the background layer is learned from natural images and the rain layer from rainy images, solved with the alternating direction method of multipliers (ADMM)
- deep-learning methods based on convolutional neural networks
Contributions of this paper
- Proposes the LPNet model with fewer than 8K parameters; compared with previous deep networks, it is far better suited to mobile deployment.
- Exploits domain-specific knowledge to simplify the learning process:
- For the first time, a Laplacian pyramid module decomposes the degraded rainy image across different network levels.
- Recursive and residual modules form a sub-network that reconstructs each level of the Gaussian pyramid of the derained image.
- Loss functions are tailored to the physical characteristics of each level.
- Achieves state-of-the-art deraining performance and also transfers well to other computer vision tasks.
The LPNet network model
Motivation
- Rain streaks are entangled with object edges and the background, so they are hard to learn in the image domain; but they can be learned from the high-frequency information, which mainly contains rain and object edges free of interference from the image background.
- Earlier work achieved this by using a guided filter to extract the high-frequency part as network input, then removing rain and fusing the result back with the background; but for thin rain streaks the high-frequency information is hard to extract.
- Building on this decomposition idea, a lightweight pyramid network is proposed to simplify training while reducing the parameter count.
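To make the high/low-frequency split concrete, here is a minimal NumPy sketch. Note that the earlier methods cited above used a guided filter; the box-filter base layer below is only a hypothetical stand-in to illustrate the decomposition idea, not the paper's method.

```python
import numpy as np

def high_frequency(img, radius=2):
    """Split an image into a low-frequency base and a high-frequency detail layer.

    A simple box blur stands in for the guided filter used in prior deraining
    work; the point is only that img == base + detail by construction.
    """
    pad = radius
    xp = np.pad(img, pad, mode="edge")
    base = np.empty_like(img, dtype=float)
    H, W = img.shape
    for i in range(H):
        for j in range(W):
            base[i, j] = xp[i:i + 2 * pad + 1, j:j + 2 * pad + 1].mean()
    return base, img - base  # low-frequency base, high-frequency detail
```

A deraining network following this idea would take only the detail layer as input, then add the cleaned detail back onto the base.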
Stage I: Laplacian pyramid
The Laplacian pyramid is built from the Gaussian pyramid: each Gaussian level is downsampled, the next level is upsampled back, and subtracting it from the original level gives the high-frequency residual (see "The relationship between the Laplacian and Gaussian pyramids"):

L_n(X) = G_n(X) - upsample(G_{n+1}(X))

where G_n(X) is the Gaussian pyramid and L_n(X) is the Laplacian pyramid.
Advantages of the classical Laplacian pyramid:
- the background is removed, so the network only needs to handle high-frequency information
- the sparsity of each level can be exploited during learning
- learning at each level becomes closer to an identity mapping
- it is cheap to compute on a GPU
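A minimal NumPy sketch of the decomposition L_n(X) = G_n(X) - upsample(G_{n+1}(X)), assuming the fixed 5-tap kernel given later in this post; the padding and upsampling details of the paper's actual implementation may differ.

```python
import numpy as np

# Fixed 1-D binomial kernel from the paper (its outer product is the 5x5 filter).
KERNEL_1D = np.array([0.0625, 0.25, 0.375, 0.25, 0.0625])

def _smooth(img):
    """Separable 5-tap smoothing with edge padding."""
    pad = 2
    out = np.pad(img, pad, mode="edge")
    out = np.apply_along_axis(lambda r: np.convolve(r, KERNEL_1D, mode="same"), 1, out)
    out = np.apply_along_axis(lambda c: np.convolve(c, KERNEL_1D, mode="same"), 0, out)
    return out[pad:-pad, pad:-pad]

def downsample(img):
    """Smooth, then keep every second pixel."""
    return _smooth(img)[::2, ::2]

def upsample(img, shape):
    """Zero-insertion upsampling to `shape`, then smoothing (x4 restores energy)."""
    up = np.zeros(shape)
    up[::2, ::2] = img
    return 4.0 * _smooth(up)

def laplacian_pyramid(img, levels=5):
    """L_n(X) = G_n(X) - upsample(G_{n+1}(X)); the last level is the Gaussian top."""
    gauss = [img]
    for _ in range(levels - 1):
        gauss.append(downsample(gauss[-1]))
    lap = [gauss[n] - upsample(gauss[n + 1], gauss[n].shape)
           for n in range(levels - 1)]
    lap.append(gauss[-1])
    return lap
```

Because each band stores exactly the residual of the upsampled coarser level, summing the bands back up reconstructs the input exactly.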
Stage II : Sub network structure
For each level , Are the same network structure , Just convolution kernel kernel
The quantity will be adjusted according to the characteristics of different levels , Mainly adopts residual learning
and recursive blocks
- feature extraction
H_{n,0} = σ(W_n^0 * L_n(X) + b_n^0)
- Recursive blocks: parameters are shared across several recursive blocks to reduce the number of training parameters. Each block applies three convolutions:
F_{n,t}^1 = σ(W_n^1 * H_{n,t-1} + b_n^1),
F_{n,t}^2 = σ(W_n^2 * F_{n,t}^1 + b_n^2),
F_{n,t}^3 = W_n^3 * F_{n,t}^2 + b_n^3,
where F^{1,2,3} are intermediate features, and W^{1,2,3} and b^{1,2,3} are parameters shared across the blocks.
To ease forward propagation and back-propagation, a residual module adds the extracted features H_{n,0} to each recursive block's output:
H_{n,t} = σ(F_{n,t}^3 + H_{n,0})
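As a single-channel toy illustration of the shared-parameter recursion (the real LPNet uses multi-channel convolutions, and the weights here are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, w, b):
    """Naive 3x3 'same' convolution for a single-channel sketch."""
    xp = np.pad(x, 1, mode="edge")
    out = np.empty_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3] * w) + b
    return out

relu = lambda z: np.maximum(z, 0.0)  # the sigma(.) nonlinearity

# W^{1..3}, b^{1..3}: one parameter set reused by every recursive block at this level.
Ws = [rng.normal(scale=0.1, size=(3, 3)) for _ in range(3)]
bs = [0.0, 0.0, 0.0]

def recursive_block(H_prev, H0):
    F1 = relu(conv2d(H_prev, Ws[0], bs[0]))  # F^1_{n,t}
    F2 = relu(conv2d(F1, Ws[1], bs[1]))      # F^2_{n,t}
    F3 = conv2d(F2, Ws[2], bs[2])            # F^3_{n,t} (no activation)
    return relu(F3 + H0)                     # residual link back to H_{n,0}

# Feature extraction H_{n,0}, then T = 5 recursive blocks sharing Ws/bs.
Ln_X = rng.normal(size=(12, 12))             # a Laplacian band L_n(X)
H0 = relu(conv2d(Ln_X, rng.normal(scale=0.1, size=(3, 3)), 0.0))
H = H0
for _ in range(5):
    H = recursive_block(H, H0)
```

The key point: five blocks reuse one set of three convolutions, so unrolling the recursion deepens the network without adding parameters.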
- Gaussian pyramid reconstruction:
L_n(Y) = (W_n^4 * H_{n,T} + b_n^4) + L_n(X)
The formula above gives the output Laplacian pyramid; the corresponding Gaussian pyramid is then reconstructed as:
G_N(Y) = max(0, L_N(Y)),
G_n(Y) = max(0, L_n(Y) + upsample(G_{n+1}(Y))),
Since the output of each level should be ≥ 0, a max(0, x) operation, i.e. the ReLU function, is applied. The final derained image is G_1(Y).
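The reconstruction recursion fits in a few lines of NumPy; `upsample` here can be any function that enlarges a level to the target shape (nearest-neighbour below is purely illustrative, not the paper's smoothed upsampling).

```python
import numpy as np

def upsample_nn(img, shape):
    """Nearest-neighbour 2x upsampling, cropped to the target shape (illustrative)."""
    up = np.kron(img, np.ones((2, 2)))
    return up[:shape[0], :shape[1]]

def reconstruct(lap_out, upsample=upsample_nn):
    """Rebuild G_1(Y) from predicted Laplacian bands [L_1(Y), ..., L_N(Y)] (fine to coarse)."""
    G = np.maximum(0.0, lap_out[-1])                    # G_N(Y) = max(0, L_N(Y))
    for L in reversed(lap_out[:-1]):                    # n = N-1, ..., 1
        G = np.maximum(0.0, L + upsample(G, L.shape))   # ReLU keeps levels non-negative
    return G                                            # G_1(Y): the derained image
```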
Loss Function
- MSE (L2): the squared penalty trains poorly on image edges and tends to produce over-smoothed results. Different losses are therefore adopted at different network levels to match their characteristics.
- L1 + SSIM: finer image details and rain streaks live in the lower pyramid levels (see Fig. 3), so SSIM is used to train the corresponding sub-networks and preserve more high-frequency information.
- L1: the higher pyramid levels contain larger object structures and smooth background regions, so L1 loss alone is used to update their parameters.
L = (1/M) ∑_{i=1}^{M} { ∑_{n=1}^{N} L^{l1}(G_n(Y^i), G_n(Y_GT^i)) + ∑_{n=1}^{2} L^{SSIM}(G_n(Y^i), G_n(Y_GT^i)) }
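A sketch of the combined loss, under the simplifying assumption of a single-window (global) SSIM rather than the usual sliding-window version:

```python
import numpy as np

def l1_loss(y, gt):
    return np.mean(np.abs(y - gt))

def ssim_loss(y, gt, C1=0.01 ** 2, C2=0.03 ** 2):
    """1 - SSIM, computed globally over the whole level (a simplification)."""
    mu_y, mu_g = y.mean(), gt.mean()
    var_y, var_g = y.var(), gt.var()
    cov = ((y - mu_y) * (gt - mu_g)).mean()
    ssim = ((2 * mu_y * mu_g + C1) * (2 * cov + C2)) / \
           ((mu_y ** 2 + mu_g ** 2 + C1) * (var_y + var_g + C2))
    return 1.0 - ssim

def lpnet_loss(pyr_y, pyr_gt):
    """pyr_y / pyr_gt: lists [G_1, ..., G_N] of Gaussian-pyramid levels for one image."""
    loss = sum(l1_loss(y, g) for y, g in zip(pyr_y, pyr_gt))            # L1: all levels
    loss += sum(ssim_loss(y, g) for y, g in zip(pyr_y[:2], pyr_gt[:2])) # SSIM: two finest
    return loss
```

Matching the equation, L1 is summed over all N levels while the SSIM term covers only the two finest (n = 1, 2).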
Removing the Batch Normalization layer
- BN layers are added so that the learned feature maps follow a Gaussian (normal) distribution.
- But the lower-level data of the Laplacian pyramid is sparse (see the histograms in Fig. 3) and relatively easy to handle, so the BN constraint is unnecessary.
- Removing the BN layers makes the model more flexible and further reduces the parameter count.
Parameter settings
- A fixed smoothing kernel [0.0625, 0.25, 0.375, 0.25, 0.0625] is used both for constructing the Laplacian pyramid and for reconstructing the Gaussian pyramid.
- As shown in Fig. 3, rain resides at the lower levels while the higher levels are closer to identity mappings, so the higher levels need fewer training parameters; from the lowest to the highest level, the number of convolution kernels decreases as [16, 8, 4, 2, 1]. At the top level the image is small and smooth; rain still exists in its high-frequency part, but learning there reduces to simple global contrast adjustment.
- As shown in Fig. 5, comparing each level of the rainy image and of our result, rain remains at the low levels while the top levels are almost identical; removing redundant kernels at the higher levels is therefore justified.
Experimental parameters
- dataset: 1,800 synthetic heavy-rain images + 200 light-rain images
- optimizer: Adam
- mini-batch size: 10
- learning rate: 0.001
- epochs: 3
Experimental results
Multiple models are compared on synthetic rain images (heavy and light rain), real-world rain images, and dehazing. The main criteria are visual comparison of the output images and quantitative comparison (subjective evaluation scores, PSNR, SSIM).
Other contents of the paper
Because the central idea of the paper is the above part , Although other thoughts are also mentioned , But not as the core
LPNet
When using more network parameters , Can show better effect . But for lightweight applications , So I didn't add itskip connection
It is conducive to forward propagation and reverse derivationSSIM Loss
Based on local image features : Local contrast 、 brightness 、 details . More in line with Human eye system .LPNet
It is also applicable to other computer vision tasks , Such as denoising 、 Demist and other fields- Because the network is lightweight , So it can be used as the preprocessing of other computer vision tasks : Use first in target detection on rainy days
LPNet
Go to the rain network
An interesting phenomenon : The visual effect is better , however PSNR It's lower