Intensive reading of deep learning papers [GAN]: versatile image restoration and manipulation using deep generative prior
2022-07-28 00:23:00 【Opencv school】
The author has recently been concentrating on generative adversarial networks (GANs), in particular on using deep generative priors for versatile image restoration and manipulation, which calls for rereading the classic papers on image restoration and manipulation.
We restart this series of intensive readings with a classic of the field, DGP: "Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation".
July 14: an algorithm expert explains GAN papers in detail in one hour
Scan the code to book the live broadcast for 0.1 yuan
Code and datasets provided
DGP proposes a way to exploit the image prior captured by a GAN, and demonstrates on multiple tasks the GAN's potential as a general-purpose image prior.
The paper proposes a progressive image inversion method that jointly optimizes the latent vector and the generator. It can be applied to adversarial defense on complex images, and DGP's strong ability to model spatial relationships between pixels, shown in the experiments, is also very interesting.
Image restoration results with the deep generative prior
01
Deep generative prior
Deep image prior (DIP) relies only on the statistics of the input image, so it cannot be applied to tasks that require more general image statistics, such as image colorization and image editing.
We are more interested in a more general image prior: a GAN generator trained on large-scale natural images for image synthesis. Specifically, this is an image reconstruction process based on GAN inversion.
In practice, optimizing the latent vector z alone is rarely enough to accurately reconstruct complex real images such as those in ImageNet. The GAN's training set (ImageNet) is itself only a very small part of all natural images, and, limited by finite model capacity and mode collapse, the image distribution the GAN approximates also differs from the training-set distribution.
Despite these limitations, the GAN still learns a great deal about images. To use this information and achieve accurate reconstruction, we let the generator adapt online to each target image, that is, we jointly optimize the latent vector z and the generator parameters.
We call this new objective the deep generative prior (DGP); it significantly improves image reconstruction. Designing an appropriate distance metric and optimization strategy is critical, because the generator's original generative prior is modified during reconstruction and its ability to output realistic natural images may degrade.
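The difference between optimizing z alone and jointly optimizing z and the generator parameters can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the two-parameter linear "generator", the squared-error loss, the learning rate, and the function names are hypothetical and only stand in for BigGAN and the losses discussed in this article.

```python
def generate(z, theta):
    """Toy generator: maps a scalar latent z to a 2-pixel 'image'."""
    return [theta[0] * z, theta[1] * z]


def squared_error(output, target):
    return sum((o - t) ** 2 for o, t in zip(output, target))


def reconstruct(target, tune_generator, steps=2000, lr=0.05):
    """Gradient descent on z, and optionally on the generator parameters too."""
    # The "pre-trained" toy generator can only output points with equal pixels.
    z, theta = 0.5, [1.0, 1.0]
    for _ in range(steps):
        residual = [o - t for o, t in zip(generate(z, theta), target)]
        grad_z = sum(2.0 * r * th for r, th in zip(residual, theta))
        grad_theta = [2.0 * r * z for r in residual]
        z -= lr * grad_z
        if tune_generator:
            theta = [th - lr * g for th, g in zip(theta, grad_theta)]
    return squared_error(generate(z, theta), target)


target = [1.0, 2.0]  # lies outside what the fixed toy generator can produce
loss_z_only = reconstruct(target, tune_generator=False)
loss_joint = reconstruct(target, tune_generator=True)
print(f"z only: {loss_z_only:.4f}, joint: {loss_joint:.6f}")
```

With the generator frozen, the loss stalls at the closest point the generator can reach; letting the parameters move as well removes that floor, which is also why the original prior can drift if the fine-tuning is not constrained.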
02
Discriminator guided progressive reconstruction
Hundreds of candidate initial latent codes are sampled at random from the latent space Z, and the one with the best reconstruction under the metric L is selected.
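This selection step can be sketched as a simple best-of-N search. The sketch below is a minimal illustration, not the paper's code: the Gaussian sampling, the default of 500 candidates, the toy generator, and the MSE loss are all assumptions standing in for the real generator and the metric L.

```python
import random


def select_initial_latent(generator, target, loss_fn,
                          num_candidates=500, dim=4, seed=0):
    """Sample latent codes from N(0, I) and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best_z, best_loss = None, float("inf")
    for _ in range(num_candidates):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        loss = loss_fn(generator(z), target)
        if loss < best_loss:
            best_z, best_loss = z, loss
    return best_z, best_loss


# Hypothetical stand-ins so the sketch runs without a real GAN.
def toy_generator(z):
    return [2.0 * v for v in z]


def mse(output, target):
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(target)


target = [0.5, -1.0, 0.0, 1.5]
z0, loss0 = select_initial_latent(toy_generator, target, mse)
```

The selected code z0 then serves as the starting point for the gradient-based reconstruction.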
In GAN reconstruction, the traditional distance metrics are MSE and perceptual loss. When the generator parameters are optimized with these metrics in restoration tasks such as colorization, the colors often cannot be restored accurately and the image becomes blurry during reconstruction, so a better optimization method is needed to retain the generator's original information.
In this work we use the discriminator that corresponds to the generator as the distance metric. Unlike the VGGNet used by perceptual loss, the discriminator is not trained on a third-party task; it is tightly coupled with the generator during pre-training, so it is naturally suited to adjusting the generator's output distribution.
With this discriminator-based distance metric, the reconstruction process is more natural and realistic, and the final color restoration is also better.
Here D(x, i) denotes the features output by the i-th block of the discriminator when x is the input.
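A discriminator-based distance of this kind can be sketched as follows, assuming the per-block features are compared with an L1 distance and summed over blocks. The norm, the function name, and the toy feature values are assumptions for illustration; in practice the features would come from the discriminator's intermediate blocks.

```python
def discriminator_feature_distance(features_a, features_b):
    """Sum of L1 distances between per-block feature lists.

    features_a[i] and features_b[i] play the role of D(x1, i) and D(x2, i),
    flattened to plain lists of floats.
    """
    if len(features_a) != len(features_b):
        raise ValueError("both inputs need features from the same blocks")
    total = 0.0
    for block_a, block_b in zip(features_a, features_b):
        if len(block_a) != len(block_b):
            raise ValueError("feature shapes differ within a block")
        total += sum(abs(a - b) for a, b in zip(block_a, block_b))
    return total


# Toy features for two images from a hypothetical two-block discriminator.
feats_x1 = [[1.0, 2.0], [3.0]]
feats_x2 = [[0.0, 2.0], [5.0]]
distance = discriminator_feature_distance(feats_x1, feats_x2)
```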
Although the improved distance metric brings better results, unnatural artifacts remain in the restored image, because when the generator is optimized toward the target image, the deep-layer parameters start matching detailed textures before the shallow-layer parameters have matched the overall layout.
The apple figure above compares several training strategies. As the three rows of results show, apples left uncolored at the start of training remain uncolored at the end; we call this phenomenon "information lingering".
The countermeasure is a progressive reconstruction strategy: when fine-tuning the generator, optimize the shallow layers first and then gradually move to the deeper ones, so that reconstruction proceeds "from the whole to the parts".
Compared with non-progressive strategies, this progressive strategy better preserves the consistency between the missing semantics and the existing semantics.
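The shallow-to-deep schedule can be sketched as a list of which generator blocks are trainable at each stage. This is a hypothetical schedule for illustration only: the number of stages, the even split of blocks across stages, and the function name are assumptions, not values taken from the paper.

```python
import math


def progressive_schedule(num_blocks, num_stages):
    """Return, for each stage, the indices of generator blocks to fine-tune.

    Block 0 is the shallowest (closest to the latent code). Each stage keeps
    the blocks of the previous stage and unlocks the next deeper ones.
    """
    if num_blocks < 1 or num_stages < 1:
        raise ValueError("num_blocks and num_stages must be positive")
    schedule = []
    for stage in range(1, num_stages + 1):
        num_open = math.ceil(num_blocks * stage / num_stages)
        schedule.append(list(range(num_open)))
    return schedule


# Example: a generator with 5 blocks, fine-tuned in 3 stages.
stages = progressive_schedule(num_blocks=5, num_stages=3)
for index, blocks in enumerate(stages, start=1):
    print(f"stage {index}: fine-tune blocks {blocks}")
```

In a real training loop, each stage would run its own optimization steps with only the listed blocks (plus the latent vector) receiving gradient updates.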
03
Reconstruction results
We use a BigGAN model trained on ImageNet and run experiments on 1000 images from the ImageNet validation set, taking the first image of each class. Compared with other methods, DGP achieves very high PSNR and SSIM, and the reconstruction errors are almost imperceptible visually.
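For reference, PSNR is computed from the mean squared error between the reconstruction and the target. A minimal sketch, assuming 8-bit pixel values flattened into lists (the function name and the peak value of 255 are choices made here; SSIM is more involved and is omitted):

```python
import math


def psnr(reconstruction, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reconstruction) != len(target) or not target:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reconstruction, target)) / len(target)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel off by one gray level gives an MSE of 1.
example_psnr = psnr([10, 20, 30, 40], [11, 21, 31, 41])
```

Higher PSNR means a smaller pixel-wise error, so a near-perfect reconstruction shows up as a large value.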
04
Experiments
Because the GAN captures a prior over natural images, it can handle many tasks, such as colorization, inpainting, and super-resolution, and it can also be used for image manipulation. Some results are shown below.
Image colorization
Using ResNet50 classification accuracy as the quantitative metric, the compared methods achieve accuracies of 51.5%, 56.2%, 56.0%, and 62.8%, respectively.
Image completion
Super resolution
Flexibility
Random perturbation
Summary
As one of the most powerful generative models in the image domain, GANs learn a rich natural image manifold, which can greatly help the restoration and editing of natural images.
Making good use of large-scale pre-trained models is a popular frontier of deep learning across fields; it can reduce the need for training data and bring similar research areas together.
More powerful generative models in the future will bring greater practical value to image restoration and editing applications, and are expected to be deployed in a broader range of fields.