GAN Development Series III (LapGAN, SRGAN)
2022-07-24 17:33:00 【51CTO】
The previous articles introduced generative adversarial networks (GANs) and several GAN variants. This series continues with some of the more classic GANs.
Introduction to generative adversarial networks (GAN)
GAN development series I (CGAN, DCGAN, WGAN, WGAN-GP, LSGAN, BEGAN)
GAN development series II (PGGAN, SinGAN)
I. LapGAN
Paper: 《Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks》
Paper address: https://arxiv.org/abs/1506.05751
1. The basic idea
LapGAN builds on GAN and CGAN and uses a Laplacian pyramid to generate images from coarse to fine, and thereby produce high-resolution images. Each level of the pyramid learns the residual relative to the adjacent coarser level, and the final resolution is reached by stacking CGANs level by level. As described in the earlier article, CGAN adds a conditional constraint to GAN to rein in the original GAN generator, which generates samples too freely.
The objective of the original GAN is:

The objective of CGAN is:

2、 The pyramid of Laplace
A Gaussian pyramid is obtained by repeatedly downsampling an image in scale space. A Laplacian pyramid stores, at each level, the detail lost between one Gaussian level and the upsampled version of the next coarser level. First build the Gaussian pyramid by downsampling the image I0 K times in succession, which gives

The K-th (coarsest) level of the Laplacian pyramid is simply the coarsest level of the Gaussian pyramid:

The other levels of the Laplacian pyramid are:

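Level k of the Laplacian pyramid equals level k of the Gaussian pyramid minus the upsampled level k+1 of the Gaussian pyramid.
The image is restored from the Laplacian pyramid by working back from the coarsest level: upsample the current image and add the Laplacian level at that resolution: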

3. LapGAN principle
Take K=3 as an example. The Laplacian pyramid then has 4 levels and 4 generators G0, G1, G2, G3, which produce images at 4 resolutions: 64x64, 32x32, 16x16 and 8x8. The lowest-resolution level trains an original GAN whose only input is noise. The higher-resolution levels train CGANs whose inputs are noise plus the upsampled image from the next coarser level of the Gaussian pyramid, which has the same resolution as the level being generated.
LapGAN chains this series of CGANs to generate progressively higher resolutions.


LapGAN was evaluated on three datasets, CIFAR10, STL10 and LSUN. The generated images are shown below:

II. SRGAN
Paper: 《Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network》
Paper address: https://arxiv.org/pdf/1609.04802.pdf
Code address: https://github.com/OUCMachineLearning/OUCML/blob/master/GAN/srgan_celebA/srgan.py
1. The basic idea
SRGAN applies GAN to image super-resolution. Convolutional neural networks (CNNs) had already achieved very good results in conventional super-resolution reconstruction, reaching a high peak signal-to-noise ratio (PSNR) by minimizing MSE as the objective function. SRGAN is the first framework able to recover images that have been downsampled by a factor of 4. The authors propose a perceptual loss function made up of an adversarial loss and a content loss. The adversarial loss comes from the discriminator, which is trained to distinguish real images from generated super-resolution images, while the content loss focuses on perceptual similarity.

Image super-resolution algorithms usually take the mean squared error (MSE) between the reconstructed super-resolution image and the real image as the objective function, and minimizing MSE raises PSNR. However, MSE and PSNR values do not reflect visual quality well: in the figure below, the image with the highest PSNR is not the best looking.

2. Network structure
Because per-pixel MSE leads to over-smoothing, it handles fine detail in super-resolved images poorly. The paper designs a new loss function that replaces the per-pixel MSE loss with a content loss. The perceptual loss is the weighted sum of the content loss and the adversarial loss:

Content loss: the element-wise squared difference between the feature maps of a chosen network layer, computed for the real image and for the generated image:

Adversarial loss:

The network structure proposed by the authors is shown below. The generator is built from residual blocks:

The authors first built a GAN with a sub-pixel network as the generator and a VGG network as the discriminator and obtained very good results, but this setup used the per-pixel difference as the loss function. They then used their proposed perceptual loss function as the optimization objective. Although PSNR and SSIM are not high, the visual quality is better than that of other networks, and the over-smoothing typical of other methods is avoided.
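The sub-pixel network mentioned above upscales by rearranging channels into space: a tensor with C*r*r channels of size HxW becomes C channels of size rH x rW. Below is a minimal NumPy sketch of that rearrangement alone, without the convolution that precedes it. The function name and channel ordering are our own assumptions, chosen to match the commonly used pixel shuffle convention.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange (C*r*r, H, W) into (C, H*r, W*r).

    Output pixel (h*r + i, w*r + j) of channel c is taken from
    input channel c*r*r + i*r + j at position (h, w).
    """
    c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)        # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)      # reorder to (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)   # merge (h, i) and (w, j)

x = np.arange(4 * 2 * 2).reshape(4, 2, 2)   # 4 channels of size 2x2
y = pixel_shuffle(x, r=2)                   # 1 channel of size 4x4
print(y.shape)
print(y[0, :2, :2])
```

Each 2x2 block of the output gathers one value from each of the four input channels, so the upscaling is learned by the preceding convolution rather than fixed by an interpolation kernel.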


