
[GAN] Introduction to GAN Basics and DCGAN

2022-06-22 06:56:00 chad_ lee

Basic Idea of GAN

[Figure: the GAN framework, with generator G and discriminator D]

Take image generation as an example. Suppose there are two networks, G (Generator) and D (Discriminator):

  • G is the generator network. It takes a random noise vector z as input and generates an image from that noise, denoted G(z).
  • D is the discriminator network. It judges whether an image is "real": its input is an image x, and its output D(x) is the probability that x is a real image. An output of 1 means the image is certainly real, while an output of 0 means it cannot be a real image. (A minimal sketch of both networks follows this list.)
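As a concrete illustration, here is a minimal PyTorch sketch of the two networks. This is not from the original post: the fully connected layers, the 28×28 image size, and the 100-dim noise vector are illustrative choices only.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a noise vector z to a flattened image G(z)."""
    def __init__(self, z_dim=100, img_dim=784):  # 784 = 28x28, flattened
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256),
            nn.ReLU(),
            nn.Linear(256, img_dim),
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """Maps an image x to D(x), the probability that x is real."""
    def __init__(self, img_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid(),  # probability in (0, 1)
        )

    def forward(self, x):
        return self.net(x)
```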

Training objectives

During training, the generator G tries to generate images realistic enough to fool the discriminator D, while D tries to tell G's generated images apart from real ones.

The ideal result of training is that the images generated by G are indistinguishable from real ones and fool D. The illustration from the original GAN paper explains it as follows:

[Figure: evolution during training of the real data distribution (black dashed line), the discriminator output (blue dashed line), and the generated distribution (green solid line)]

The black dashed line represents the distribution of real samples, the blue dashed line the discriminator's output probability, and the green solid line the distribution of generated samples. $z$ denotes the noise, and the arrows from $z$ to $x$ show how the generator maps the noise distribution into the data space. The figure conveys two messages: (1) the generator $G$ fits the real data better and better: the green solid line increasingly coincides with the black dashed line, i.e. the samples $x = G(z)$ obtained by mapping random noise through $G$ match the real distribution more and more closely; (2) the discriminator $D$ becomes unable to tell real images from generated ones, i.e. its output probability tends to 0.5; at first the blue dashed line still discriminates between the two, but after training it no longer does.

Training methods

The GAN optimization objective:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$

  • The formula consists of two terms. $x$ denotes a real image, $z$ the noise fed into the $G$ network, and $G(z)$ the image generated by $G$.
  • $D(x)$ is the probability that $D$ judges a real image to be real (since $x$ is real, $D$ wants this value to be as close to 1 as possible), and $D(G(z))$ is the probability that $D$ judges a generated image to be real.
  • $G$'s goal: since $D(G(z))$ is the probability that $D$ judges the generated image to be real, $G$ wants its images to be "as close to real as possible". In other words, $G$ wants $D(G(z))$ to be as large as possible, which makes $V(D, G)$ smaller. This is why the formula carries $\min_G$.
  • $D$'s goal: the stronger $D$ is, the larger $D(x)$ should be and the smaller $D(G(z))$ should be, which makes $V(D, G)$ larger. So for $D$ the objective is $\max_D$. (A numerical sketch of $V(D, G)$ follows this list.)
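To make the two expectations concrete, here is a hypothetical sketch that estimates $V(D, G)$ on one batch, reusing the Generator and Discriminator classes sketched earlier; the batch size and the random stand-in for real images are arbitrary choices, not from the original text.

```python
import torch

G, D = Generator(), Discriminator()
x_real = torch.randn(64, 784)  # stand-in for a batch of real images
z = torch.randn(64, 100)       # a batch of noise vectors

# V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], estimated by batch means.
eps = 1e-8  # keeps log() away from 0
v = (torch.log(D(x_real) + eps).mean()
     + torch.log(1 - D(G(z)) + eps).mean())
print(v.item())  # D wants this value large; G wants the second term small
```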

The GAN training process:

[Figure: the GAN training algorithm from the original paper, alternating updates of D and G]

Here $G$ and $D$ are trained alternately. The first step trains $D$: we want $V(G, D)$ to be as large as possible, so the gradient is added (gradient ascent). The second step trains $G$: we want $V(G, D)$ to be as small as possible, so the gradient is subtracted (gradient descent). The whole training process alternates between these two steps, as in the sketch below.
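A minimal sketch of this alternating loop in PyTorch, continuing with the classes above. It uses the common binary-cross-entropy formulation, which matches $V(D, G)$ up to sign; the random "real" batch, learning rate, and batch size are placeholders rather than values from the original text.

```python
import torch
import torch.nn as nn

G, D = Generator(), Discriminator()
bce = nn.BCELoss()
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
ones, zeros = torch.ones(64, 1), torch.zeros(64, 1)

for step in range(1000):
    x_real = torch.randn(64, 784)  # replace with a batch from a real dataset
    z = torch.randn(64, 100)

    # Step 1: update D. Minimizing this BCE loss is gradient ascent on V.
    d_loss = bce(D(x_real), ones) + bce(D(G(z).detach()), zeros)
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # Step 2: update G so that D(G(z)) moves toward 1 (the widely used
    # non-saturating variant of descending on log(1 - D(G(z)))).
    g_loss = bce(D(G(z)), ones)
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
```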

Optimal discriminator

For a fixed $G$, the optimal discriminator $D$ is
$$D_G^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}$$
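The reasoning, following the pointwise argument from the original GAN paper: writing the two expectations as one integral,

$$V(D, G) = \int_x \big[\, p_{\text{data}}(x) \log D(x) + p_g(x) \log(1 - D(x)) \,\big]\, dx,$$

and for fixed $a = p_{\text{data}}(x)$ and $b = p_g(x)$, the function $f(y) = a \log y + b \log(1 - y)$ attains its maximum on $(0, 1)$ at $y^* = \frac{a}{a+b}$ (set $f'(y) = \frac{a}{y} - \frac{b}{1-y} = 0$). Maximizing the integrand pointwise gives the expression above. Note that when $p_g = p_{\text{data}}$, this yields $D_G^*(x) = 1/2$, which is exactly the 0.5 discriminant probability mentioned in the figure discussion.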

DCGAN

DCGAN is also an early model. GANs are applied to image generation and CNNs are good at processing images, so combining the two is natural.

The generator $G$

[Figure: the DCGAN generator, a 100-dim input vector upsampled through four transposed-convolution layers]

The random input vector has length 100, followed by four transposed-convolution layers. No pooling is used. Both D and G use batch normalization. Fully connected layers are removed, making the model a fully convolutional network. The G network uses ReLU as its activation function, except for the last layer, which uses tanh. The D network uses LeakyReLU as its activation function. A sketch of such a generator follows.
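A PyTorch sketch of a generator following these design rules. The 64×64 output size and channel widths are the commonly used DCGAN choices, assumed here rather than taken from the post; the first layer plays the role of projecting and reshaping the noise, and the remaining four are the transposed convolutions described above.

```python
import torch
import torch.nn as nn

class DCGANGenerator(nn.Module):
    """100-dim noise -> 64x64 RGB image; fully convolutional, no pooling/FC."""
    def __init__(self, z_dim=100, ngf=64):
        super().__init__()
        self.net = nn.Sequential(
            # Project the noise vector to a 4x4 feature map.
            nn.ConvTranspose2d(z_dim, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # Four transposed-conv layers, each doubling the spatial size.
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf, 3, 4, 2, 1, bias=False),
            nn.Tanh(),  # the last layer uses tanh, as described above
        )

    def forward(self, z):
        # Reshape (batch, z_dim) noise to (batch, z_dim, 1, 1) feature maps.
        return self.net(z.view(-1, z.size(1), 1, 1))

g = DCGANGenerator()
img = g(torch.randn(16, 100))  # -> shape (16, 3, 64, 64)
```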


Copyright notice
This article was written by [chad_ lee]. When reposting, please include a link to the original. Thanks.
https://yzsam.com/2022/02/202202220543470212.html