当前位置:网站首页>Gan: generative advantageous nets -- paper analysis and the mathematical concepts behind it
Gan: generative advantageous nets -- paper analysis and the mathematical concepts behind it
2022-07-28 04:56:00 【gongyuandaye】
One 、 Purpose
Input : Random noise
Output : Sample the image of self training sample distribution
Two 、 Model
Generate models G: Capture data distribution , Expect to produce as real a picture as possible , Cheated D.
Discriminant model D: Evaluate a sample from a training set rather than G The possibility of , Expect to accurately distinguish true and false pictures .
3、 ... and 、 Training
Two person system minimax game :
Algorithm is as follows :
among :
1、 Complete in the inner loop D The optimization of is complex , And it will be fitted on a limited data set , So finish first k round D The optimization of the ( The gradient rises ), Finish another round G The optimization of the ( gradient descent , There are changes in actual use ). The result shows that as long as G Change is slow enough ,D It can be maintained near the optimal solution .
2、min The process ( gradient descent ) in , When the generated sample is very bad , The output value of the discriminator will be very small , The gradient of generator loss function here is very small , It makes the generator learn very slowly . contrary , When the generated sample is good , The output value of the discriminator will be large , The generator loss function has a large gradient here , Generator update is large .
So for min Make the following transformation :
3、 When the distribution of training samples and data does not coincide , It is impossible to update the gradient , So there is LSGAN、WGAN etc. .
Four 、 Mathematical concepts
4.1 Maximum likelihood estimation
P P P d a t a data data ( x ) (x) (x):G The data distribution of the image you want to find .
P P P G G G ( x ; θ ) (x;θ) (x;θ): from θ Determined distribution .
The goal is : determine θ Make the above two distributions similar , hypothesis P P P G G G ( x ; θ ) (x;θ) (x;θ) Gaussian mixture model ,θ The matrix and covariance matrix of Gaussian distribution .
step :
from P P P d a t a data data ( x ) (x) (x) Sample {𝑥1, 𝑥2, … , 𝑥𝑚}, Calculation P P P G G G ( x ; θ ) (x;θ) (x;θ) The likelihood of producing these samples :
By maximum likelihood estimation , We can find θ*:
( The subtraction of the penultimate step and θ irrelevant , from KL The last step of divergence definition is min)
Maximize likelihood = To minimize the KL The divergence ( A measure of the similarity between two probability distributions , Asymmetric )
4.2 Definition of generator
utilize neural network To build the generator G, The network output defines a density distribution 𝑃𝐺.
P P P d a t a data data Unknown , Only its sampling data can be obtained . Sample from a prior probability distribution z As input . To make the two distributions close, we need to calculate the divergence of the two distributions , Need a discriminator .
4.3 Definition of discriminator

because G Fixed , Therefore, the training should be maximized V(G, D) Of The optimal D*.
The training process has the following characteristics :
Maximize V(G, D) The formula of is derived as follows :
So equation (4) Can have :
Thus there are :
among :
1、 The sum of probability distributions is 1.
2、JSD Express JS The divergence ( symmetry ), It is KL A distortion of divergence , It also indicates the difference between the two distributions :
Go back to the original formula (1), Now come back to beg The optimal G, After the above derivation, there are :
JS by 0, Indicates that the two distributions are exactly the same , The upper bound is log2, That is, when the two distributions do not coincide , here GAN Obviously, I can't train .
4.4 The calculation process
To minimize JSD When , Think of it as a loss function L(G), Through gradient descent :
To maximize V(G, D) when , Using the gradient rise method .
So how to calculate in a given G In the case of V(G, D) To maximize the D* Well ?
Maximize the following formula :
namely :
The complete calculation process is as follows :
边栏推荐
- 启发国内学子学习少儿机器人编程教育
- Evolution of ape counseling technology: helping teaching and learning conceive future schools
- Flink mind map
- (3.1) [Trojan horse synthesis technology]
- FreeRTOS startup process, coding style and debugging method
- Dynamic SQL and paging
- linux下安装mysql
- 你必需要了解的saas架构设计?
- HDU 1522 marriage is stable
- FPGA:使用PWM波控制LED亮度
猜你喜欢

基于MPLS构建虚拟专网的配置实验

Real intelligence has been certified by two of the world's top market research institutions and has entered the global camp of excellence
![[Oracle] 083 wrong question set](/img/10/9a5dae9542a8fed0356843c59f3c2f.png)
[Oracle] 083 wrong question set

Improve the core quality of steam education among students

05.01 string

猿辅导技术进化论:助力教与学 构想未来学校

Array or object, date operation

go-zero单体服务使用泛型简化注册Handler路由

Know etcd

Automated test tool playwright (quick start)
随机推荐
How to quickly locate bugs? How to write test cases?
Web渗透之域名(子域名)收集方法
低代码是开发的未来吗?浅谈低代码平台
Take out system file upload
(克隆虚拟机步骤)
Real intelligence has been certified by two of the world's top market research institutions and has entered the global camp of excellence
What is the core value of testing?
[Sylar] framework -chapter7-io coordination scheduling module
启发国内学子学习少儿机器人编程教育
Look at the experience of n-year software testing summarized by people who came over the test
alter和confirm,prompt的区别
Leetcode 15. sum of three numbers
Installing MySQL under Linux
吉利AI面试题【杭州多测师】【杭州多测师_王sir】
Have you learned the common SQL interview questions on the short video platform?
Design and development of C language ATM system project
The first artificial intelligence security competition starts. Three competition questions are waiting for you to fight
(3.1) [Trojan horse synthesis technology]
RT_ Use of thread message queue
[Hongke technology] Application of network Multimeter in data center