[ML] Li Hongyi Lecture 3: Gradient Descent & Classification (Gaussian Distribution)
2022-07-02 23:26:00 【Exotic moon】
At any point, the gradient is normal to the contour line of the loss (perpendicular to its tangent), so gradient descent always steps perpendicular to the contours.
Each iteration computes the loss at the current point, then moves against the gradient at that point to reach a new point; the gradient is recomputed there, and the descent repeats.
When doing gradient descent, tune the learning rate carefully: too small and convergence is slow, too large and the loss oscillates or even diverges.
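The loop described above can be sketched in a few lines. This is a minimal illustration with an assumed toy loss L(w) = (w − 3)², not code from the lecture:

```python
# Minimal gradient-descent sketch on the assumed toy loss L(w) = (w - 3)**2,
# whose derivative is dL/dw = 2*(w - 3).

def grad(w):
    return 2.0 * (w - 3.0)

w = 0.0      # starting point
eta = 0.1    # learning rate -- this is the knob to tune carefully
for step in range(100):
    w = w - eta * grad(w)   # step against the gradient, then recompute

print(round(w, 4))   # ends up close to the minimum at w = 3
```

With eta = 0.1 the error shrinks by a constant factor each step; with eta too large (here, above 1.0) the same loop would diverge.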
Ideally every parameter gets its own learning rate; Adagrad is a recommended way to do this.
Adagrad: divide each parameter's learning rate by the root mean square of all of that parameter's past derivative values.
Why this works:
Considering a single parameter, the larger the first derivative at a point, the farther that point is from the minimum, so a step proportional to the gradient is reasonable.
When several parameters are considered together, this is no longer necessarily true:
Looking at w1 alone (blue): a is farther from the minimum than b, and its derivative is larger.
Looking at w2 alone (green): c is farther from the minimum than d, and its derivative is larger.
But the comparison breaks down across parameters: the derivative at a is clearly smaller than at c, yet a is farther from the minimum. So gradient magnitudes cannot be compared across parameters.
The right measure of the distance to the minimum is |first derivative| / second derivative.
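For a quadratic loss this ratio is exact; a one-line derivation (standard calculus, not spelled out in the original notes):

```latex
% For y = ax^2 + bx + c with a > 0, the minimum sits at x_0 = -b/(2a).
% The distance from any point x to the minimum is therefore
\left|x - x_0\right| = \left|x + \frac{b}{2a}\right|
                     = \frac{|2ax + b|}{2a}
                     = \frac{|y'(x)|}{y''(x)}
% since y'(x) = 2ax + b and y''(x) = 2a.
```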
The Adagrad update is

w^{t+1} = w^t − (η / σ^t) · g^t,   where σ^t = sqrt( (1/(t+1)) · Σ_{i=0}^{t} (g^i)² )

Here η is a constant learning rate, g^t denotes the first derivative at step t, and σ^t, the root mean square of all past derivatives, stands in for the second derivative (it costs nothing extra to compute).
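The update above can be sketched as follows; the one-parameter toy loss L(w) = (w − 3)² is an assumption for illustration, not from the lecture:

```python
import math

# Adagrad sketch on the assumed toy loss L(w) = (w - 3)**2.
# The learning rate is divided by the root of the accumulated sum of
# squared past gradients -- the "root mean square of past derivatives".

def grad(w):
    return 2.0 * (w - 3.0)

w, eta = 0.0, 1.0
sum_sq = 0.0                                     # running sum of g^2
for t in range(200):
    g = grad(w)
    sum_sq += g * g
    w = w - eta * g / math.sqrt(sum_sq + 1e-8)   # per-parameter adaptive step

print(round(w, 3))   # close to the minimum at w = 3
```

With several parameters, each one would keep its own `sum_sq`, so parameters with a history of large gradients automatically take smaller steps.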
Stochastic gradient descent: instead of summing the loss over all training examples before each update, pick one example at a time and update immediately. Each step is noisier, but far more steps are taken in the same amount of time.
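A minimal sketch of the difference, assuming a toy dataset generated by y = 2x and the model ŷ = w·x (these assumptions are for illustration only):

```python
import random

# Stochastic gradient descent sketch. Vanilla GD would sum the error over
# ALL examples before one update; SGD updates after each randomly drawn
# example, trading noisy steps for many more of them.

random.seed(0)
data = [(x, 2.0 * x) for x in [1.0, 2.0, 3.0, 4.0]]   # assumed toy data: y = 2x

w, eta = 0.0, 0.05
for step in range(500):
    x, y = random.choice(data)     # one example per update
    g = 2.0 * (w * x - y) * x      # gradient of (w*x - y)**2 w.r.t. w
    w = w - eta * g

print(round(w, 3))   # close to the true slope, w = 2
```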
Classification example: each input (a Pokémon) is described by a vector of its stats.
Maximum likelihood estimation: given the observed results, infer which source distribution most probably generated them.
Assume each class's points are sampled from a Gaussian distribution, a function determined by a mean μ and a covariance Σ.
The closer a point is to the Gaussian's center (its mean, the yellow region), the higher its probability of being sampled.
Because the blue points are sampled independently, the probability that this Gaussian generated all 79 blue points is the product of their individual probabilities.
Computing the maximum-likelihood estimate: because the distribution is Gaussian, the answer has a closed form. The best μ is simply the sample mean, and the best Σ is the sample covariance.
After finding the maximum-likelihood Gaussian for each class from its blue points, classification can begin: assign x to class 1 when P(C1|x) > 0.5.
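The closed-form estimates can be illustrated directly. The four 2-D points below are assumed stand-ins for the lecture's 79 blue dots:

```python
# Maximum-likelihood fit of a Gaussian to one class.
# The MLE has a closed form: mu* is the sample mean,
# Sigma* is the sample covariance.

points = [(75.0, 50.0), (80.0, 60.0), (65.0, 55.0), (70.0, 45.0)]  # assumed toy data
n = len(points)

# mu* = average of the points, per dimension
mu = [sum(p[d] for p in points) / n for d in range(2)]

# Sigma* = average outer product of the deviations (x - mu)
sigma = [[sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in points) / n
          for j in range(2)] for i in range(2)]

print(mu)      # sample mean
print(sigma)   # 2x2 sample covariance
```

Fitting one such Gaussian per class gives the class-conditional densities P(x|C1) and P(x|C2) used below.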
With only two features, classifying Water-type vs Normal-type Pokémon works poorly: the accuracy is only 47%. What about raising the dimensionality?
Increase the features to all 7 attributes.
Seven dimensions is still not ideal. Next, reduce the number of parameters of the two class-conditional Gaussians:
after making the two classes share a single covariance matrix, the accuracy improves (and the decision boundary becomes linear).
Three-step summary: (1) the function set (model) is the posterior probability P(C1|x); (2) the goodness of a function is the likelihood of the training data; (3) the best function is found by maximum likelihood (here, in closed form).
Posterior probability: by Bayes' rule,
P(C1|x) = P(x|C1)P(C1) / (P(x|C1)P(C1) + P(x|C2)P(C2)) = 1 / (1 + exp(−z)) = σ(z), where z = ln[ P(x|C1)P(C1) / (P(x|C2)P(C2)) ].
Simplifying the formula: with a shared covariance Σ, the quadratic terms in z cancel and z reduces to a linear function z = w·x + b, which is exactly why the decision boundary is linear.
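The simplification can be checked numerically. The 1-D class means, shared variance, and equal priors below are assumed toy values, not the lecture's fitted numbers:

```python
import math

# Posterior sketch for two 1-D Gaussian classes with a SHARED variance.
# P(C1|x) = sigmoid(z), and the shared variance makes z linear in x.

mu1, mu2, var = 2.0, -1.0, 1.0   # assumed class means, shared variance
p1, p2 = 0.5, 0.5                # assumed equal class priors

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Expanding z = ln[P(x|C1)P(C1) / (P(x|C2)P(C2))], the x**2 terms cancel:
# z = (mu1 - mu2)/var * x - (mu1**2 - mu2**2)/(2*var) + ln(p1/p2)
w = (mu1 - mu2) / var
b = -(mu1**2 - mu2**2) / (2.0 * var) + math.log(p1 / p2)

def posterior(x):
    return sigmoid(w * x + b)

print(round(posterior(2.0), 3))   # deep inside class 1 -> close to 1
print(round(posterior(0.5), 3))   # midpoint of the means -> exactly 0.5
```

At the midpoint of the two means (with equal priors) the posterior is 0.5, confirming that the boundary w·x + b = 0 is where the classes tie.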