[ML] Li Hongyi Lecture 3: Gradient Descent & Classification (Gaussian Distribution)
2022-07-02 23:26:00 【Exotic moon】
At any point, the gradient is normal to the contour line of the loss (perpendicular to its tangent), so gradient descent always steps perpendicular to the contours.
Each iteration computes the loss at the current point, then moves against the gradient at that point to reach a new point; the gradient is recomputed there, and the descent repeats.
When doing gradient descent, tune the learning rate carefully: too small and convergence is slow, too large and the loss oscillates or even diverges.
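The loop described above can be sketched in a few lines. This is a minimal illustration with an assumed toy loss L(w) = (w − 3)², not code from the lecture:

```python
# Minimal gradient-descent sketch on the assumed toy loss L(w) = (w - 3)**2,
# whose derivative is dL/dw = 2*(w - 3).

def grad(w):
    return 2.0 * (w - 3.0)

w = 0.0      # starting point
eta = 0.1    # learning rate -- this is the knob to tune carefully
for step in range(100):
    w = w - eta * grad(w)   # step against the gradient, then recompute

print(round(w, 4))   # ends up close to the minimum at w = 3
```

With eta = 0.1 the error shrinks by a constant factor each step; with eta too large (here, above 1.0) the same loop would diverge.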
Ideally every parameter gets its own learning rate; Adagrad is a recommended way to do this.
Adagrad: divide each parameter's learning rate by the root mean square of all of that parameter's past derivative values.
Why this works:
Considering a single parameter, the larger the first derivative at a point, the farther that point is from the minimum, so a step proportional to the gradient is reasonable.
When several parameters are considered together, this is no longer necessarily true:
Looking at w1 alone (blue): a is farther from the minimum than b, and its derivative is larger.
Looking at w2 alone (green): c is farther from the minimum than d, and its derivative is larger.
But the comparison breaks down across parameters: the derivative at a is clearly smaller than at c, yet a is farther from the minimum. So gradient magnitudes cannot be compared across parameters.
The right measure of the distance to the minimum is |first derivative| / second derivative.
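For a quadratic loss this ratio is exact; a one-line derivation (standard calculus, not spelled out in the original notes):

```latex
% For y = ax^2 + bx + c with a > 0, the minimum sits at x_0 = -b/(2a).
% The distance from any point x to the minimum is therefore
\left|x - x_0\right| = \left|x + \frac{b}{2a}\right|
                     = \frac{|2ax + b|}{2a}
                     = \frac{|y'(x)|}{y''(x)}
% since y'(x) = 2ax + b and y''(x) = 2a.
```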
The Adagrad update is

w^{t+1} = w^t − (η / σ^t) · g^t,   where σ^t = sqrt( (1/(t+1)) · Σ_{i=0}^{t} (g^i)² )

Here η is a constant learning rate, g^t denotes the first derivative at step t, and σ^t, the root mean square of all past derivatives, stands in for the second derivative (it costs nothing extra to compute).
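The update above can be sketched as follows; the one-parameter toy loss L(w) = (w − 3)² is an assumption for illustration, not from the lecture:

```python
import math

# Adagrad sketch on the assumed toy loss L(w) = (w - 3)**2.
# The learning rate is divided by the root of the accumulated sum of
# squared past gradients -- the "root mean square of past derivatives".

def grad(w):
    return 2.0 * (w - 3.0)

w, eta = 0.0, 1.0
sum_sq = 0.0                                     # running sum of g^2
for t in range(200):
    g = grad(w)
    sum_sq += g * g
    w = w - eta * g / math.sqrt(sum_sq + 1e-8)   # per-parameter adaptive step

print(round(w, 3))   # close to the minimum at w = 3
```

With several parameters, each one would keep its own `sum_sq`, so parameters with a history of large gradients automatically take smaller steps.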
Stochastic gradient descent: instead of summing the loss over all training examples before each update, pick one example at a time and update immediately. Each step is noisier, but far more steps are taken in the same amount of time.
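A minimal sketch of the difference, assuming a toy dataset generated by y = 2x and the model ŷ = w·x (these assumptions are for illustration only):

```python
import random

# Stochastic gradient descent sketch. Vanilla GD would sum the error over
# ALL examples before one update; SGD updates after each randomly drawn
# example, trading noisy steps for many more of them.

random.seed(0)
data = [(x, 2.0 * x) for x in [1.0, 2.0, 3.0, 4.0]]   # assumed toy data: y = 2x

w, eta = 0.0, 0.05
for step in range(500):
    x, y = random.choice(data)     # one example per update
    g = 2.0 * (w * x - y) * x      # gradient of (w*x - y)**2 w.r.t. w
    w = w - eta * g

print(round(w, 3))   # close to the true slope, w = 2
```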
Classification example: each input (a Pokémon) is described by a vector of its stats.
Maximum likelihood estimation: given the observed results, infer which source distribution most probably generated them.
Assume each class's points are sampled from a Gaussian distribution, a function determined by a mean μ and a covariance Σ.
The closer a point is to the Gaussian's center (its mean, the yellow region), the higher its probability of being sampled.
Because the blue points are sampled independently, the probability that this Gaussian generated all 79 blue points is the product of their individual probabilities.
Computing the maximum-likelihood estimate: because the distribution is Gaussian, the answer has a closed form. The best μ is simply the sample mean, and the best Σ is the sample covariance.
After finding the maximum-likelihood Gaussian for each class from its blue points, classification can begin: assign x to class 1 when P(C1|x) > 0.5.
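The closed-form estimates can be illustrated directly. The four 2-D points below are assumed stand-ins for the lecture's 79 blue dots:

```python
# Maximum-likelihood fit of a Gaussian to one class.
# The MLE has a closed form: mu* is the sample mean,
# Sigma* is the sample covariance.

points = [(75.0, 50.0), (80.0, 60.0), (65.0, 55.0), (70.0, 45.0)]  # assumed toy data
n = len(points)

# mu* = average of the points, per dimension
mu = [sum(p[d] for p in points) / n for d in range(2)]

# Sigma* = average outer product of the deviations (x - mu)
sigma = [[sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in points) / n
          for j in range(2)] for i in range(2)]

print(mu)      # sample mean
print(sigma)   # 2x2 sample covariance
```

Fitting one such Gaussian per class gives the class-conditional densities P(x|C1) and P(x|C2) used below.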
With only two features, classifying Water-type vs Normal-type Pokémon works poorly: the accuracy is only 47%. What about raising the dimensionality?
Increase the features to all 7 attributes.
Seven dimensions is still not ideal. Next, reduce the number of parameters of the two class-conditional Gaussians:
after making the two classes share a single covariance matrix, the accuracy improves (and the decision boundary becomes linear).
Three-step summary: (1) the function set (model) is the posterior probability P(C1|x); (2) the goodness of a function is the likelihood of the training data; (3) the best function is found by maximum likelihood (here, in closed form).
Posterior probability: by Bayes' rule,
P(C1|x) = P(x|C1)P(C1) / (P(x|C1)P(C1) + P(x|C2)P(C2)) = 1 / (1 + exp(−z)) = σ(z), where z = ln[ P(x|C1)P(C1) / (P(x|C2)P(C2)) ].
Simplifying the formula: with a shared covariance Σ, the quadratic terms in z cancel and z reduces to a linear function z = w·x + b, which is exactly why the decision boundary is linear.
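The simplification can be checked numerically. The 1-D class means, shared variance, and equal priors below are assumed toy values, not the lecture's fitted numbers:

```python
import math

# Posterior sketch for two 1-D Gaussian classes with a SHARED variance.
# P(C1|x) = sigmoid(z), and the shared variance makes z linear in x.

mu1, mu2, var = 2.0, -1.0, 1.0   # assumed class means, shared variance
p1, p2 = 0.5, 0.5                # assumed equal class priors

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Expanding z = ln[P(x|C1)P(C1) / (P(x|C2)P(C2))], the x**2 terms cancel:
# z = (mu1 - mu2)/var * x - (mu1**2 - mu2**2)/(2*var) + ln(p1/p2)
w = (mu1 - mu2) / var
b = -(mu1**2 - mu2**2) / (2.0 * var) + math.log(p1 / p2)

def posterior(x):
    return sigmoid(w * x + b)

print(round(posterior(2.0), 3))   # deep inside class 1 -> close to 1
print(round(posterior(0.5), 3))   # midpoint of the means -> exactly 0.5
```

At the midpoint of the two means (with equal priors) the posterior is 0.5, confirming that the boundary w·x + b = 0 is where the classes tie.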