Normal equation
2022-06-24 10:18:00 【Wanderer001】
Reference: Normal equation - Cloud+ Community - Tencent Cloud
Contents
1. What is the normal equation
2. Using the normal equation
3. The non-invertible case
4. Comparison between the normal equation and gradient descent
1. What is the normal equation
Gradient descent finds the optimal parameters by taking the partial derivative of the cost function with respect to each parameter and updating the parameters step by step in an iterative loop until they converge to the global minimum, which yields the optimal parameters.
The normal equation instead finds the optimal solution in a single step.
Idea: for a simple function, take the derivative with respect to the parameter, set it to 0, and solve for the parameter. For example, for J(θ) = aθ² + bθ + c with a > 0, setting dJ/dθ = 2aθ + b = 0 gives θ = -b/(2a).

Real-world problems have many parameters, so we take the partial derivative with respect to every parameter, set each one to 0, and solve for all the parameters together, which gives the global optimum. The difficulty is that working through this parameter by parameter is time-consuming.
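The contrast between the iterative approach and the one-step approach can be sketched on a one-parameter quadratic. The cost J(θ) = aθ² + bθ + c, the coefficient values, the learning rate, and the iteration count below are all illustrative assumptions, not values from the original article.

```python
def closed_form_minimum(a, b):
    """Minimizer of J(theta) = a*theta**2 + b*theta + c for a > 0.

    Setting dJ/dtheta = 2*a*theta + b to zero gives theta = -b / (2*a).
    """
    return -b / (2.0 * a)


def gradient_descent_minimum(a, b, alpha=0.1, iterations=200, theta=0.0):
    """Approach the same minimizer with repeated gradient steps."""
    for _ in range(iterations):
        gradient = 2.0 * a * theta + b  # dJ/dtheta at the current theta
        theta -= alpha * gradient
    return theta


# Hypothetical coefficients: J(theta) = theta**2 - 4*theta + 5
theta_direct = closed_form_minimum(1.0, -4.0)
theta_iterative = gradient_descent_minimum(1.0, -4.0)
print(theta_direct, theta_iterative)
```

Both calls land on essentially the same θ. The closed form needs no learning rate and no loop, which is the property the normal equation carries over to many parameters.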
2. Using the normal equation
Consider an example with 4 samples and 4 feature variables x1, x2, x3, x4, where the observed value is y. When writing the cost function, an extra feature x0 = 1 (the intercept term) has to be added to every sample.

Next, store the feature values in the matrix X, and in the same way store the observations in the vector y.
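Building X and y might look like the sketch below. The numbers are hypothetical placeholders (the article's data table is not reproduced here); only the shapes, 4 samples and 4 features plus the added x0 column, follow the text.

```python
import numpy as np

# Hypothetical data: 4 samples (rows) with 4 features x1..x4 (columns).
features = np.array([
    [2.0, 5.0, 1.0, 4.0],
    [1.0, 3.0, 2.0, 4.0],
    [1.5, 3.0, 2.0, 3.0],
    [0.8, 2.0, 1.0, 3.5],
])

# Hypothetical observations, one per sample.
y = np.array([4.6, 2.3, 3.1, 1.8])

# Prepend the column x0 = 1 so that theta_0 acts as the intercept.
m = features.shape[0]
X = np.hstack([np.ones((m, 1)), features])

print(X.shape, y.shape)
```

Each row of X is one sample and each column is one feature, with the column of ones first.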

The optimal θ is then given by the following formula:

$$\theta = (X^{T}X)^{-1}X^{T}y$$
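A minimal sketch of evaluating the normal equation θ = (XᵀX)⁻¹Xᵀy with NumPy follows. The function name `normal_equation` and the data are assumptions made for illustration; the data are generated from a known linear rule so the result can be checked by eye. The sketch solves the linear system (XᵀX)θ = Xᵀy rather than forming the inverse explicitly, which gives the same θ whenever XᵀX is invertible.

```python
import numpy as np


def normal_equation(X, y):
    """Solve (X^T X) theta = X^T y for theta.

    Assumes X^T X is invertible; np.linalg.solve raises LinAlgError
    if the matrix is exactly singular.
    """
    return np.linalg.solve(X.T @ X, X.T @ y)


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2 (no noise).
features = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 5.0],
])
y = 1.0 + 2.0 * features[:, 0] + 3.0 * features[:, 1]

# Add the x0 = 1 column, then solve in one step.
X = np.hstack([np.ones((features.shape[0], 1)), features])
theta = normal_equation(X, y)
print(theta)
```

Because the data are noise-free, the recovered θ matches the generating coefficients (1, 2, 3) up to floating-point rounding.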
About this formula :

The features of the i-th training sample form a vector x(i) (remember to include x0(i) = 1). The design matrix X has the transposes of these sample vectors as its rows, and y is the vector of observations. With this notation, the formula above gives the optimal θ directly.
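For readers who want to see where the formula comes from, here is a short derivation sketch. It assumes the usual least-squares cost for linear regression, J(θ) = (1/2m) Σᵢ (θᵀx(i) − y(i))², which the article does not write out explicitly.

```latex
\begin{aligned}
J(\theta) &= \frac{1}{2m}\,(X\theta - y)^{T}(X\theta - y) \\
\nabla_{\theta} J(\theta) &= \frac{1}{m}\,X^{T}(X\theta - y) = 0 \\
X^{T}X\,\theta &= X^{T}y \\
\theta &= (X^{T}X)^{-1}X^{T}y \qquad \text{(when } X^{T}X \text{ is invertible)}
\end{aligned}
```

Setting the gradient to zero is the many-parameter version of setting a single derivative to zero, and the last step is where the inverse of XᵀX enters.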
3. The non-invertible case
Note that the normal equation requires computing the inverse matrix $(X^{T}X)^{-1}$. When $X^{T}X$ is not invertible, there are generally two causes:
- Redundant features (linearly dependent columns)
- Too many features (for example, m ≤ n). Remedy: delete some features, or use regularization
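The regularization remedy can be sketched as follows. This assumes the common ridge-style form θ = (XᵀX + λL)⁻¹Xᵀy, where L is the identity matrix with its first diagonal entry zeroed so the intercept is not penalized; the function name, the value of λ, and the data are illustrative assumptions rather than something the article specifies.

```python
import numpy as np


def regularized_normal_equation(X, y, lam=1.0):
    """Solve (X^T X + lam * L) theta = X^T y.

    L is the identity with L[0, 0] = 0, so the intercept column x0
    is left unpenalized. For lam > 0 the system is solvable even
    when X^T X itself is singular.
    """
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)


# Hypothetical data: m = 2 samples, 3 features plus the x0 column.
X = np.array([
    [1.0, 1.0, 2.0, 3.0],
    [1.0, 2.0, 1.0, 5.0],
])
y = np.array([3.0, 4.0])

# X has fewer rows than columns, so X^T X (4 x 4) cannot be invertible.
print(np.linalg.matrix_rank(X), X.shape[1])

theta = regularized_normal_equation(X, y, lam=1.0)
print(theta)
```

The choice of λ trades fit against parameter size and would normally be tuned; the value 1.0 here is only a placeholder.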
In fact, the underlying reason comes from linear algebra:
First, avoiding each of these two situations is a necessary condition for invertibility.
By the property $r(A^{T}A) = r(A)$, the invertibility of $A^{T}A$ reduces to whether A has full column rank (for a square A, this is the same as A being invertible).
The first case: some column vectors are linearly dependent, so the rank of the matrix is less than its number of columns, and $A^{T}A$ is not invertible.
The second case:
- When m < n, the dimension of the vectors is less than the number of vectors; here that means the number of samples is less than the number of features, so the columns are linearly dependent
- When m = n, A is square: $A^{T}A$ is not invertible when |A| = 0 and invertible when |A| ≠ 0
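The rank argument above can be checked numerically. The sketch below uses `np.linalg.matrix_rank` on three small hypothetical matrices; the helper name `has_full_column_rank` and all values are made up for illustration.

```python
import numpy as np


def has_full_column_rank(A):
    """True when the columns of A are linearly independent,
    which is the condition for A^T A to be invertible."""
    return np.linalg.matrix_rank(A) == A.shape[1]


# Case 1: a redundant feature (third column = 2 * second column).
A_redundant = np.array([
    [1.0, 1.0, 2.0],
    [1.0, 2.0, 4.0],
    [1.0, 3.0, 6.0],
    [1.0, 4.0, 8.0],
])

# Case 2: fewer samples (rows) than columns.
A_wide = np.array([
    [1.0, 2.0, 3.0],
    [1.0, 5.0, 7.0],
])

# Case 3: independent columns and more samples than columns.
A_good = np.array([
    [1.0, 1.0],
    [1.0, 2.0],
    [1.0, 4.0],
])

for name, A in [("redundant", A_redundant), ("wide", A_wide), ("good", A_good)]:
    print(name, has_full_column_rank(A))
```

Only the third matrix passes the check, so only for it can the plain normal equation be applied without removing features or regularizing.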
4. Comparison between the normal equation and gradient descent
Gradient descent:
Disadvantages:
- The learning rate α has to be chosen
- Many iterations are needed
Advantages:
- It still works well when the number of features is large
Normal equation:
Disadvantages:
- It needs to compute $(X^{T}X)^{-1}$; the cost grows roughly with the cube of the matrix dimension, so the complexity is high
- It is slow when the number of features is large
Advantages:
- No learning rate α is needed
- No iterations are needed
Summary: the choice depends on the number of features. With fewer than about 10,000 features, choose the normal equation; with more than 10,000, consider gradient descent or another algorithm.
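To make the comparison concrete, the sketch below fits the same small linear regression both ways. Everything in it is an illustrative assumption: the data are hypothetical, noise-free, and already centered so that gradient descent converges quickly, and the learning rate and iteration count were picked for this data only.

```python
import numpy as np


def normal_equation(X, y):
    """One-step solution of (X^T X) theta = X^T y."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def gradient_descent(X, y, alpha=0.3, iterations=500):
    """Batch gradient descent on J(theta) = (1/2m) * ||X theta - y||^2."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        gradient = X.T @ (X @ theta - y) / m
        theta -= alpha * gradient
    return theta


# Hypothetical centered features; targets follow y = 1 + 2*x1 + 3*x2.
features = np.array([
    [-1.0, -1.0],
    [-1.0, 1.0],
    [1.0, -1.0],
    [1.0, 1.0],
    [0.0, 2.0],
    [0.0, -2.0],
])
y = 1.0 + 2.0 * features[:, 0] + 3.0 * features[:, 1]
X = np.hstack([np.ones((features.shape[0], 1)), features])

theta_ne = normal_equation(X, y)
theta_gd = gradient_descent(X, y)
print(theta_ne)
print(theta_gd)
```

The two results agree closely here, but gradient descent needed a tuned α and hundreds of passes, while the normal equation needed one linear solve whose cost grows quickly with the number of features.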