The principle of the normal equation method and how it differs from gradient descent
2022-07-26 22:01:00 【TranSad】
The normal equation method is an alternative to gradient descent that can be used to solve multiple linear regression problems. Gradient descent has to update the parameters iteratively, step after step; the normal equation method only needs to solve a system of equations once to obtain the optimum. This article briefly introduces its principle and how it differs from gradient descent.
First, let's look at a figure that sets up the problem. The figure below shows a loss function (cost function) J(θ1). Suppose it has only one parameter, θ1. Using the gradient descent update given in the figure, we can find a local or global optimum.

But gradient descent is not our topic here. The other point worth noting in the figure above is that, at the optimum, the derivative of J(θ1) with respect to θ1 is 0. Now move to higher dimensions (see the figure below), where there are several parameters, θ0 and θ1, and the optimal solution is at the position marked by the arrow. Similarly, at this point the partial derivative of J(θ0,θ1) with respect to every parameter is 0.
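The observation that the derivative vanishes at the optimum can be checked numerically. The following is a minimal sketch, assuming a one-parameter model h(x) = θ1·x with a squared-error loss; the data, the starting value and the learning rate are made up for illustration and are not taken from the original figures.

```python
import numpy as np

# Made-up data for illustration: y is roughly 2 * x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

def cost(theta1):
    """Squared-error loss J(theta1) for the one-parameter model h(x) = theta1 * x."""
    return np.mean((theta1 * x - y) ** 2) / 2

def cost_derivative(theta1):
    """Derivative dJ/dtheta1 of the loss above."""
    return np.mean((theta1 * x - y) * x)

theta1 = 0.0   # starting guess (assumed)
alpha = 0.05   # learning rate (assumed)
for _ in range(200):
    # Gradient descent update: step against the derivative.
    theta1 -= alpha * cost_derivative(theta1)

# At the point where gradient descent settles, the derivative is (numerically) zero.
print("theta1:", theta1)
print("J(theta1):", cost(theta1))
print("dJ/dtheta1:", cost_derivative(theta1))
```

The normal equation method starts from the same fact, but solves "derivative equals 0" directly instead of stepping towards it.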

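First, a little background:
Suppose for now that we only have two parameters, θ0 and θ1, to optimize. The line to be fitted is then:
$$h_\theta(x) = \theta_0 + \theta_1 x$$
The loss function, written in the usual squared-error form over m training examples (x^(i), y^(i)), is:
$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$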
Now let's extend the parameters up to θn. Since the derivative of the loss with respect to every parameter is 0 at the optimum, we can first write these conditions down:
$$\frac{\partial J(\theta)}{\partial \theta_j} = 0, \qquad j = 0, 1, \dots, n$$
Isn't this just a system of equations? There are n+1 unknowns (θ0 to θn) and n+1 equations. We only need to solve the system to obtain the optimal θ as our result. Solving it and rearranging gives the final form of θ, where X is the matrix of input features and y is the vector of target values:
$$\theta = (X^T X)^{-1} X^T y$$
That completes the idea behind the normal equation method.
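The step from "all partial derivatives are 0" to the closed form can be sketched in matrix notation. This is an outline under stated assumptions rather than the derivation from the original figures: X is taken to be the m×(n+1) matrix whose rows are the training inputs with a leading 1, y is the vector of m targets, the loss is the squared error with the conventional 1/(2m) scaling, and X^T X is assumed to be invertible.

```latex
\begin{align*}
J(\theta) &= \frac{1}{2m}\,(X\theta - y)^T (X\theta - y) \\
\nabla_\theta J(\theta) &= \frac{1}{m}\, X^T (X\theta - y) = 0 \\
X^T X\,\theta &= X^T y \\
\theta &= (X^T X)^{-1} X^T y
\end{align*}
```

The scaling factor 1/(2m) drops out when the gradient is set to 0, so it does not affect the resulting θ.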
Wait a minute! After reading the final result, do you feel that we have taken a big detour? Indeed, with the approach above θ is not easy to solve for (even though I have written down the solution). There is another equivalent and more direct way to reach the result of the normal equation method.
To show concretely how θ is computed, here is a house price prediction example that makes the forms of X, y and θ and the relationship between them clearer, and that leads to another way of reaching the normal equation result. In this example we want to use an existing data set to fit a regression model with 5 parameters, θ0 to θ4. See the figure:

The essence of this idea is: since we want an optimal θ, we can simply assume Xθ=y and work backwards to θ. This is a non-rigorous but concise line of reasoning.
In step ①, if X were invertible we could obtain θ by multiplying both sides on the left by the inverse of X. But most of the time X is not square, so it has no inverse. We therefore multiply both sides on the left by the transpose X^T, which gives X^T X θ = X^T y, where X^T X is a square matrix. Then, in step ②, we multiply both sides on the left by the inverse of X^T X (assuming it exists) to move the square matrix to the right-hand side, which gives the final solution θ = (X^T X)^{-1} X^T y.
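Steps ① and ② can be sketched in NumPy as follows. This is an illustrative sketch, not the author's code: the helper names `add_intercept` and `normal_equation`, the four features and all numbers are made up, and the targets are generated noise-free from a chosen θ so that the recovered parameters can be compared with it. The sketch uses `np.linalg.solve` on the square system rather than forming the inverse explicitly, which gives the same θ when X^T X is invertible.

```python
import numpy as np

def add_intercept(features):
    """Prepend a column of ones so that theta0 acts as the intercept."""
    return np.column_stack([np.ones(len(features)), features])

def normal_equation(X, y):
    """Solve X theta = y in the least-squares sense via the normal equation."""
    # Step 1: multiply both sides by X^T, giving (X^T X) theta = X^T y.
    A = X.T @ X
    b = X.T @ y
    # Step 2: solve the square system for theta (assumes X^T X is invertible).
    return np.linalg.solve(A, b)

# Made-up houses: size (1000 sq ft), bedrooms, floors, age (years).
features = np.array([
    [2.104, 5, 1, 45],
    [1.416, 3, 2, 40],
    [1.534, 3, 2, 30],
    [0.852, 2, 1, 36],
    [1.200, 3, 1, 20],
    [1.800, 4, 2, 10],
])
X = add_intercept(features)

# Hypothetical "true" parameters theta0..theta4, used only to generate prices.
true_theta = np.array([50.0, 100.0, 20.0, 10.0, -1.0])
y = X @ true_theta

theta = normal_equation(X, y)
print(theta)
```

With 6 examples and 5 parameters, X has shape 6×5 and is not square, which is exactly the situation in which multiplying by X^T first is needed.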
The difference between gradient descent and the normal equation method
1. Gradient descent requires choosing a learning rate α; the normal equation method does not.
2. Gradient descent requires many iterations; the normal equation method does not and solves for θ directly.
3. When the number of features n in X is large, gradient descent is more suitable, because the matrix operations in the normal equation method become very slow (inverting the (n+1)×(n+1) matrix X^T X costs roughly O(n^3)).
4. Gradient descent only approaches the optimum step by step, and on non-convex losses it can get stuck in a local optimum; the normal equation method gives the least-squares optimum directly. For linear regression itself the squared-error loss is convex, so gradient descent with a suitable α also converges to the global optimum.
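The points above can be illustrated by running both methods on the same data. The following is a minimal sketch on synthetic data; the data, the learning rate `alpha` and the iteration count are assumptions chosen so that gradient descent converges, not values from the article.

```python
import numpy as np

# Synthetic regression data: an intercept column plus two standard-normal features.
rng = np.random.default_rng(0)
m = 100
X = np.column_stack([np.ones(m), rng.normal(size=(m, 2))])
y = X @ np.array([1.0, 2.0, -3.0]) + 0.1 * rng.normal(size=m)

def gradient_descent(X, y, alpha=0.1, iterations=2000):
    """Iteratively minimize the squared-error loss; needs alpha and many steps."""
    theta = np.zeros(X.shape[1])
    for _ in range(iterations):
        gradient = X.T @ (X @ theta - y) / len(y)
        theta -= alpha * gradient
    return theta

def normal_equation(X, y):
    """Solve (X^T X) theta = X^T y in one step; no alpha, no iterations."""
    return np.linalg.solve(X.T @ X, X.T @ y)

theta_gd = gradient_descent(X, y)
theta_ne = normal_equation(X, y)
print("gradient descent:", theta_gd)
print("normal equation: ", theta_ne)
```

On this small, well-scaled problem the two results agree closely. With a poorly chosen `alpha` or too few iterations, the gradient descent result would differ, while the normal equation result would not change.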