The principle of the normal equation method and how it differs from gradient descent

2022-07-26 22:01:00 【TranSad】

The normal equation method is an alternative to gradient descent for solving multiple linear regression problems. Whereas gradient descent has to update the parameters iteratively, again and again, the normal equation method only needs to solve a system of equations to obtain the optimal result. This article briefly introduces its principle and how it differs from gradient descent.

First, let's look at a picture to set up the problem. The figure below shows a loss function (cost function) J(θ1). Suppose it has only one parameter, θ1. Using the gradient descent formula shown in the figure, we can find a local or global optimum.
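The update rule referred to above can be sketched in a few lines. This is a minimal illustration, not the author's code: the quadratic loss, the learning rate `alpha`, and the step count are all hypothetical choices made for the example.

```python
def gradient_descent_1d(grad, theta=0.0, alpha=0.1, steps=200):
    """Repeat theta := theta - alpha * dJ/dtheta for a fixed number of steps."""
    for _ in range(steps):
        theta = theta - alpha * grad(theta)
    return theta


# Hypothetical loss J(theta1) = (theta1 - 3)^2, whose derivative is 2 * (theta1 - 3).
def dJ(theta1):
    return 2.0 * (theta1 - 3.0)


theta1 = gradient_descent_1d(dJ)
print(theta1)  # close to 3, the point where the derivative is 0
```

Each step moves θ1 against the sign of the derivative, so the iteration settles where the derivative vanishes.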

But gradient descent is not our topic here. The other thing worth noting in the figure above is that, at the optimum, the derivative of J(θ1) with respect to θ1 is 0. Now move to higher dimensions (figure below): there are several parameters, θ0 and θ1, and the optimal solution is at the position indicated by the arrow. Similarly, at that point the partial derivatives of J(θ0, θ1) with respect to all parameters are 0.

First, a little background:

Suppose for now that we have only two parameters to optimize, θ0 and θ1. The line being fitted is then:
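In the common notation for single-feature linear regression, the line can be written as follows (the name h_θ for the hypothesis is an assumed convention, not taken from the original):

```latex
h_\theta(x) = \theta_0 + \theta_1 x
```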

The loss function is:
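A common choice, assumed here, is the mean squared error over the m training samples (x^(i), y^(i)). The factor 1/2 is a convention that simplifies the derivative and does not change the location of the optimum:

```latex
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( \theta_0 + \theta_1 x^{(i)} - y^{(i)} \right)^2
```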

Now let's extend the parameters up to θn. Since the partial derivative of the loss with respect to every parameter is 0 at the optimum, we can write these conditions down first:
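Written out, there is one condition per parameter. The expanded form on the second line is a sketch that assumes a squared-error loss with 1/(2m) scaling over m samples and the convention x_0^{(i)} = 1:

```latex
\begin{align}
\frac{\partial J}{\partial \theta_j} &= 0, \qquad j = 0, 1, \dots, n \\
\frac{1}{m} \sum_{i=1}^{m} \left( \sum_{k=0}^{n} \theta_k x_k^{(i)} - y^{(i)} \right) x_j^{(i)} &= 0, \qquad j = 0, 1, \dots, n
\end{align}
```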

Isn't this just a system of equations? It has n+1 unknowns (θ0 to θn) and n+1 equations. We only need to solve it to obtain the optimal θ. Solving the system and rearranging gives the final form of θ:
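In matrix notation the system and its solution can be sketched as below. The notation is assumed here: X is the m×(n+1) matrix whose i-th row is (1, x_1^{(i)}, …, x_n^{(i)}), y is the vector of the m target values, and X^T X is taken to be invertible.

```latex
X^{T}\left( X\theta - y \right) = 0
\quad\Longrightarrow\quad
X^{T} X \,\theta = X^{T} y
\quad\Longrightarrow\quad
\theta = \left( X^{T} X \right)^{-1} X^{T} y
```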

That completes the idea behind the normal equation method.

Wait a minute! After reading the final result, do you feel that we took a big detour? Indeed, θ is not easy to solve for in the way described above (even though I have written down the solution). There is another, equivalent and more direct route to the same normal equation result.

To show exactly how θ is computed, here is a concrete house price prediction example. It makes the relationship between X, y, and θ clearer and leads to another way of reaching the normal equation result. In this example, we want to use an existing data set to fit a regression function with 5 parameters, θ0 to θ4. Just look at the picture:
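With 5 parameters there are 4 features per house plus a constant column of ones for θ0. Assuming m houses in the data set (m and the feature symbols x_1 to x_4 are placeholders, not values from the original example), the three objects have the following shapes:

```latex
X =
\begin{bmatrix}
1 & x_1^{(1)} & x_2^{(1)} & x_3^{(1)} & x_4^{(1)} \\
1 & x_1^{(2)} & x_2^{(2)} & x_3^{(2)} & x_4^{(2)} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
1 & x_1^{(m)} & x_2^{(m)} & x_3^{(m)} & x_4^{(m)}
\end{bmatrix}
\in \mathbb{R}^{m \times 5},
\qquad
\theta =
\begin{bmatrix}
\theta_0 \\ \theta_1 \\ \theta_2 \\ \theta_3 \\ \theta_4
\end{bmatrix}
\in \mathbb{R}^{5},
\qquad
y =
\begin{bmatrix}
y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(m)}
\end{bmatrix}
\in \mathbb{R}^{m}
```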

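The essence of this idea is: to find the optimal θ, we can simply assume Xθ = y and solve backwards for θ. This is a loose but concise argument. It is loose because, with more samples than parameters, there is usually no θ that satisfies Xθ = y exactly; what the procedure actually produces is the least-squares solution.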

In step ①, to isolate θ we would like to multiply both sides by the inverse of X. But most of the time X is not square and therefore has no inverse, so we multiply both sides on the left by the transpose of X, which produces the square matrix X^T X: X^T X θ = X^T y. Then, in step ②, we move this square matrix to the right-hand side by multiplying both sides by its inverse (assuming it is invertible), which gives the final solution θ = (X^T X)^(-1) X^T y.
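The two steps can be sketched with NumPy as follows. The data is hypothetical (generated from a known θ so the result can be checked), and `normal_equation` is an illustrative helper name, not something from the original article.

```python
import numpy as np


def normal_equation(X, y):
    """Return theta satisfying (X^T X) theta = X^T y."""
    # Step 1: left-multiply both sides of X theta = y by X^T.
    XtX = X.T @ X  # square matrix of shape (n+1, n+1)
    Xty = X.T @ y
    # Step 2: move X^T X to the right-hand side. Solving the linear system
    # gives the same theta as multiplying by the inverse, without forming it.
    return np.linalg.solve(XtX, Xty)


# Hypothetical data: 6 samples, 2 features, generated from y = 1 + 2*x1 + 3*x2.
features = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 5.0],
    [4.0, 3.0],
    [5.0, 8.0],
    [6.0, 4.0],
])
# Prepend a column of ones so that theta0 acts as the intercept.
X = np.column_stack([np.ones(len(features)), features])
y = 1.0 + 2.0 * features[:, 0] + 3.0 * features[:, 1]

theta = normal_equation(X, y)
print(theta)  # approximately [1. 2. 3.]
```

If X^T X is singular (for example, fewer samples than parameters or duplicated features), `np.linalg.solve` raises an error; `np.linalg.pinv(X) @ y` is one common fallback in that case.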

The difference between gradient descent (Gradient Descent) and the normal equation method (Normal Equation)

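1. Gradient descent requires choosing a learning rate α; the normal equation method does not.

2. Gradient descent requires many iterations; the normal equation method does not and solves for θ directly.

3. When the number of features n is large, gradient descent is more suitable, because the matrix operations in the normal equation method become very slow: inverting the (n+1)×(n+1) matrix X^T X costs roughly O(n³).

4. On a non-convex loss, gradient descent can get stuck in a local optimum, whereas the normal equation method yields the optimal solution directly. For linear regression with a squared-error loss the cost is convex, so gradient descent with a suitable learning rate also converges to the global optimum.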
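The contrast in points 1 and 2 can be illustrated on a tiny single-feature problem. This is a sketch under stated assumptions: the data, the learning rate, and the iteration count are hypothetical, and the closed form is the 2×2 special case of the normal equations.

```python
# Hypothetical data generated from y = 1 + 2x, so the optimum is theta0 = 1, theta1 = 2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]
m = len(xs)


def fit_gradient_descent(alpha=0.05, iterations=5000):
    """Iterative fit: needs a learning rate and many passes over the data."""
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Update both parameters simultaneously.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


def fit_normal_equation():
    """Direct fit: solve the 2x2 normal equations, no learning rate, no iterations."""
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    theta1 = (m * sxy - sx * sy) / (m * sxx - sx * sx)
    theta0 = (sy - theta1 * sx) / m
    return theta0, theta1


gd = fit_gradient_descent()
ne = fit_normal_equation()
print(gd, ne)
```

Both routes land on the same parameters here because the squared-error loss of linear regression is convex; the difference lies in how the answer is reached, not in which answer it is.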

Copyright notice
This article was written by [TranSad]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/207/202207262112238296.html
