Understanding and Applying the Least Squares Method

2022-07-03 00:20:00 【TranSad】

The least squares method is both familiar and strange.

In regression problems we often use the least squares method to fit a straight line or a curve to real data points. The fit is obtained by minimizing the sum of the squared differences between the predicted values and the actual values.

It looks so basic and simple that the paragraph above seems to cover it.

But why the sum of squares? Why not the first power, or the third? Is it because first-power differences can be positive or negative and so cannot express an actual distance? Then one could simply take the absolute value... I had never thought about this question carefully. Least squares just seemed to be the most common and classic choice, much like Euclidean distance: a form of expression that everyone happens to like.

In fact, we can trace the origin of the least squares method from the viewpoint of probability and statistics, and in doing so justify it.

Origin of the least squares method

Suppose we have $n$ sample points $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$, and we want to fit them with a straight line:

$$y = wx + b$$

Let us go step by step. First, the intercept $b$ is awkward to carry around as a separate term. A common trick in machine learning is to append a constant component 1 to $x$ and fold $b$ into $w$, so that the line can be written as:

$$y = \theta^{T}\mathbf{x}, \qquad \theta = (w, b)^{T}, \qquad \mathbf{x} = (x, 1)^{T}$$

Here $\theta$ has one more component than the original $w$, namely $b$, and $\mathbf{x}$ has one more component than the original $x$, namely 1. Multiplying them out produces the extra term $1 \cdot b$, which is the original intercept.
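As a minimal sketch of this bookkeeping trick, the following Python compares the two ways of writing the prediction. The function names `predict_line` and `predict_augmented` and the numeric values are illustrative choices, not part of the original article.

```python
def predict_line(w, b, x):
    """Prediction in the original form y = w*x + b."""
    return w * x + b


def predict_augmented(theta, x_aug):
    """Prediction in the augmented form y = theta^T x (a plain dot product)."""
    return sum(t * xi for t, xi in zip(theta, x_aug))


w, b, x = 1.5, -2.0, 4.0   # illustrative values only
theta = [w, b]             # theta absorbs the intercept b
x_aug = [x, 1.0]           # x gains a constant component 1

print(predict_line(w, b, x))            # 4.0
print(predict_augmented(theta, x_aug))  # 4.0
```

Both calls return the same number because the constant 1 in the augmented input multiplies `b` and nothing else.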

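So for each sample point, with $\mathbf{x}_i = (x_i, 1)^{T}$, the prediction $\hat{y}_i$ is:

$$\hat{y}_i = \theta^{T}\mathbf{x}_i$$

The actual value corresponding to $x_i$ is $y_i$. Call the error $y_i - \hat{y}_i$ the error term $\varepsilon_i$. The rest of the analysis starts from this error term.

We now have:

$$y_i = \theta^{T}\mathbf{x}_i + \varepsilon_i$$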

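First, let us state the assumption explicitly: every data point has an error term $\varepsilon_i$, and these error terms follow a normal distribution with mean 0 and standard deviation $\sigma$. Substituting into the normal density, we have:

$$p(\varepsilon_i) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{\varepsilon_i^{2}}{2\sigma^{2}}\right)$$

and, since $\varepsilon_i = y_i - \theta^{T}\mathbf{x}_i$ with $\mathbf{x}_i = (x_i, 1)^{T}$,

$$p(y_i \mid \mathbf{x}_i; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y_i - \theta^{T}\mathbf{x}_i)^{2}}{2\sigma^{2}}\right)$$

From the viewpoint of conditional probability, we want $y_i$ to be as likely as possible given $\mathbf{x}_i$ and $\theta$. This should look familiar: it is where the likelihood function comes in. (I have written up the likelihood function before.)

Taking every one of the $n$ sample points into account (multiplying their densities together, which assumes the error terms are independent), the likelihood function is:

$$L(\theta) = \prod_{i=1}^{n} p(y_i \mid \mathbf{x}_i; \theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y_i - \theta^{T}\mathbf{x}_i)^{2}}{2\sigma^{2}}\right)$$

Now we want the value of $\theta$ that maximizes this expression. The approach is simple. Taking the logarithm, as usual, gives:

$$\ell(\theta) = \log L(\theta) = n \log\frac{1}{\sqrt{2\pi}\,\sigma} - \frac{1}{2\sigma^{2}} \sum_{i=1}^{n} (y_i - \theta^{T}\mathbf{x}_i)^{2}$$

The first term and the positive factor $\frac{1}{2\sigma^{2}}$ do not depend on $\theta$, so maximizing this expression is equivalent to minimizing:

$$J(\theta) = \sum_{i=1}^{n} (y_i - \theta^{T}\mathbf{x}_i)^{2}$$

This is exactly the familiar least squares method.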
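The equivalence can also be checked numerically. The sketch below is only an illustration under stated assumptions: it borrows the three points from the worked example later in the article, assumes a noise standard deviation `sigma = 1.0`, and searches a coarse, hand-picked grid of candidate lines instead of solving anything exactly. The helper names are hypothetical.

```python
import math

# Three sample points, borrowed from the worked example later in the article.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 4.0]
sigma = 1.0  # assumed noise standard deviation


def sum_squared_errors(w, b):
    """Least squares objective for the line y = w*x + b."""
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys))


def log_likelihood(w, b):
    """Gaussian log-likelihood of the data for the line y = w*x + b."""
    n = len(xs)
    const = -n * math.log(math.sqrt(2.0 * math.pi) * sigma)
    return const - sum_squared_errors(w, b) / (2.0 * sigma ** 2)


# Candidate (w, b) pairs on a coarse grid with step 0.1.
candidates = [(i / 10.0, j / 10.0) for i in range(0, 31) for j in range(-20, 21)]

best_by_likelihood = max(candidates, key=lambda p: log_likelihood(*p))
best_by_squares = min(candidates, key=lambda p: sum_squared_errors(*p))

print(best_by_likelihood, best_by_squares)
print(best_by_likelihood == best_by_squares)  # True
```

The candidate with the highest log-likelihood is the same one with the smallest sum of squared errors, because the log-likelihood is a constant minus a positive multiple of that sum.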

Application of the least squares method

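Take again the example of fitting a straight line to $n$ points in the two-dimensional plane. Having decided to use the least squares method, the objective function is:

$$J = \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^{2}$$

Setting the form of the line to $y = wx + b$ and expanding:

$$J(w, b) = \sum_{i=1}^{n} (y_i - w x_i - b)^{2}$$

Taking the partial derivatives with respect to $w$ and $b$ and setting them to zero gives:

$$\frac{\partial J}{\partial w} = -2 \sum_{i=1}^{n} x_i (y_i - w x_i - b) = 0, \qquad \frac{\partial J}{\partial b} = -2 \sum_{i=1}^{n} (y_i - w x_i - b) = 0$$

Solving these two equations simultaneously gives the answer:

$$w = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^{2} - \left(\sum_{i=1}^{n} x_i\right)^{2}}, \qquad b = \frac{1}{n}\sum_{i=1}^{n} y_i - \frac{w}{n}\sum_{i=1}^{n} x_i$$

This is a closed-form solution for fitting a straight line in the two-dimensional plane.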
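The closed-form result (slope equal to n times the sum of x·y minus the product of the sums, divided by n times the sum of x² minus the squared sum of x, and intercept equal to the mean of y minus the slope times the mean of x) can be sketched as a small Python function. The name `fit_line` and the test data are hypothetical; the data were chosen to lie exactly on y = 2x + 1 so the result is easy to check by eye.

```python
def fit_line(xs, ys):
    """Return (w, b) minimizing sum((y - (w*x + b))**2) over the given points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    w = (n * sum_xy - sum_x * sum_y) / denom
    b = sum_y / n - w * sum_x / n
    return w, b


# Hypothetical data lying exactly on y = 2x + 1, so the fit should recover it.
w, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(w, b)  # 2.0 1.0
```

The guard on `denom` covers the degenerate case where every x is the same, in which a vertical line would be needed and the slope formula divides by zero.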

Now let's take a simple concrete example:

In the two-dimensional plane there are three points, (1,1), (2,2), (3,4), and we want to fit a straight line to them.

(Why three points? A single point tells us nothing, and two points determine a line exactly. Only with three or more points do we actually need least squares to fit, so we choose at least three.)

Applying the formulas derived above directly:

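$$w = \frac{3\,(1 \cdot 1 + 2 \cdot 2 + 3 \cdot 4) - (1+2+3)(1+2+4)}{3\,(1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3) - (1+2+3)(1+2+3)} = \frac{51 - 42}{42 - 36} = \frac{3}{2}$$

$$b = \frac{1+2+4}{3} - \frac{3}{2} \cdot \frac{1+2+3}{3} = \frac{7}{3} - 3 = -\frac{2}{3}$$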

So the fitted line is $y = \frac{3}{2}x - \frac{2}{3}$. Substituting $x = 1, 2, 3$ gives $\frac{5}{6}, \frac{7}{3}, \frac{23}{6}$ (about 0.83, 2.33, 3.83), close to the actual values 1, 2, 4, so the fit is good.
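The arithmetic above can be double-checked exactly with Python's `fractions` module. This is only a verification sketch of the worked example; the variable names are illustrative.

```python
from fractions import Fraction

points = [(1, 1), (2, 2), (3, 4)]
n = len(points)

# Exact sums, so no rounding error enters the check.
sum_x = sum(Fraction(x) for x, _ in points)
sum_y = sum(Fraction(y) for _, y in points)
sum_xy = sum(Fraction(x * y) for x, y in points)
sum_xx = sum(Fraction(x * x) for x, _ in points)

w = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
b = sum_y / n - w * sum_x / n
print(w, b)  # 3/2 -2/3

# For each point: x, actual y, predicted y, residual.
for x, y in points:
    predicted = w * x + b
    print(x, y, predicted, y - predicted)
```

The residuals come out as 1/6, -1/3, and 1/6. They sum to zero, which is what setting the partial derivative with respect to the intercept to zero requires.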

The above is a relatively simple scenario in which a fairly uncomplicated closed-form result can be applied directly. Similarly, we can use the least squares method to fit a quadratic curve: instead of $f(x) = wx + b$ we set $f(x) = wx^{2} + bx + c$, take the partial derivatives with respect to $w$, $b$, and $c$, and solve the resulting equations. In other words, by choosing different forms of $f(x)$, the least squares method yields different fitted curves. (Of course, when the situation is more complicated, it can be hard to get a usable answer by solving equations. In that case we use gradient descent to optimize directly and approximate the result.)
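To illustrate the gradient descent route on a case where the exact answer is known, here is a minimal sketch that fits a straight line to the three points (1,1), (2,2), (3,4). The starting point, the step size `learning_rate = 0.01`, and the iteration count are assumptions picked by hand for this tiny data set, not values from the original article.

```python
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 4.0)]


def gradients(w, b):
    """Partial derivatives of sum((y - (w*x + b))**2) with respect to w and b."""
    grad_w = sum(-2.0 * x * (y - (w * x + b)) for x, y in points)
    grad_b = sum(-2.0 * (y - (w * x + b)) for x, y in points)
    return grad_w, grad_b


w, b = 0.0, 0.0        # arbitrary starting point
learning_rate = 0.01   # assumed step size, small enough to converge on this data
for _ in range(20000):
    grad_w, grad_b = gradients(w, b)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w, 4), round(b, 4))  # 1.5 -0.6667
```

The iteration approaches the closed-form values w = 3/2 and b = -2/3. The same loop works for other forms of f(x) once `gradients` is replaced with the matching partial derivatives, though the step size usually has to be retuned.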

To sum up, this article walks through the origin, the calculation, and the application of the least squares method. It also serves as a convenient reference for reviewing old material and learning something new from it.

Copyright notice
This article was written by [TranSad]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202151142593475.html
