Introduction to deep learning (II) -- univariate linear regression

2022-07-03 05:07:00 【TT ya】

I am a beginner and I am writing this up as study notes to record what I have learned. I hope it also helps others who are just starting out, and I welcome corrections from more experienced readers. If anything here infringes a copyright, let me know and I will delete it.

One. Notation

m: the number of samples in the training set;
x: the input variable (feature);
y: the output variable (target variable), that is, the value we want to predict;
(x, y): a single training sample;
$(x^{(i)}, y^{(i)})$: the i-th training sample. The superscript (i) is only an index that refers to the i-th row of the training set; it is not the i-th power of x or y;
h: the hypothesis function. It takes x as input and outputs a predicted y, so h is a mapping from x to y.

Two. Linear regression model (h(x))

The model formula of univariate linear regression:

$$h(x) = \theta_0 + \theta_1 x$$
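As a small illustration, the hypothesis can be written as a one-line function. This is only a sketch, assuming the standard univariate form h(x) = θ0 + θ1·x; the function name `hypothesis` and the sample numbers are made up for the example.

```python
def hypothesis(theta0, theta1, x):
    """Predict y for input x with the univariate linear model h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


# With intercept 1 and slope 2, the input x = 3 gives 1 + 2 * 3.
print(hypothesis(1.0, 2.0, 3.0))  # 7.0
```

Here theta0 is the intercept of the line and theta1 is its slope.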

Three. Cost function

In linear regression we have a training set. What we need to do is find θ0 and θ1 so that the straight line represented by our hypothesis function fits the data points as well as possible. But how do we choose θ0 and θ1? The idea is to choose the parameters θ0 and θ1 for which h(x), the value we predict for input x, is as close as possible to the corresponding y.

Put more abstractly: linear regression is a minimization problem. We write down an expression in θ0 and θ1 and minimize it, so that the difference between h(x) and y is as small as possible.

The formula of the cost function (squared error function):

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right)^2$$

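Without the 1/2, this is simply the mean of the squared errors between the predictions and the targets. The 1/2 is added for convenience of calculation: it cancels the 2 that appears when the square is differentiated.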
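The cost can be computed directly from its definition. The sketch below assumes the squared-error cost with the 1/(2m) factor and the hypothesis h(x) = θ0 + θ1·x; the function name `cost` and the tiny data set are invented for illustration.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J(theta0, theta1) with the 1/(2m) factor."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        error = (theta0 + theta1 * x) - y  # h(x) - y for one sample
        total += error * error
    return total / (2 * m)


# Hypothetical training set that lies exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(cost(0.0, 2.0, xs, ys))  # perfect fit, so the cost is 0.0
print(cost(0.0, 1.0, xs, ys))  # errors -1, -2, -3 give (1 + 4 + 9) / 6
```

A line that fits the data better gives a smaller J, which is why minimizing J is a reasonable way to choose θ0 and θ1.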

With this, our goal becomes:

$$\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$$

How to find θ0 and θ1 is the key.

Four. Gradient descent

Let's look at the following figure first; it will help with what comes next.

Suppose we start at point A and want to find the direction in which J decreases fastest (the direction of gradient descent). It is like trying to get down a mountain as quickly as possible, and you would follow the black path in the figure. (Note: a different starting point can lead to a different path.)

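The specific formula:

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) \qquad (j = 0, 1)$$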

Here := denotes assignment and α is the learning rate (that is, how large a step we take on the way down the mountain).
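For univariate linear regression the partial derivatives in the update can be worked out explicitly. The following is a sketch of the result, assuming h(x) = θ0 + θ1·x and the squared-error cost J with the 1/(2m) factor; the 2 produced by differentiating the square cancels the 1/2:

$$\frac{\partial J}{\partial \theta_0} = \frac{1}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right)$$

$$\frac{\partial J}{\partial \theta_1} = \frac{1}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right) x^{(i)}$$

so one step of gradient descent is

$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right), \qquad \theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right) x^{(i)}$$

Both right-hand sides are evaluated with the old values of θ0 and θ1, and only then are the two parameters overwritten.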

If α is too small, it takes a long time to get down the mountain. If α is too large, gradient descent may overshoot the lowest point and may even fail to converge.
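The effect of the learning rate can be tried out on a toy problem. The code below is a minimal sketch of batch gradient descent, assuming h(x) = θ0 + θ1·x, the 1/(2m) squared-error cost, and a starting point of (0, 0). The function names, the data set, and the two values of α are chosen only for illustration.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J with the 1/(2m) factor."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha, iterations):
    """Batch gradient descent for h(x) = theta0 + theta1 * x, starting from (0, 0)."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Prediction errors h(x) - y under the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients were computed from the old values.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical training set generated from y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

# A moderate learning rate approaches theta0 = 1, theta1 = 2.
good = gradient_descent(xs, ys, alpha=0.05, iterations=5000)
print("alpha=0.05:", good, "cost:", cost(*good, xs, ys))

# A learning rate that is too large for this data overshoots on every step,
# so after a few iterations the cost is larger than where it started.
start_cost = cost(0.0, 0.0, xs, ys)
bad = gradient_descent(xs, ys, alpha=0.5, iterations=20)
print("alpha=0.5: ", bad, "cost:", cost(*bad, xs, ys), "start cost:", start_cost)
```

Which values of α count as too large depends on the data, so in practice it is common to try a few values and watch whether J goes down from one iteration to the next.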

Gradient descent can minimize not only the linear regression cost function J but other functions as well.

Gradient descent is not the only way to minimize the cost function. There is another algorithm, the normal equation method, but gradient descent is better suited to large data sets.

For univariate linear regression, the idea applied here is the method of least squares.
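For a single input variable, the least-squares solution mentioned above can be written in closed form, with no iterations and no learning rate. The sketch below uses the standard formulas θ1 = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and θ0 = ȳ − θ1·x̄; the function name `fit_closed_form` and the data are hypothetical.

```python
def fit_closed_form(xs, ys):
    """Least-squares fit of h(x) = theta0 + theta1 * x without iteration."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    # Sum of products of deviations, and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    theta1 = sxy / sxx  # requires at least two distinct x values
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1


# Hypothetical training set generated from y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
print(fit_closed_form(xs, ys))  # (1.0, 2.0)
```

On a small data set like this the closed form is the simpler choice; gradient descent becomes attractive when the data set is large.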

Copyright notice
This article was written by [TT ya]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202150625145636.html