Introduction to deep learning (II) -- univariate linear regression
2022-07-03 05:07:00 【TT ya】
I'm a beginner. I'm writing this as notes to record what I've learned, and I hope it also helps others who are just starting out. Corrections from more experienced readers are welcome. If anything here infringes on someone's rights, I will delete it.
1. Notation
- m: the number of samples in the training set;
- x: the input variable (feature);
- y: the output variable (target variable), that is, the value we want to predict;
- (x, y): a single training sample;
- (x(i), y(i)): the i-th training sample. The superscript (i) on x and y is only an index that refers to the i-th row of the training set; it is not the i-th power of x or y;
- h: the hypothesis function. It takes x as input and outputs a prediction for y, so h is a mapping from x to y.
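As a small illustration of this notation, here is how a training set might be stored in Python. This is only a sketch: the data values and the names `xs`, `ys` and `sample` are made up for the example. Note that Python lists are 0-indexed, while the superscript (i) starts from 1.

```python
# Hypothetical toy training set (values are made up for illustration).
xs = [1.0, 2.0, 3.0, 4.0]   # inputs x
ys = [3.1, 4.9, 7.2, 8.8]   # targets y

m = len(xs)                 # m: number of training samples


def sample(i):
    """Return the i-th training sample (x(i), y(i)), with i starting at 1."""
    return xs[i - 1], ys[i - 1]


print(m)           # 4
print(sample(2))   # (2.0, 4.9)
```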
2. Linear regression model (h(x))
The model for univariate linear regression is a straight line with two parameters, Ɵ0 and Ɵ1:
$$h_\theta(x) = \theta_0 + \theta_1 x$$
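A minimal Python sketch of this hypothesis, assuming the usual straight-line form h(x) = Ɵ0 + Ɵ1·x. The function name `h` follows the notation above; `theta0` and `theta1` are illustrative parameter names.

```python
def h(x, theta0, theta1):
    """Hypothesis: predict y for input x with a straight line."""
    return theta0 + theta1 * x


# With theta0 = 1 and theta1 = 2, the line is y = 1 + 2x.
print(h(3.0, 1.0, 2.0))  # 7.0
```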
3. Cost function
In linear regression we have a training set, and we need to find Ɵ0 and Ɵ1 so that the straight line given by our hypothesis function fits the data points as well as possible. But how do we choose Ɵ0 and Ɵ1? The idea is to choose the parameters for which h(x), our prediction for input x, is as close as possible to the corresponding value y.
Put more abstractly: linear regression is a minimization problem. We write down an expression in Ɵ0 and Ɵ1 that measures the difference between h(x) and y, and then make that difference as small as possible.
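The cost function (squared error function) is:
$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
Without the factor 1/2, this is just the mean squared error. The 1/2 is included for convenience of calculation: it cancels the 2 that appears when the square is differentiated.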
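The cost can be computed directly from its definition. The sketch below assumes the straight-line hypothesis h(x) = Ɵ0 + Ɵ1·x and the cost J = 1/(2m) times the sum of squared errors; the function name `cost` and the toy data are illustrative.

```python
def cost(xs, ys, theta0, theta1):
    """Squared error cost: 1/(2m) times the sum of squared prediction errors."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        prediction = theta0 + theta1 * x   # h(x)
        total += (prediction - y) ** 2
    return total / (2 * m)


# Toy data lying exactly on y = 1 + 2x (made up for illustration).
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]

print(cost(xs, ys, 1.0, 2.0))  # 0.0, because the line fits perfectly
print(cost(xs, ys, 0.0, 0.0))  # (1 + 9 + 25) / (2 * 3), about 5.83
```

The better the line fits the data, the smaller the cost, which is why minimizing J is a sensible way to choose the parameters.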
With this, our goal becomes:
$$\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$$
The key question is how to find Ɵ0 and Ɵ1.
4. Gradient descent
The following figure helps with the intuition behind gradient descent.
Suppose we start at point A and want to find the direction in which the value of J decreases fastest (the direction of gradient descent). This is like trying to get down a mountain as quickly as possible, in which case we would follow the black path in the figure. (Note: a different starting point may lead to a different path.)
The update rule is:
$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) \quad (j = 0, 1)$$
Here := denotes assignment, and α is the learning rate (that is, how large a step we take on the way down the mountain).
If α is too small, it takes a long time to get down the mountain. If α is too large, gradient descent may overshoot the lowest point, and it may even fail to converge.
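A minimal sketch of batch gradient descent for univariate linear regression. It assumes the hypothesis h(x) = Ɵ0 + Ɵ1·x and the squared error cost with the 1/(2m) factor, whose partial derivatives are the mean error (for Ɵ0) and the mean of error times x (for Ɵ1). The function name `gradient_descent`, the default values of `alpha` and `iterations`, and the toy data are all assumptions made for this example.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Fit theta0 and theta1 by batch gradient descent on the squared error cost."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0   # arbitrary starting point
    for _ in range(iterations):
        # Prediction error h(x) - y for every training sample.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                              # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m   # dJ/dtheta1
        # Simultaneous update: both gradients were computed from the old values.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Toy data lying exactly on y = 1 + 2x (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

theta0, theta1 = gradient_descent(xs, ys)
print(round(theta0, 3), round(theta1, 3))  # 1.0 2.0
```

On this toy data, calling `gradient_descent(xs, ys, alpha=1.0, iterations=20)` shows the effect of a learning rate that is too large: each step overshoots the minimum and the parameters grow instead of settling.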
Gradient descent is not limited to minimizing the cost function J of linear regression; it can also be used to minimize other functions.
Gradient descent is not the only way to minimize the cost function. There is another algorithm, the normal equation method, but gradient descent is better suited to large data sets.
For univariate linear regression, the underlying idea here is the method of least squares.
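For comparison with the iterative approach, the univariate least-squares problem also has a well-known closed-form solution, which is what solving the normal equation reduces to when there is a single feature. The sketch below uses that closed form; the function name `fit_closed_form` and the toy data are illustrative.

```python
def fit_closed_form(xs, ys):
    """Least-squares line fit in closed form (no iterations, no learning rate)."""
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    theta1 = sxy / sxx                  # slope; needs at least two distinct x values
    theta0 = y_mean - theta1 * x_mean   # intercept
    return theta0, theta1


# Toy data lying exactly on y = 1 + 2x (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

print(fit_closed_form(xs, ys))  # (1.0, 2.0)
```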