Introduction to deep learning (II) -- univariate linear regression
2022-07-03 05:07:00 【TT ya】
I'm a beginner here; I hope these notes both record what I've learned and help other newcomers. Corrections from experienced readers are welcome. (Content will be removed on request if it infringes any rights.)
I. Notation
- m: the number of examples in the training set;
- x: the input variable (the feature);
- y: the output variable (the target variable), i.e., the value we predict;
- (x, y): a single training example;
- (x^(i), y^(i)): the i-th training example. The superscript (i) is simply an index — it refers to the i-th row of the training set, not the i-th power of x or y;
- h: the hypothesis function. It takes an input x and outputs a predicted y; h is a function mapping from x to y.
II. Linear regression model (h(x))
The model (hypothesis) of univariate linear regression:

h(x) = θ0 + θ1 · x
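As a minimal sketch (the function and parameter names are my own), this hypothesis is just the equation of a line:

```python
def h(theta0, theta1, x):
    """Hypothesis of univariate linear regression:
    a line with intercept theta0 and slope theta1."""
    return theta0 + theta1 * x
```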
III. Cost function
In linear regression we have a training set, and our task is to choose θ0 and θ1 so that the line represented by the hypothesis fits the data points as well as possible. But how do we choose θ0 and θ1? The idea is to pick the values that make h(x) — our prediction for input x — as close as possible to the corresponding y.
More abstractly: linear regression is a minimization problem. We write down a minimization objective over θ0 and θ1 that makes the difference between h(x) and y as small as possible.
The cost function (squared error function):

J(θ0, θ1) = (1/2m) · Σ_{i=1}^{m} (h(x^(i)) − y^(i))^2

Without the factor of 1/2 this is just the mean squared error; the 1/2 is included only to simplify the derivative.
Our goal thus becomes: minimize J(θ0, θ1) over θ0 and θ1.
How to find such θ0 and θ1 is the key question.
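The cost function above can be sketched in plain Python (variable names are my own choice):

```python
def cost(theta0, theta1, xs, ys):
    """Squared error cost J(theta0, theta1) = (1/2m) * sum((h(x_i) - y_i)^2)."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)
```

A perfect fit gives a cost of zero; any deviation of the predictions from the targets increases it.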
IV. Gradient descent
An intuition first: imagine standing at a point A on the surface of J and looking for the direction in which J decreases fastest (the direction of the negative gradient). It is like trying to get down a mountain as quickly as possible. Note that different starting points can lead to different descent paths.
The update rule — repeat until convergence, for j = 0 and j = 1 updated simultaneously:

θj := θj − α · ∂J(θ0, θ1)/∂θj

where := denotes assignment and α is the learning rate (how big a step we take downhill).
If α is too small, it takes a long time to walk down the mountain; if α is too large, gradient descent may overshoot the minimum and may even fail to converge.
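The loop above can be sketched as a minimal batch gradient descent (my own function and variable names), using the standard partials of the squared error cost, ∂J/∂θ0 = (1/m)·Σ(h(x)−y) and ∂J/∂θ1 = (1/m)·Σ(h(x)−y)·x:

```python
def gradient_descent(xs, ys, alpha=0.1, iters=1000):
    """Batch gradient descent for univariate linear regression.

    Minimizes J(theta0, theta1) = (1/2m) * sum((h(x) - y)^2).
    Both parameters are updated simultaneously each iteration.
    """
    theta0 = theta1 = 0.0
    m = len(xs)
    for _ in range(iters):
        # residuals h(x) - y, computed with the *current* thetas
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # simultaneous update: both gradients use the old thetas
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1
```

Note that both gradients are computed before either parameter is changed; updating θ0 first and then reusing it when computing the gradient for θ1 would be the common "non-simultaneous update" bug the lecture notation warns about.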
Gradient descent can minimize not only the linear regression cost J but other functions as well.
It is also not the only way to minimize the cost function — the normal equation method gives a closed-form solution — but gradient descent scales better to large datasets.
For univariate linear regression, the underlying idea is the method of least squares.
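For comparison, the closed-form least-squares fit (the normal equation specialized to a single feature) can be sketched as follows; the helper name is my own:

```python
def least_squares(xs, ys):
    """Closed-form least-squares fit for y = theta0 + theta1 * x."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    # slope = covariance(x, y) / variance(x)
    theta1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
              / sum((x - mean_x) ** 2 for x in xs))
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1
```

Unlike gradient descent, this needs no learning rate and no iteration, but the general matrix form becomes expensive when the number of features is very large.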