Learning deep learning with Li Mu: linear regression and basic optimization function

2022-07-28 19:18:00 【Cug- Wu Yanzu】

Linear regression and basic optimization function

Linear regression
Basic optimization function

Linear regression

Background

Li Mu introduces the topic with house buying in the United States, where buyers have to bid for a house. Given what others are offering, how much should I bid? Answering that means predicting the price, and this is where linear regression comes in.

Simplified model

Assume a simplified model in which the house price depends on the number of rooms, the number of bathrooms, and the living area. The goal of linear regression is then to find the weight w in front of each input x, together with the constant b.
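Written out, the simplified model is a weighted sum of the three inputs plus a constant. The symbol names below are assumptions for illustration: $x_1$, $x_2$, $x_3$ stand for the number of rooms, the number of bathrooms, and the living area, and $y$ is the price.

```latex
\[
y = w_1 x_1 + w_2 x_2 + w_3 x_3 + b
\]
```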

Vector version

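Generalize the input to an n-dimensional vector, and the linear model can be written in vector form: the prediction is the inner product of the weight vector w and the input vector x, plus the constant b.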
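As a minimal sketch of the vector form, the prediction can be computed as an inner product plus a bias. The function name `predict` and the sample numbers are hypothetical, chosen only for illustration, and plain Python lists stand in for tensors.

```python
def predict(w, x, b):
    """Linear model in vector form: inner product of w and x, plus bias b."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b


# Hypothetical weights, inputs, and bias for a 3-dimensional example
y_hat = predict([1, 2, 3], [4, 5, 6], 7)
print(y_hat)
```

The same function works for any dimension n, which is the point of moving from the three-input model to the vector version.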

Single layer neural network

The linear model can be viewed as a single-layer neural network: the inputs are the features x, the output o is the house price, and each connection from an input to the output carries the corresponding weight.

Loss function

How do we measure the gap between the true value and the value the model estimates during training? We need a loss function to express it. A commonly used loss function is the squared loss.
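One common way to write the squared loss is shown here, where $y$ is the true value and $\hat{y}$ is the estimate. The factor $\frac{1}{2}$ is an assumed convention, used because it cancels when the loss is differentiated.

```latex
\[
\ell(y, \hat{y}) = \frac{1}{2} \left( y - \hat{y} \right)^2
\]
```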

Collection of training data

Collect some data points to use as training samples. Generally, the more the better, since more data lets us fit the parameters more reliably.

Parameter learning

Next we average the loss between the predicted values and the true values over all training samples; this average is the training loss. The goal of learning is to find the parameters that minimize it.
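The averaging step can be sketched as follows, assuming the squared loss with a factor of 1/2 and plain Python lists for the data. The function names and the tiny dataset are hypothetical.

```python
def squared_loss(y_hat, y):
    """Squared loss for one sample (the 1/2 factor is an assumed convention)."""
    return 0.5 * (y_hat - y) ** 2


def mean_squared_loss(X, y, w, b):
    """Average squared loss of the linear model <w, x> + b over all samples."""
    preds = [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b for x in X]
    return sum(squared_loss(p, t) for p, t in zip(preds, y)) / len(y)


# Hypothetical data generated by y = 1*x1 + 2*x2 with no bias
X = [[1.0, 2.0], [3.0, 4.0]]
y = [5.0, 11.0]
print(mean_squared_loss(X, y, [1.0, 2.0], 0.0))  # the true parameters
print(mean_squared_loss(X, y, [1.0, 2.0], 1.0))  # a bias that is off by one
```

Training amounts to searching for the `w` and `b` that make this number as small as possible.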

Explicit solution of the linear model

Because this is a linear model with a squared loss, the loss is a convex function of the parameters, so an optimal solution exists and can be written down explicitly. In machine learning, though, we rarely care about models that have explicit solutions; the problems of real interest are NP-hard ones, where no such solution is available.
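A sketch of the derivation, under these assumptions: the bias is absorbed into the weights by appending a column of ones to the data matrix $\mathbf{X}$, the loss is the average squared loss with a factor of $\frac{1}{2}$, there are $n$ samples, and $\mathbf{X}^\top \mathbf{X}$ is invertible.

```latex
\[
\ell(\mathbf{X}, \mathbf{y}, \mathbf{w}) = \frac{1}{2n} \left\| \mathbf{y} - \mathbf{X}\mathbf{w} \right\|^2
\]
\[
\frac{\partial \ell}{\partial \mathbf{w}} = \frac{1}{n} \left( \mathbf{X}\mathbf{w} - \mathbf{y} \right)^\top \mathbf{X} = 0
\quad \Longrightarrow \quad
\mathbf{w}^* = \left( \mathbf{X}^\top \mathbf{X} \right)^{-1} \mathbf{X}^\top \mathbf{y}
\]
```

Because the loss is convex, the point where the gradient is zero is the global minimum.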

Summary

[Figure: summary slide]

Basic optimization function

Gradient descent

Pick a random initial value for the parameters, then repeatedly compute the gradient of the loss function with respect to w and step in the opposite direction (the negative gradient is the direction of fastest decrease at the current point). $\eta$ is the learning rate, i.e. the step size. It is a hyperparameter. It should not be too small: every step requires computing a gradient, which is the most expensive part of machine learning, so we want to take as few steps as possible. It should not be too large either: steps that are too long overshoot the minimum, so the loss oscillates instead of converging.
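The update rule can be written as below. The step index $t$ is an assumed notation: $\mathbf{w}_{t-1}$ is the parameter value before the step and $\mathbf{w}_t$ the value after it.

```latex
\[
\mathbf{w}_t = \mathbf{w}_{t-1} - \eta \, \frac{\partial \ell}{\partial \mathbf{w}_{t-1}}
\]
```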

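Computing the loss over the whole training set at every step is too expensive, so instead we randomly sample b examples and use them to approximate the loss. This is called minibatch stochastic gradient descent. In deep learning, the loss is almost always computed on one batch at a time rather than on the entire dataset.

Similarly, the batch size is a hyperparameter that should be neither too large nor too small. Too small a batch cannot make good use of parallel hardware; too large a batch costs more memory and wastes computation on redundant samples.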
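A minimal sketch of minibatch stochastic gradient descent for the linear model is shown here. It is written in plain Python rather than PyTorch so that every step is visible, and it is not the lecture's implementation. The function name, the default hyperparameters, and the synthetic data (true weights 2.0 and -3.4, bias 4.2, no noise) are assumptions for illustration.

```python
import random


def minibatch_sgd(X, y, lr=0.05, batch_size=10, num_epochs=100, seed=0):
    """Fit y ~ <w, x> + b by minibatch SGD on the squared loss 0.5 * (y_hat - y)**2."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [rng.gauss(0.0, 0.01) for _ in range(d)]  # small random initial weights
    b = 0.0
    indices = list(range(n))
    for _ in range(num_epochs):
        rng.shuffle(indices)  # draw different random batches in every epoch
        for start in range(0, n, batch_size):
            batch = indices[start:start + batch_size]
            grad_w = [0.0] * d
            grad_b = 0.0
            for i in batch:
                # Prediction error of the current model on sample i
                err = sum(w_j * x_j for w_j, x_j in zip(w, X[i])) + b - y[i]
                for j in range(d):
                    grad_w[j] += err * X[i][j] / len(batch)
                grad_b += err / len(batch)
            # Step against the batch-averaged gradient, scaled by the learning rate
            w = [w_j - lr * g_j for w_j, g_j in zip(w, grad_w)]
            b -= lr * grad_b
    return w, b


# Hypothetical noise-free synthetic data
data_rng = random.Random(1)
true_w, true_b = [2.0, -3.4], 4.2
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(100)]
y = [true_w[0] * x[0] + true_w[1] * x[1] + true_b for x in X]

w, b = minibatch_sgd(X, y)
print(w, b)
```

Each step only touches `batch_size` samples, which is what keeps the cost per update small compared with using the whole dataset.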

detach

In PyTorch, a tensor that is part of the computation graph (one that requires gradients) has to be detached from the graph with detach() before it can be converted to a NumPy array, for example y.detach().numpy().

yield

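yield turns the data-loading function into a Python generator: each request returns one batch of x and y, and the function resumes where it left off on the next request, until all the data has been returned.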
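A minimal sketch of such a generator is shown here, using plain Python lists instead of tensors. The name `data_iter` and its parameters are assumptions for illustration, not necessarily the signature used in the video.

```python
import random


def data_iter(batch_size, features, labels, seed=None):
    """Yield (x_batch, y_batch) pairs in random order, visiting every sample once."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch = indices[start:start + batch_size]
        # Execution pauses at yield and resumes here on the next request
        yield [features[i] for i in batch], [labels[i] for i in batch]


# Hypothetical data: 10 samples with one feature each
features = [[float(i)] for i in range(10)]
labels = list(range(10))
for x_batch, y_batch in data_iter(4, features, labels, seed=0):
    print(len(x_batch), y_batch)
```

Because the batches are produced lazily, only one batch has to be materialized at a time.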

epoch

One epoch is one full pass over the entire dataset; the number of epochs is how many times training scans through all the data.

with keyword

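A with statement plays the role of try-finally: the setup code runs on entry, and the cleanup code is guaranteed to run on exit, even if an exception is raised. In this code it marks the testing part.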
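The correspondence between with and try-finally can be sketched with a small context manager. The class `Resource` and both helper functions are hypothetical, and the manual version is simplified: a real with statement also passes the exception details to `__exit__`.

```python
class Resource:
    """Hypothetical context manager that records when it is entered and exited."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.log.append("exit")
        return False  # do not swallow the exception


def run_with(log):
    try:
        with Resource(log):
            log.append("body")
            raise ValueError("simulated failure")
    except ValueError:
        log.append("handled")


def run_try_finally(log):
    resource = Resource(log)
    resource.__enter__()
    try:
        try:
            log.append("body")
            raise ValueError("simulated failure")
        finally:
            # Cleanup runs even though the body raised
            resource.__exit__(None, None, None)
    except ValueError:
        log.append("handled")


log_with, log_manual = [], []
run_with(log_with)
run_try_finally(log_manual)
print(log_with)
print(log_manual)
```

Both versions record the same sequence of events, with the cleanup step running before the exception reaches the handler.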

The video also covers an implementation from scratch and a version implemented with the framework's built-in tools. The important thing is to understand the overall process: generating the dataset, defining the weights and parameters of the network model, defining the loss function, setting the hyperparameters, training the model, and finally printing the loss. Once this process is clear, every variation follows the same pattern.

Original site

Copyright notice
This article was written by [Cug- Wu Yanzu]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/196/202207130615560778.html