Learning deep learning with Li Mu: linear regression and basic optimization algorithms
2022-07-28 19:18:00 【Cug- Wu Yanzu】
Linear regression and basic optimization algorithms
Linear regression
Background
Li Mu opens with a house-buying example: in the United States, buying a house usually means bidding for it. Given what other people are offering, how much should I offer? To decide, we need to predict the price, and that is a job for linear regression.
Simplified model
Assume a simplified model in which the house price depends on the number of bedrooms, the number of bathrooms, and the living area. The goal of linear regression is then to find the weight w in front of each input x, plus the constant offset b.
Vector version
Extend the input to an n-dimensional vector x. The linear model can then be written in vector form: the inner product of w and x, plus b.
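As a minimal sketch of the vector form, the prediction is just a dot product plus a bias. The function name `predict` and all the numbers below are made up for illustration; they are not from the lecture.

```python
def predict(x, w, b):
    """Linear model: y_hat = <w, x> + b for one n-dimensional input."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Hypothetical house: 3 bedrooms, 2 bathrooms, 120 units of living area.
features = [3.0, 2.0, 120.0]
weights = [10.0, 5.0, 2.0]  # made-up weights, one per feature
bias = 50.0                 # made-up constant offset

price = predict(features, weights, bias)
print(price)  # 330.0
```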
Single-layer neural network
The linear model can be viewed as a single-layer neural network: the inputs are the features x, the output is the house price o, and each connecting line carries the weight of the corresponding input.
Loss function
How do we measure the gap between the true value and the predicted value? We need a loss function for that. A commonly used one is the squared loss.
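A small sketch of the squared loss for a single example. The factor 1/2 is a common convention (it cancels when taking the derivative) and is an assumption here, as is the function name.

```python
def squared_loss(y, y_hat):
    """Squared loss for one example: 0.5 * (y - y_hat)^2."""
    return 0.5 * (y - y_hat) ** 2


# True value 3.0, prediction 2.5: the loss grows with the squared gap.
print(squared_loss(3.0, 2.5))  # 0.125
print(squared_loss(3.0, 3.0))  # 0.0, a perfect prediction costs nothing
```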
Collection of training data
Collect data points to use as training samples. In general, the more the better, because more data lets us fit the parameters more accurately.
Parameter learning
We then compute the average loss between the predicted values and the true values over all training samples; this is the training loss. The goal is to find the parameters that minimize it.
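Written out, the training objective looks roughly like this. The notation is an assumption: n samples, inputs x_i, labels y_i, and the squared loss with its conventional factor 1/2.

```latex
\ell(\mathbf{X}, \mathbf{y}, \mathbf{w}, b)
  = \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - \langle \mathbf{x}_i, \mathbf{w} \rangle - b \right)^2,
\qquad
\mathbf{w}^*, b^* = \arg\min_{\mathbf{w},\, b} \; \ell(\mathbf{X}, \mathbf{y}, \mathbf{w}, b)
```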
Explicit solution of the linear model
Because the model is linear, the squared loss is a convex function of the parameters, so an optimal solution exists and can be written down explicitly. In machine learning, though, models with an explicit solution are not the interesting ones; the problems people really care about are NP-hard and have no closed-form answer.
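For intuition, here is a sketch of the explicit solution in the simplest case of one feature plus a bias, where the least-squares optimum reduces to two short formulas. The restriction to one feature, the function name and the toy data are assumptions for illustration; the general case solves the same problem with matrices.

```python
def fit_closed_form(xs, ys):
    """Least-squares fit of y = w * x + b for a single feature."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # w = covariance(x, y) / variance(x); b makes the line pass through the means.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; w is not determined")
    w = sxy / sxx
    b = y_mean - w * x_mean
    return w, b


# Toy data generated from y = 2x + 1 without noise.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_closed_form(xs, ys)
print(w, b)  # 2.0 1.0
```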
summary

Basic optimization algorithms
Gradient descent
Pick the initial parameter values at random, then repeatedly compute the gradient of the loss function with respect to w and step in the opposite direction (the negative gradient is the direction of fastest decrease at the current point). η is the learning rate, i.e. the step size. It is a hyperparameter. Too small is not good: many steps are needed, and computing gradients is the most expensive part of machine learning, so we want to do it as few times as possible. Too large is not good either: the updates overshoot and oscillate, and may never converge.
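A minimal sketch of gradient descent on the one-feature linear model, using the update w ← w − η · gradient. The toy data, the fixed starting point (chosen instead of a random one so the run is reproducible) and the specific learning rates are assumptions for illustration.

```python
def gradient_descent(xs, ys, lr=0.1, steps=1000):
    """Full-batch gradient descent on the mean squared loss of y = w * x + b."""
    w, b = 0.0, 0.0  # fixed start for reproducibility; the text starts at random
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of (1 / 2n) * sum(error^2) with respect to w and b.
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        # Step against the gradient, scaled by the learning rate.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 without noise.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = gradient_descent(xs, ys, lr=0.1, steps=1000)
print(w, b)  # close to 2 and 1

# With a learning rate that is too large for this data, the iterates blow up.
w_bad, b_bad = gradient_descent(xs, ys, lr=0.6, steps=20)
print(w_bad, b_bad)
```

On this data the small learning rate settles near the true parameters, while the large one overshoots further on every step.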

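Here we randomly sample b examples and use them to approximate the loss. This is called minibatch stochastic gradient descent. Throughout deep learning, the loss is computed on one batch at a time, not on the whole training set.
Like the learning rate, the batch size should be neither too large nor too small: very small batches make poor use of parallel hardware, while very large batches cost more memory and computation per step.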
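The same idea as a sketch: each update uses only a small random batch instead of the whole dataset. The toy data, the function name, the default hyperparameters and the fixed seed are all assumptions chosen so the example is small and reproducible.

```python
import random


def minibatch_sgd(xs, ys, batch_size=2, lr=0.05, num_epochs=1000, seed=0):
    """Minibatch SGD on the squared loss of y = w * x + b."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(num_epochs):
        rng.shuffle(indices)  # visit the samples in a new random order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            errors = [w * xs[i] + b - ys[i] for i in batch]
            # Gradient estimated from this batch only, not the full dataset.
            grad_w = sum(e * xs[i] for e, i in zip(errors, batch)) / len(batch)
            grad_b = sum(errors) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 without noise.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = minibatch_sgd(xs, ys)
print(w, b)  # close to 2 and 1
```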
detach
In PyTorch, a tensor that is part of the computation graph (one that requires gradients) has to be detached from the graph with detach() before it can be converted to a NumPy array.
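A short sketch of that rule, assuming PyTorch and NumPy are installed. The tensor values are arbitrary.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2  # y is now part of the computation graph

# Calling y.numpy() directly raises a RuntimeError because y requires grad.
# detach() returns a tensor with the same data that is cut off from the graph.
y_np = y.detach().numpy()
print(type(y_np), y_np)
```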
yield
yield returns one batch of x and y at a time, until all the data has been returned.
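A sketch of a batch generator built on yield. The function name `data_iter`, its signature and the toy data are illustrative assumptions, written in plain Python so the example has no dependencies.

```python
import random


def data_iter(batch_size, features, labels, seed=None):
    """Yield (features, labels) batches in random order until the data runs out."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch = indices[start:start + batch_size]
        # Execution pauses here and resumes when the caller asks for the next batch.
        yield [features[i] for i in batch], [labels[i] for i in batch]


features = [0.0, 1.0, 2.0, 3.0, 4.0]
labels = [2 * x + 1 for x in features]

for x_batch, y_batch in data_iter(2, features, labels, seed=0):
    print(x_batch, y_batch)
```

With five samples and a batch size of 2, the generator produces two full batches and one final batch holding the single leftover sample.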
epoch
One epoch is one full pass over the training data. Setting the number of epochs to k means the whole dataset is scanned k times.
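The with keyword
with expresses the same idea as try-finally: the cleanup step runs when the block exits, whether or not an error occurred. In the code it is used for the testing (evaluation) step.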
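A small sketch of the correspondence between with and try-finally. The context manager `managed` and the `log` list are hypothetical names invented for this illustration. The equivalence is approximate: a context manager can do a little more than try-finally, such as suppressing exceptions.

```python
from contextlib import contextmanager

log = []


@contextmanager
def managed(name):
    """Record when the block is entered and, no matter what, when it is left."""
    log.append(f"enter {name}")
    try:
        yield name
    finally:
        log.append(f"exit {name}")


# Using with: setup and cleanup are handled by the context manager.
with managed("a"):
    log.append("work a")

# Roughly the same thing spelled out with try-finally.
log.append("enter b")
try:
    log.append("work b")
finally:
    log.append("exit b")

print(log)
```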
The video also walks through two implementations: one written from scratch and one that uses the framework's built-in tools. The important thing is to understand the whole process: generating the data, defining the network model with its weights and bias, defining the loss function, setting the hyperparameters, training the model, and finally printing the value of the loss. Once that process is clear, every variation follows the same pattern.