Detailed explanation of linear regression in machine learning
2022-07-01 10:24:00 【HUIM_ Wang】
Linear regression algorithm: the focus here is the gradient descent algorithm.
How do we evaluate the model? With a loss function.
The simplest and most common loss function is the mean squared error (MSE).
The formula is as follows :
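With $n$ samples, true values $y_i$ and predicted values $\hat{y}_i$, the standard definition is:

$$\mathrm{MSE}=\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2$$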
Take the house-price data and assume the trivial model y = 1. The resulting MSE of 60132 has no practical meaning on its own, but it is useful for comparison: the smaller the value, the better. Ideally the MSE would be as close to 0 as possible, but with real data samples it cannot actually equal 0.
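A minimal sketch of this comparison in Python; the area/price numbers below are made up for illustration (the article's actual dataset is not shown), so the printed MSE will not be exactly 60132:

```python
import numpy as np

# Hypothetical data: area (feature) and price (target); values invented for illustration.
areas = np.array([50, 80, 100, 120, 150], dtype=float)
prices = np.array([150, 230, 290, 360, 450], dtype=float)

def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return np.mean((y_true - y_pred) ** 2)

# Trivial model y = 1: predict the constant 1 for every sample.
predictions = np.ones_like(prices)
print(mse(prices, predictions))  # a large number, useful only for comparing models
```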
Suppose we are predicting house prices: the feature is the area and the target is the price. We need to fit a line, i.e. compute the weight m and the intercept b.
Step one: assume m = 0, so that y = b and b is the only adjustable parameter. Using the mean squared error formula, compute the MSE for candidate values of b and fit an optimal parameter b in the process.

From the figure, the "optimal" value of b should be near 241, where MSE = 612; that value best matches the house prices. (This process requires checking the MSE for each candidate b one by one, which is tedious. In the figure the search for b starts from 1, and since the right value of b is not known in advance, it would be even more troublesome if b happened to be negative.)
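Continuing with the made-up data and the `mse` helper from the sketch above, this brute-force search could look like the following (the search range of 1 to 500 is an assumption):

```python
# Brute-force search: with m = 0 the model is y = b, so try many values of b
# and keep the one that gives the smallest MSE. Simple but tedious.
best_b, best_mse = None, float("inf")
for b in range(1, 500):  # start the search from 1, as in the article's figure
    current = mse(prices, np.full_like(prices, float(b)))
    if current < best_mse:
        best_b, best_mse = b, current
print(best_b, best_mse)
```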
To find the most suitable b quickly and accurately, a new concept needs to be introduced: the learning rate.
Let's look at a set of figures first:
We need to find the point b corresponding to the smallest MSE.



No matter what value b starts from, the computer needs to find the b with the minimum MSE dynamically, by following the trend of the curve, i.e. by using the derivative.

According to the figure above, the smallest derivative obtained is -8, corresponding to the minimum mean squared error, at b = 241.
Even so, this b is still just the lowest point we can spot with the naked eye, and the changes in b were entered by hand; the computer cannot work that way, so we need some signal to adjust how b changes.
If the curve is steep, as in figure b, the slope is a large negative number, and the next guess should be to the right of the current point, not to the left; in the situation of figure a, the slope is a large positive number and b should be shifted to the left. In other words, if the slope is negative the next guess should move to the right, and if it is positive it should move to the left, until the curve flattens out and the derivative is close to 0.
We mentioned the learning rate above: using the derivative of the MSE with respect to b, we can find the most suitable b as quickly as possible. The more negative the derivative, the larger the change in b should be; the closer the derivative is to 0, the smaller the change should be.
The learning rate is a value, for example 0.0001, and it is used like this: new b = previous b - previous derivative × learning rate. Repeating this loop quickly finds the b whose derivative is closest to 0.
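A minimal sketch of this update loop in Python, continuing with the made-up data above (the starting value, learning rate, and iteration count are arbitrary choices):

```python
# Gradient descent for the one-parameter model y = b.
# For MSE = mean((b - y_i)^2), the derivative is d(MSE)/db = 2 * mean(b - y_i).
b = 1.0                    # arbitrary starting guess
learning_rate = 0.0001
for _ in range(100_000):
    derivative = 2.0 * np.mean(b - prices)
    b = b - learning_rate * derivative
print(b)  # settles near the mean of the prices, which is the best b for y = b
```

Rerunning this with a larger learning rate such as 0.1 reaches almost the same b in far fewer iterations, which is exactly the trade-off described next.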
If the learning rate is smaller, for example 0.000001, the value of b changes more slowly; if it is larger, for example 0.1 or 0.8, b changes faster. The slower the change, the more iterations are needed and the larger the amount of computation, but the more accurate the final b; the faster the change, the less computation, but the resulting b may not be the best one.
Here are the figures for learning rates of 0.00001, 0.001, 0.01, and 0.1:




When the learning rate is 0.2, b slowly approaches 245 and the derivative slowly approaches 0.
The learning rate cannot be too large, otherwise b will only move farther and farther away from the right point.
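A quick way to see this with the one-parameter loop sketched earlier; the value 1.5 is simply a rate large enough to diverge for this toy data:

```python
# Re-run the y = b loop with an overly large learning rate: each step
# overshoots the minimum by more than it corrects, so b keeps getting worse.
b, learning_rate = 1.0, 1.5
for step in range(5):
    derivative = 2.0 * np.mean(b - prices)
    b = b - learning_rate * derivative
    print(step, b)  # b swings to the other side of the minimum and grows each time
```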
To simplify the model, the discussion above covered y = mx + b with m = 0, finding the value of the single adjustable parameter (weight) b. (That is an extreme case; normally m ≠ 0.)
So in the general case we need to take the derivative with respect to m and b separately:
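For the model $\hat{y}_i = m x_i + b$, the standard partial derivative of the MSE with respect to $m$ is:

$$\frac{\partial\,\mathrm{MSE}}{\partial m}=\frac{2}{n}\sum_{i=1}^{n}\left(m x_i + b - y_i\right)x_i$$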
Similarly, for b we obtain:
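$$\frac{\partial\,\mathrm{MSE}}{\partial b}=\frac{2}{n}\sum_{i=1}^{n}\left(m x_i + b - y_i\right)$$

A minimal sketch of gradient descent on both parameters with these derivatives, still using the made-up data above (the learning rate and iteration count are arbitrary, chosen small and large respectively because the area feature is not scaled):

```python
# Gradient descent for y = m*x + b using the two partial derivatives above.
m, b = 0.0, 0.0
learning_rate = 0.00008   # must be small here because the area feature is unscaled
for _ in range(300_000):
    predictions = m * areas + b
    dm = 2.0 * np.mean((predictions - prices) * areas)  # d(MSE)/dm
    db = 2.0 * np.mean(predictions - prices)            # d(MSE)/db
    m -= learning_rate * dm
    b -= learning_rate * db
print(m, b, mse(prices, m * areas + b))
```

With these toy numbers the loop drifts toward the least-squares line; in practice the feature is usually scaled first, so that a larger learning rate and far fewer iterations are enough.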