Detailed explanation of linear regression in machine learning
2022-07-01 10:24:00 【HUIM_ Wang】
Linear regression algorithm:
The focus here is on the gradient descent algorithm.
How to evaluate the model: the loss function
The simplest and most common loss function is the mean squared error (MSE).
The formula is as follows:
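Assuming the usual definition of the mean squared error, with n samples, true values y_i and predicted values \hat{y}_i (these symbol names are chosen here for illustration), the formula can be written as:

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```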
Take house price prediction as an example. With the hypothetical model y = 1, the resulting MSE of 60132 has no practical meaning by itself, but it is useful for comparison: the smaller the value, the better. Ideally the MSE is as close to 0 as possible, but with real data samples it will not actually reach 0.
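As a rough illustration of comparing models by their MSE, here is a minimal sketch. The `prices` list is made-up data, not the data set used in the article, and the function name `mse` is chosen here for illustration.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


# Hypothetical house prices (NOT the article's data).
prices = [200.0, 230.0, 250.0, 280.0]

# Constant model y = 1: every prediction is 1.
baseline = mse(prices, [1.0] * len(prices))

# Constant model y = 240: every prediction is 240.
better = mse(prices, [240.0] * len(prices))

print(baseline, better)
```

Neither number means much on its own; what matters is that the second model's MSE is far smaller than the first.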
Suppose we are predicting house prices: the feature is the area and the target is the price. We need to fit a line, which means computing the weights m and b.
Step one: assume m = 0, so that y = b and b is the only adjustable parameter. Using the MSE formula, compute the MSE for each candidate b; this process fits an optimal value of the parameter b.

From the figure, the "optimal" b should be near 241, where mse = 612; this value fits the house prices best. (This process has to search one by one for the b with the smallest MSE, which is tedious. The search in the figure also starts from b = 1 because the value of b is not known in advance; if b turned out to be negative, the search would be even more troublesome.)
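The one-by-one search described above can be sketched as follows. The data and the candidate range 1 to 500 are assumptions made for this illustration; they are not the article's values.

```python
def mse_constant(prices, b):
    """MSE of the constant model y = b on the given prices."""
    return sum((p - b) ** 2 for p in prices) / len(prices)


# Hypothetical house prices (NOT the article's data).
prices = [200.0, 230.0, 250.0, 280.0]

# Try every integer b from 1 to 500 and keep the one with the smallest MSE.
candidates = range(1, 501)
best_b = min(candidates, key=lambda b: mse_constant(prices, b))

print(best_b, mse_constant(prices, best_b))
```

The weakness is visible in the code: the range has to be guessed up front, and a best value outside it (a negative b, for instance) would simply be missed.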
To find the most suitable b quickly and accurately, a new concept is needed: the learning rate.
Let's look at a set of figures first.
The goal is to find the point b that corresponds to the smallest MSE.



Whatever value b starts from, the computer has to find the minimum MSE dynamically by following the trend of the curve, that is, by using the derivative.
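For the simplified model y = b, and assuming the MSE is the average squared error over n samples with true values y_i, the derivative with respect to b works out as:

```latex
\frac{d}{db}\,\mathrm{MSE}(b)
  = \frac{d}{db}\,\frac{1}{n}\sum_{i=1}^{n}\left(y_i - b\right)^2
  = -\frac{2}{n}\sum_{i=1}^{n}\left(y_i - b\right)
```

The derivative is negative while b is below the average of the y_i, positive while b is above it, and 0 exactly at the average.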

According to the figure above, the derivative closest to 0 is -8, which corresponds to the minimum MSE; here b = 241.
Even so, this b is only the lowest point we can pick out by eye, and the changes to b were made by hand, which a computer cannot reproduce. We need some signal that tells the computer how to adjust b.
If the curve is steep, as in figure (b), the slope is a large negative number, and the next guess should lie to the right on the curve, not to the left. In the situation of figure (a), the slope is a large positive number, and b should shift to the left. In other words, if the slope is negative the next guess moves to the right, and if it is positive the next guess moves to the left, until the curve flattens out and the derivative is close to 0.
The learning rate mentioned above works together with the derivative of the MSE with respect to b to find the most suitable b quickly: the larger the magnitude of the derivative, the more b should change; the closer the derivative is to 0, the less b should change.
The learning rate is a number, for example 0.0001, and it is used like this: new b = previous b - previous derivative * learning rate. Repeating this step quickly finds the b whose derivative is closest to 0.
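The update rule can be sketched in a few lines. The data, the starting value b = 0, the learning rate 0.1 and the step count are assumptions picked so that this toy example converges quickly; the function names are chosen here for illustration.

```python
def d_mse_db(prices, b):
    """Derivative of the MSE with respect to b for the constant model y = b."""
    return -2.0 * sum(p - b for p in prices) / len(prices)


def fit_b(prices, learning_rate, steps, b=0.0):
    """Repeat: new b = previous b - previous derivative * learning rate."""
    for _ in range(steps):
        b = b - d_mse_db(prices, b) * learning_rate
    return b


# Hypothetical house prices (NOT the article's data); their mean is 240.
prices = [200.0, 230.0, 250.0, 280.0]

b = fit_b(prices, learning_rate=0.1, steps=200)
print(b)
```

No search range has to be guessed here: b starts at an arbitrary value and the sign and size of the derivative decide the direction and length of every step.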
The smaller the learning rate, for example 0.000001, the more slowly b changes; the larger the learning rate, for example 0.1 or 0.8, the faster b changes. Slow changes need more iterations and more computation, but the final b is more accurate. Fast changes need less computation, but the resulting b may not be the best one.
The figures below show learning rates of 0.00001, 0.001, 0.01 and 0.1:




With a learning rate of 0.2, b slowly approaches 245 and the derivative slowly approaches 0.
The learning rate must not be too large; otherwise b only moves farther and farther away from the correct point.
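The effect of the learning rate can be seen on a toy example. The data, the 50-step budget and the three rates below are assumptions for this sketch; the rate at which the updates start to diverge depends on the data and on how the loss is scaled, so the numbers do not carry over to the article's figures.

```python
def gradient(prices, b):
    """Derivative of the MSE with respect to b for the constant model y = b."""
    return -2.0 * sum(p - b for p in prices) / len(prices)


def final_b(prices, learning_rate, steps=50, b=0.0):
    """Run a fixed number of gradient descent updates and return the last b."""
    for _ in range(steps):
        b -= gradient(prices, b) * learning_rate
    return b


# Hypothetical house prices (NOT the article's data); the best b is their mean, 240.
prices = [200.0, 230.0, 250.0, 280.0]

# A very small rate, a moderate rate, and a rate that is too large for this data.
results = {lr: final_b(prices, lr) for lr in (0.001, 0.2, 1.5)}
for lr, b in results.items():
    print(lr, b)
```

On this data the smallest rate is still far from 240 after 50 steps, the moderate rate has essentially arrived, and the largest rate overshoots further on every step and ends up extremely far from 240.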
To keep the model simple, everything above covers y = mx + b with m = 0, where the only adjustable parameter is the weight b. (This is an extreme case; normally m ≠ 0.)
So in general we need to take the derivative of the MSE with respect to m and with respect to b separately:
Similarly, we obtain:
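Assuming the MSE is the average squared error over n samples (x_i, y_i), with x_i the area and y_i the price, the loss for the full model and its derivative with respect to m follow from the chain rule:

```latex
\mathrm{MSE}(m, b) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - (m x_i + b)\bigr)^2

\frac{\partial\,\mathrm{MSE}}{\partial m}
  = -\frac{2}{n}\sum_{i=1}^{n} x_i \bigl(y_i - (m x_i + b)\bigr)
```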
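By the same chain-rule step, and again assuming the MSE is the average squared error, the derivative with respect to b is -(2/n) * sum(y_i - (m * x_i + b)). A minimal sketch of gradient descent on both parameters follows. The data is made up and lies exactly on price = 2 * area + 1, with small values chosen so that a plain learning rate of 0.05 converges; real areas and prices would usually need rescaling or a much smaller rate.

```python
def gradients(xs, ys, m, b):
    """Partial derivatives of the MSE with respect to m and b for y = m*x + b."""
    n = len(xs)
    errors = [m * x + b - y for x, y in zip(xs, ys)]  # prediction minus target
    dm = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
    db = 2.0 * sum(errors) / n
    return dm, db


def fit_line(xs, ys, learning_rate, steps, m=0.0, b=0.0):
    """Update m and b together: parameter = parameter - derivative * learning rate."""
    for _ in range(steps):
        dm, db = gradients(xs, ys, m, b)
        m -= dm * learning_rate
        b -= db * learning_rate
    return m, b


# Hypothetical data (NOT the article's): price = 2 * area + 1 exactly.
areas = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 5.0, 7.0, 9.0]

m, b = fit_line(areas, prices, learning_rate=0.05, steps=5000)
print(m, b)
```

Both derivatives are computed from the same m and b before either parameter is changed, so the two updates in each step are consistent with each other.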