1.4 Regression in machine learning methods
2022-06-29 09:27:00 【Light wind and light clouds_ Cauchy】
1.3 The regression problem of machine learning methods
Regression analysis models the relationship between input variables and output variables, in particular how the value of the output variable changes when the values of the input variables change.


1. Linear regression
The linear regression algorithm assumes that the features and the result satisfy a linear relationship. This means that you can multiply each input by a constant and add up the results to get the output.
Model
Select the form of the fitting function: $h_{\theta}(x)=\sum_{i=0}^{n}\theta_i x_i=\theta^{\top}x$, where $x_0 = 1$ so that $\theta_0$ acts as the intercept.
Let $x_1, x_2, \dots$ denote the components of the feature vector, for example $x_1$ the area of a room and $x_2$ its orientation. The estimation function is then:
$h(x) = h_{\theta}(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$
Strategy
Determine the form of the loss function:
$J(\theta) = \frac{1}{2}\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^2$
$\min_{\theta} J(\theta)$
Algorithm
Gradient descent. First, initialize $\theta$. The initial value can be random, or $\theta$ can be an all-zero vector.
Then update $\theta$ repeatedly so that $J(\theta)$ decreases along the direction of the negative gradient. The algorithm ends when $\theta$ reaches a point where $J(\theta)$ can no longer decrease.
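The model, strategy, and algorithm steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's code: the learning rate, the stopping tolerance, and the toy data are all assumptions chosen so that the example converges.

```python
import numpy as np

def predict(theta, X):
    """h_theta(x) = theta^T x for every row of X (X already holds the x_0 = 1 column)."""
    return X @ theta

def loss(theta, X, y):
    """J(theta) = 1/2 * sum_i (h_theta(x^(i)) - y^(i))^2."""
    residual = predict(theta, X) - y
    return 0.5 * np.sum(residual ** 2)

def gradient_descent(X, y, learning_rate=0.01, tol=1e-12, max_iters=10000):
    """Minimize J(theta) by stepping against its gradient X^T (X theta - y)."""
    theta = np.zeros(X.shape[1])              # all-zero initialization
    previous = loss(theta, X, y)
    for _ in range(max_iters):
        gradient = X.T @ (predict(theta, X) - y)
        theta = theta - learning_rate * gradient
        current = loss(theta, X, y)
        if previous - current < tol:          # J can no longer decrease noticeably
            break
        previous = current
    return theta

# Toy data (an assumption for illustration): y = 1 + 2*x1 + 3*x2 with no noise.
rng = np.random.default_rng(0)
features = rng.uniform(-1.0, 1.0, size=(50, 2))
X = np.column_stack([np.ones(50), features])  # prepend the x_0 = 1 column
y = X @ np.array([1.0, 2.0, 3.0])

theta_hat = gradient_descent(X, y)
```

If the learning rate is too large for the scale of the data, the updates overshoot and the loss grows instead of shrinking, so in practice the features are usually rescaled or the step size is tuned.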
1.1 Least squares method
See 《Linear regression least squares method for machine learning》
1.2 Ridge regression
Ridge regression (also known as Tikhonov regularization) is a biased-estimation regression method for analyzing collinear data. In essence it is an improved least squares estimator: by giving up the unbiasedness of least squares, and at the cost of losing some information and reducing accuracy, it obtains regression coefficients that are more practical and more reliable. It fits ill-conditioned data better than ordinary least squares.
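A small sketch of why the penalty helps on collinear data. It assumes the usual closed form, solving $(X^{\top}X + \lambda I)\theta = X^{\top}y$, with no intercept term. The name `ridge_fit`, the penalty strength `lam`, and the nearly duplicated toy features are illustrative assumptions.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) theta = X^T y; lam = 0 gives plain least squares."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Toy collinear data (an assumption for illustration): x2 is almost a copy of x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + 1e-4 * rng.normal(size=100)
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=100)

theta_ols = ridge_fit(X, y, lam=0.0)    # ill-conditioned: coefficients are unstable
theta_ridge = ridge_fit(X, y, lam=1.0)  # penalty keeps both coefficients near 1
```

Because the two columns are nearly identical, least squares cannot tell how to split the weight between them, and small noise in `y` can push the two coefficients far apart. The ridge penalty trades a small bias for a much more stable split.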
1.3 Lasso regression
Lasso regression is a shrinkage estimator. It obtains a more parsimonious model by constructing a penalty function that shrinks some coefficients and sets others exactly to zero. It therefore retains the advantages of subset selection and shrinkage, and it is a biased estimator for data with complex collinearity.
Applicable scenarios: small sample size but many indicators. It suits high-dimensional statistics, where traditional methods cannot handle such data. Lasso can also perform feature selection.
Basic theory. The Lasso parameter estimate is defined as follows:
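The formula itself did not survive in this text. A commonly used form, written here with the same $\frac{1}{2}$ factor as $J(\theta)$ and offered only as an assumption about what the original displayed, is $\hat{\theta} = \arg\min_{\theta} \frac{1}{2}\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}|\theta_j|$ with $\lambda \ge 0$. The sketch below minimizes that objective by coordinate descent with soft thresholding, which is one standard solver and not necessarily the one the author had in mind. It omits the intercept, and the function names, `lam`, and the toy data are illustrative assumptions.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exactly 0 when |value| <= threshold."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimize 1/2 * ||y - X theta||^2 + lam * sum_j |theta_j| (no intercept)."""
    n_features = X.shape[1]
    theta = np.zeros(n_features)
    column_sq_norm = np.sum(X ** 2, axis=0)
    for _ in range(n_sweeps):
        for j in range(n_features):
            # Residual with feature j's current contribution added back in.
            partial_residual = y - X @ theta + X[:, j] * theta[j]
            rho = X[:, j] @ partial_residual
            theta[j] = soft_threshold(rho, lam) / column_sq_norm[j]
    return theta

# Toy data (an assumption): 30 samples, 10 features, only the first two matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 10))
true_theta = np.zeros(10)
true_theta[:2] = [3.0, -2.0]
y = X @ true_theta + 0.1 * rng.normal(size=30)

theta_hat = lasso_coordinate_descent(X, y, lam=5.0)
selected = np.flatnonzero(theta_hat)    # indices of features with nonzero weight
```

The soft-thresholding step is what sets coefficients exactly to zero: any feature whose correlation with the partial residual stays below `lam` in absolute value is dropped, which is how Lasso performs feature selection.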

1.3.1 Regression case: fiscal revenue forecast for a city
Building on existing research, this case uses Lasso feature selection to study the factors that affect local fiscal revenue. On top of the Lasso feature selection, a support vector regression (SVR) model is fitted to the selected features to obtain a prediction model for fiscal revenue. The case code is implemented with python+pandas+numpy+scikit-learn.
Basic information on the fiscal revenue data. The features are: number of employees in society (x1), total wages of on-the-job employees (x2), total retail sales of consumer goods (x3), per capita disposable income of urban residents (x4), per capita consumption expenditure of urban residents (x5), total population (x6), investment in fixed assets of the whole society (x7), gross regional product (x8), output value of primary industry (x9), tax revenue (x10), consumer price index (x11), ratio of the output value of the tertiary industry to that of the secondary industry (x12), and consumption level of residents (x13).
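The two-step workflow described above, Lasso for feature selection followed by SVR on the selected features, might look roughly like the sketch below in scikit-learn. The case data and original code are not reproduced here, so the sketch runs on synthetic placeholder values with 13 columns named x1 to x13. The sample size, `alpha`, `C`, `epsilon`, and the linear kernel are all assumptions, not the settings used in the case.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

feature_names = [f"x{i}" for i in range(1, 14)]

# Synthetic placeholder data, NOT the real fiscal revenue figures.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 13))
y = 2.0 * X[:, 0] + 1.5 * X[:, 2] - 1.0 * X[:, 6] + 0.1 * rng.normal(size=20)

# Standardize features and target so the Lasso penalty treats all columns alike.
X_scaled = StandardScaler().fit_transform(X)
y_mean, y_std = y.mean(), y.std()
y_scaled = (y - y_mean) / y_std

# Step 1: Lasso feature selection; features with a zero coefficient are dropped.
lasso = Lasso(alpha=0.1, max_iter=10000).fit(X_scaled, y_scaled)
selected_mask = lasso.coef_ != 0
selected_names = [name for name, keep in zip(feature_names, selected_mask) if keep]

# Step 2: support vector regression on the selected features only.
svr = SVR(kernel="linear", C=10.0, epsilon=0.01)
svr.fit(X_scaled[:, selected_mask], y_scaled)

# Map predictions back to the original scale of the target.
y_pred = svr.predict(X_scaled[:, selected_mask]) * y_std + y_mean
```

With real data the rows would normally be split by year into a fitting period and a forecast period, and `alpha` would be chosen by cross-validation (for example with `LassoCV`) instead of being fixed by hand.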




















