Machine learning summary (I): linear regression, ridge regression, Lasso regression
Linear regression is a regression analysis technique whose dependent variable is continuous; if the dependent variable is discrete instead, the task becomes a classification problem. Regression analysis is a supervised learning problem. This post reviews the key points of standard linear regression, briefly discusses the problems that can arise, introduces two variants of linear regression, ridge regression and Lasso regression, and finally walks through the whole regression workflow with the sklearn library.
Contents
- General form of linear regression
- Possible problems in linear regression
- Overfitting and how to address it
- Linear regression code example
- Ridge regression and Lasso regression
- Ridge regression and Lasso regression code examples
General form of linear regression
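The original post presents the general form as an image, which is not reproduced here. In standard notation (the Andrew Ng course convention the post draws on), the hypothesis for n features and the squared-error loss over m training examples are:

$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \cdots + \theta_n x_n = \theta^{T} x$$

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^{2}$$

Training amounts to choosing the parameter vector θ that minimizes J(θ).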
Possible problems in linear regression
- There are two ways to minimize the loss function: gradient descent and the normal equation; a comparison of the two is given in the attached study notes.
- Feature scaling, i.e. normalizing the feature data, brings two benefits. First, it speeds up convergence: if two features differ greatly in scale, plotting the loss contours with those two features as the horizontal and vertical axes gives flat, elongated ellipses, and gradient descent ends up zigzagging perpendicular to the contours, so iteration is slow. After normalization the contours become roughly circular, the gradient points toward the center, and iteration is much faster. Second, it can improve the accuracy of the model.
- Choice of the learning rate α: if α is too small, many iterations are needed and convergence is slow; if α is too large, gradient descent may overshoot the optimum and never converge at all. A minimal sketch covering feature scaling and both solvers follows this list.
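A minimal NumPy sketch (not from the original post; the synthetic data and all names here are made up for illustration) comparing the normal equation with gradient descent on standardized features:

import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(100, 2) * [1.0, 1000.0]           # two features on very different scales
y = X @ np.array([3.0, 0.005]) + rng.randn(100) * 0.1

# Feature scaling: zero mean, unit variance per column
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
Xb = np.hstack([np.ones((100, 1)), X_scaled])  # prepend a bias column

# Normal equation: theta = (X^T X)^{-1} X^T y
theta_ne = np.linalg.pinv(Xb.T @ Xb) @ Xb.T @ y

# Batch gradient descent with learning rate alpha on the scaled features
theta_gd = np.zeros(3)
alpha = 0.1
for _ in range(1000):
    grad = Xb.T @ (Xb @ theta_gd - y) / len(y)
    theta_gd -= alpha * grad

print(theta_ne)  # the two solutions should agree closely
print(theta_gd)

On the scaled features this converges quickly with alpha = 0.1; on the raw, unscaled features the same loop would need a far smaller alpha and many more iterations, which is exactly the zigzag problem described above.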
Overfitting and how to address it
- Problem: overfitting, where the model fits the training data too closely and generalizes poorly to new data. (The illustrating figure from the original post is not reproduced here.)
- Solutions: (1) discard features that contribute little to the final prediction; the features to drop can be chosen with the PCA algorithm (see the sketch after this list); (2) use regularization: keep all the features but shrink the parameters θ in front of them, by modifying the form of the loss function in linear regression. This is exactly what ridge regression and Lasso regression do.
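A hedged sketch (not in the original post) of the first strategy, shrinking the feature space with sklearn's PCA before fitting; the choice of 5 components is arbitrary here:

from sklearn.datasets import load_diabetes
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

X, y = load_diabetes(return_X_y=True)
# Project the 10 original features onto 5 principal components, then fit
model = make_pipeline(PCA(n_components=5), LinearRegression())
model.fit(X, y)
print(model.score(X, y))  # R^2 on the training data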
Linear regression code example
import numpy as np
from sklearn import datasets, linear_model
# sklearn.cross_validation was removed in sklearn 0.20; use model_selection instead
from sklearn.model_selection import train_test_split

def load_data():
    # Split the diabetes dataset: 75% train, 25% test, with a fixed random seed
    diabetes = datasets.load_diabetes()
    return train_test_split(diabetes.data, diabetes.target, test_size=0.25, random_state=0)

def test_LinearRegression(*data):
    X_train, X_test, y_train, y_test = data
    # Create a linear regression object via sklearn's linear_model
    linearRegression = linear_model.LinearRegression()
    # Train
    linearRegression.fit(X_train, y_train)
    # coef_ holds the weight vector, intercept_ holds the value of b
    print("Weight vector: %s, b: %.2f" % (linearRegression.coef_, linearRegression.intercept_))
    # Value of the loss function: mean squared error on the test set
    print("Loss: %.2f" % np.mean((linearRegression.predict(X_test) - y_test) ** 2))
    # score returns the coefficient of determination R^2 on the test set
    print("Score: %.2f" % linearRegression.score(X_test, y_test))

if __name__ == '__main__':
    # Load the dataset
    X_train, X_test, y_train, y_test = load_data()
    # Train and print the results
    test_LinearRegression(X_train, X_test, y_train, y_test)

Linear regression example output
Weight vector: [ -43.26774487 -208.67053951  593.39797213  302.89814903 -560.27689824
  261.47657106   -8.83343952  135.93715156  703.22658427   28.34844354], b: 153.07
Loss: 3180.20
Score: 0.36

Ridge regression and Lasso regression
Ridge regression and Lasso regression arose to deal with overfitting in linear regression and with the case where X^T X is not invertible when solving for θ via the normal equation. Both reach their goal by introducing a regularization term into the loss function.
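The comparison figure from the original post is not reproduced here; written out in one common convention (the Andrew Ng course notation used above), the three loss functions are:

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)^{2} \quad \text{(linear regression)}$$

$$J(\theta) = \frac{1}{2m} \Bigl[ \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)^{2} + \lambda \sum_{j=1}^{n} \theta_j^{2} \Bigr] \quad \text{(ridge regression, L2 penalty)}$$

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)^{2} + \lambda \sum_{j=1}^{n} \lvert \theta_j \rvert \quad \text{(Lasso regression, L1 penalty)}$$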
Here λ is called the regularization parameter. If λ is chosen too large, all the parameters θ are shrunk toward zero and the model underfits; if λ is chosen too small, the overfitting problem is not properly addressed, so choosing λ is a delicate business. (In sklearn this parameter is exposed as alpha.) The biggest difference between ridge regression and Lasso regression is that ridge regression introduces an L2-norm penalty term while Lasso regression introduces an L1-norm penalty term. Lasso regression can drive many of the θ in the loss function to exactly 0, which is an advantage over ridge regression: ridge regression keeps all the θ nonzero, so the resulting Lasso model needs much less computation than the ridge one.
In the α-score plots produced by the code below, you can see that the Lasso curve eventually flattens into a straight line, because many of the θ values have become 0, while the ridge curve stays smooth, because all of the θ values remain nonzero. A quick way to check this sparsity numerically is sketched next.
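A hedged sketch (not in the original post) that counts nonzero weights for a few alpha values to contrast Lasso's sparsity with ridge's shrinkage:

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso, Ridge

X, y = load_diabetes(return_X_y=True)
for alpha in [0.1, 1, 10]:
    lasso = Lasso(alpha=alpha).fit(X, y)
    ridge = Ridge(alpha=alpha).fit(X, y)
    # Lasso zeroes out more and more coefficients as alpha grows;
    # ridge only shrinks them, so all ten stay nonzero
    print(alpha, np.sum(lasso.coef_ != 0), np.sum(ridge.coef_ != 0))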
Ridge regression and Lasso regression code examples
Ridge regression code example
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split

def load_data():
    diabetes = datasets.load_diabetes()
    return train_test_split(diabetes.data, diabetes.target, test_size=0.25, random_state=0)

def test_ridge(*data):
    X_train, X_test, y_train, y_test = data
    ridgeRegression = linear_model.Ridge()  # default alpha=1.0
    ridgeRegression.fit(X_train, y_train)
    print("Weight vector: %s, b: %.2f" % (ridgeRegression.coef_, ridgeRegression.intercept_))
    print("Loss: %.2f" % np.mean((ridgeRegression.predict(X_test) - y_test) ** 2))
    print("Score: %.2f" % ridgeRegression.score(X_test, y_test))

# Test the effect of different alpha values on prediction performance
def test_ridge_alpha(*data):
    X_train, X_test, y_train, y_test = data
    alphas = [0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]
    scores = []
    for alpha in alphas:
        ridgeRegression = linear_model.Ridge(alpha=alpha)
        ridgeRegression.fit(X_train, y_train)
        scores.append(ridgeRegression.score(X_test, y_test))
    return alphas, scores

def show_plot(alphas, scores):
    # Plot the test score against alpha on a logarithmic x axis
    figure = plt.figure()
    ax = figure.add_subplot(1, 1, 1)
    ax.plot(alphas, scores)
    ax.set_xlabel(r"$\alpha$")
    ax.set_ylabel("score")
    ax.set_xscale("log")
    ax.set_title("Ridge")
    plt.show()

if __name__ == '__main__':
    X_train, X_test, y_train, y_test = load_data()
    # With the default alpha:
    # test_ridge(X_train, X_test, y_train, y_test)
    # With a range of alpha values:
    alphas, scores = test_ridge_alpha(X_train, X_test, y_train, y_test)
    show_plot(alphas, scores)

Lasso regression code example
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split

def load_data():
    diabetes = datasets.load_diabetes()
    return train_test_split(diabetes.data, diabetes.target, test_size=0.25, random_state=0)

def test_lasso(*data):
    X_train, X_test, y_train, y_test = data
    lassoRegression = linear_model.Lasso()  # default alpha=1.0
    lassoRegression.fit(X_train, y_train)
    print("Weight vector: %s, b: %.2f" % (lassoRegression.coef_, lassoRegression.intercept_))
    print("Loss: %.2f" % np.mean((lassoRegression.predict(X_test) - y_test) ** 2))
    print("Score: %.2f" % lassoRegression.score(X_test, y_test))

# Test the effect of different alpha values on prediction performance
def test_lasso_alpha(*data):
    X_train, X_test, y_train, y_test = data
    alphas = [0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]
    scores = []
    for alpha in alphas:
        lassoRegression = linear_model.Lasso(alpha=alpha)
        lassoRegression.fit(X_train, y_train)
        scores.append(lassoRegression.score(X_test, y_test))
    return alphas, scores

def show_plot(alphas, scores):
    # Plot the test score against alpha on a logarithmic x axis
    figure = plt.figure()
    ax = figure.add_subplot(1, 1, 1)
    ax.plot(alphas, scores)
    ax.set_xlabel(r"$\alpha$")
    ax.set_ylabel("score")
    ax.set_xscale("log")
    ax.set_title("Lasso")
    plt.show()

if __name__ == '__main__':
    X_train, X_test, y_train, y_test = load_data()
    # With the default alpha:
    # test_lasso(X_train, X_test, y_train, y_test)
    # With a range of alpha values:
    alphas, scores = test_lasso_alpha(X_train, X_test, y_train, y_test)
    show_plot(alphas, scores)

Attached study notes (included as images in the original post; not reproduced here)
References
- Python大战机器学习 (Python vs. Machine Learning)
- Andrew Ng's machine learning open course
- http://www.jianshu.com/p/35e67c9e4cbf
- http://freemind.pluskid.org/machine-learning/sparsity-and-some-basics-of-l1-regularization/#ed61992b37932e208ae114be75e42a3e6dc34cb3