Machine learning summary (I): linear regression, ridge regression, Lasso regression
Linear regression is a regression-analysis technique whose dependent variable is continuous; if the dependent variable is discrete instead, the problem becomes one of classification. Regression analysis is a supervised learning problem. This post reviews the key points of standard linear regression, briefly discusses the problems that can arise, introduces two variants of linear regression, ridge regression and Lasso regression, and finally walks through the whole regression process with the sklearn library.
Directory structure
- General form of linear regression
- Possible problems in linear regression
- Overfitting problem and its solutions
- Linear regression code example
- Ridge regression and Lasso regression
- Ridge regression and Lasso regression code implementation
General form of linear regression
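The original post gave the general form as an image; as a standard reconstruction (notation follows Andrew Ng's course cited in the references), the hypothesis and the squared-error loss that linear regression minimizes are

$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n = \theta^{T} x$$

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

where m is the number of training samples and n is the number of features.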
Possible problems in linear regression
- There are two ways to minimize the loss function: the gradient descent method and the normal equation; a comparison of the two is given in the attached study notes, and a minimal gradient descent sketch follows this list.
- Feature scaling, i.e. normalizing the feature data, has two benefits. First, it speeds up convergence: if two features differ greatly in scale, the contour plot drawn with those features as the horizontal and vertical axes is a flat ellipse, and gradient descent then takes a zigzag path perpendicular to the contours, so the iterations slow down; after normalization the contours become roughly circular, the gradient points toward the center, and the iterations converge much faster. Second, it can improve the accuracy of the model.
- Choice of the learning rate α: if α is too small, more iterations are needed and convergence slows down; if α is too large, the algorithm may step over the optimal solution and fail to converge at all.
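As a minimal illustration (not from the original post) of how feature scaling helps gradient descent, the sketch below standardizes the features before running batch gradient descent on the squared loss; the learning rate and iteration count are arbitrary illustrative choices.

import numpy as np

def gradient_descent(X, y, alpha=0.1, n_iters=1000):
    # Feature scaling: standardize columns so the contours are roughly circular
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    X = np.c_[np.ones(len(X)), X]              # prepend a column of ones for the intercept
    theta = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(n_iters):
        grad = X.T @ (X @ theta - y) / m       # gradient of the mean squared error
        theta -= alpha * grad                  # step against the gradient
    return theta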
Overfitting problem and its solutions
- Problem: overfitting, which the original post illustrated with a figure comparing fits of different complexity.
- Solutions: (1) discard some features that have little influence on the final prediction; this can be done, for example, with the PCA algorithm (strictly speaking, PCA builds new low-dimensional features rather than selecting original ones), as sketched right after this list. (2) Use regularization: keep all the features but shrink the parameters θ in front of them, concretely by modifying the form of the loss function of linear regression, which is exactly what ridge regression and Lasso regression do.
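A minimal sketch of the first approach, assuming sklearn's PCA and pipeline utilities; the number of components kept is an arbitrary illustrative choice.

from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Project the features onto their first 5 principal components (illustrative),
# then fit an ordinary linear regression on the reduced representation.
model = make_pipeline(PCA(n_components=5), LinearRegression())
# model.fit(X_train, y_train); model.score(X_test, y_test)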
Linear regression code example
import numpy as np
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split  # cross_validation was removed from newer sklearn

def load_data():
    diabetes = datasets.load_diabetes()
    return train_test_split(diabetes.data, diabetes.target, test_size=0.25, random_state=0)

def test_LinearRegression(*data):
    X_train, X_test, y_train, y_test = data
    # Create a linear regression object via sklearn's linear_model
    linearRegression = linear_model.LinearRegression()
    # Train the model
    linearRegression.fit(X_train, y_train)
    # coef_ holds the weight vector, intercept_ holds b
    print("Weight vector: %s, b: %.2f" % (linearRegression.coef_, linearRegression.intercept_))
    # Value of the loss function (mean squared error on the test set)
    print("Loss: %.2f" % np.mean((linearRegression.predict(X_test) - y_test) ** 2))
    # Prediction performance score (R^2)
    print("Score: %.2f" % linearRegression.score(X_test, y_test))

if __name__ == '__main__':
    # Load the data set
    X_train, X_test, y_train, y_test = load_data()
    # Train and print the results
    test_LinearRegression(X_train, X_test, y_train, y_test)

Linear regression example output
Weight vector: [ -43.26774487 -208.67053951  593.39797213  302.89814903 -560.27689824
  261.47657106   -8.83343952  135.93715156  703.22658427   28.34844354], b: 153.07
Loss: 3180.20
Score: 0.36

Ridge regression and Lasso regression
Ridge regression and Lasso regression were introduced to address overfitting in linear regression and the fact that X transpose times X may not be invertible when solving for θ with the normal equation. Both achieve this by adding a regularization term to the loss function; the loss functions of the three methods are compared below.
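The original comparison figure is not reproduced here; a standard reconstruction of the three loss functions (the constant factor in front varies between sources) is

$$J_{\text{linear}}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

$$J_{\text{ridge}}(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2 \right]$$

$$J_{\text{lasso}}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{j=1}^{n} |\theta_j|$$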
Here λ is called the regularization parameter. If λ is too large, all the parameters θ are driven toward zero and the model underfits; if λ is too small, the overfitting problem is not properly solved, so choosing λ is something of an art. The biggest difference between ridge regression and Lasso regression is that ridge regression introduces an L2-norm penalty term while Lasso regression introduces an L1-norm penalty term. Lasso regression can drive many of the θ in the loss function exactly to 0, which is an advantage over ridge regression, where all the θ remain non-zero; because many coefficients are exactly zero, the resulting Lasso model involves much less computation than the ridge model.
As the figure in the original post shows, the Lasso fit eventually tends to a straight line, because many of the θ values have become 0; the ridge fit keeps a certain smoothness, because all of the θ values are still present.
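To see the sparsity concretely, a quick illustrative check (not from the original post) counts how many coefficients the default-alpha Lasso and Ridge models drive exactly to zero on the diabetes data.

import numpy as np
from sklearn import datasets, linear_model

X, y = datasets.load_diabetes(return_X_y=True)
ridge = linear_model.Ridge(alpha=1.0).fit(X, y)
lasso = linear_model.Lasso(alpha=1.0).fit(X, y)
# Lasso typically zeroes out several coefficients; Ridge keeps all of them non-zero.
print("zero coefficients, Ridge:", int(np.sum(ridge.coef_ == 0)))
print("zero coefficients, Lasso:", int(np.sum(lasso.coef_ == 0)))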
Ridge regression and Lasso regression code implementation
Ridge regression code example
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split  # cross_validation was removed from newer sklearn

def load_data():
    diabetes = datasets.load_diabetes()
    return train_test_split(diabetes.data, diabetes.target, test_size=0.25, random_state=0)

def test_ridge(*data):
    X_train, X_test, y_train, y_test = data
    ridgeRegression = linear_model.Ridge()
    ridgeRegression.fit(X_train, y_train)
    print("Weight vector: %s, b: %.2f" % (ridgeRegression.coef_, ridgeRegression.intercept_))
    print("Loss: %.2f" % np.mean((ridgeRegression.predict(X_test) - y_test) ** 2))
    print("Score: %.2f" % ridgeRegression.score(X_test, y_test))

# Test the effect of different alpha values on prediction performance
def test_ridge_alpha(*data):
    X_train, X_test, y_train, y_test = data
    alphas = [0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]
    scores = []
    for alpha in alphas:
        ridgeRegression = linear_model.Ridge(alpha=alpha)
        ridgeRegression.fit(X_train, y_train)
        scores.append(ridgeRegression.score(X_test, y_test))
    return alphas, scores

def show_plot(alphas, scores):
    figure = plt.figure()
    ax = figure.add_subplot(1, 1, 1)
    ax.plot(alphas, scores)
    ax.set_xlabel(r"$\alpha$")
    ax.set_ylabel("score")
    ax.set_xscale("log")
    ax.set_title("Ridge")
    plt.show()

if __name__ == '__main__':
    X_train, X_test, y_train, y_test = load_data()
    # Use the default alpha:
    #test_ridge(X_train, X_test, y_train, y_test)
    # Try a range of alpha values and plot the scores:
    alphas, scores = test_ridge_alpha(X_train, X_test, y_train, y_test)
    show_plot(alphas, scores)

Lasso regression code example
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split  # cross_validation was removed from newer sklearn

def load_data():
    diabetes = datasets.load_diabetes()
    return train_test_split(diabetes.data, diabetes.target, test_size=0.25, random_state=0)

def test_lasso(*data):
    X_train, X_test, y_train, y_test = data
    lassoRegression = linear_model.Lasso()
    lassoRegression.fit(X_train, y_train)
    print("Weight vector: %s, b: %.2f" % (lassoRegression.coef_, lassoRegression.intercept_))
    print("Loss: %.2f" % np.mean((lassoRegression.predict(X_test) - y_test) ** 2))
    print("Score: %.2f" % lassoRegression.score(X_test, y_test))

# Test the effect of different alpha values on prediction performance
def test_lasso_alpha(*data):
    X_train, X_test, y_train, y_test = data
    alphas = [0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]
    scores = []
    for alpha in alphas:
        lassoRegression = linear_model.Lasso(alpha=alpha)
        lassoRegression.fit(X_train, y_train)
        scores.append(lassoRegression.score(X_test, y_test))
    return alphas, scores

def show_plot(alphas, scores):
    figure = plt.figure()
    ax = figure.add_subplot(1, 1, 1)
    ax.plot(alphas, scores)
    ax.set_xlabel(r"$\alpha$")
    ax.set_ylabel("score")
    ax.set_xscale("log")
    ax.set_title("Lasso")
    plt.show()

if __name__ == '__main__':
    X_train, X_test, y_train, y_test = load_data()
    # Use the default alpha:
    #test_lasso(X_train, X_test, y_train, y_test)
    # Try a range of alpha values and plot the scores:
    alphas, scores = test_lasso_alpha(X_train, X_test, y_train, y_test)
    show_plot(alphas, scores)

Attached study notes
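The original notes were attached as images and are not reproduced here. For the gradient descent vs. normal equation comparison referenced earlier, the standard summary is: gradient descent requires choosing a learning rate α and running many iterations, but scales well to a large number of features; the normal equation

$$\theta = (X^{T} X)^{-1} X^{T} y$$

needs neither α nor iterations, but computing the inverse costs roughly O(n^3) and fails outright when X^T X is singular, which is exactly the case the regularization term in ridge regression repairs.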
References
- Python大战机器学习 (Python vs. Machine Learning)
- Andrew Ng's machine learning open course
- http://www.jianshu.com/p/35e67c9e4cbf
- http://freemind.pluskid.org/machine-learning/sparsity-and-some-basics-of-l1-regularization/#ed61992b37932e208ae114be75e42a3e6dc34cb3