Linear Regression 02---Boston Housing Price Prediction
2022-08-04 06:05:00 【I'm fine please go away thank you】
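Preface:
I learned this from another blogger's post (linked in the original article), which explains the material well and in great detail.
This case study uses linear regression as the prediction model. The goal is to find a linear function, that is, the weight assigned to each feature, and then to evaluate the fitted linear model.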
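To make "finding the weight of each feature" concrete, here is a minimal sketch of a least-squares fit with NumPy. The data, the weights in `true_w`, and the intercept `true_b` are made up for illustration and have nothing to do with the housing data; the point is only that the fit recovers one weight per feature plus an intercept.

```python
import numpy as np

# Synthetic data (made up for illustration): 200 samples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])   # hypothetical "true" feature weights
true_b = 4.0                          # hypothetical intercept
y = X @ true_w + true_b + rng.normal(scale=0.1, size=200)

# Append a column of ones so the intercept is learned as one more weight.
X1 = np.hstack([X, np.ones((X.shape[0], 1))])

# Least-squares solution of X1 @ theta ~ y (numerically safer than
# inverting X1.T @ X1 explicitly).
theta, *_ = np.linalg.lstsq(X1, y, rcond=None)
w_hat, b_hat = theta[:3], theta[3]

print("estimated weights:", np.round(w_hat, 2))
print("estimated intercept:", round(float(b_hat), 2))
```

The estimated weights should land close to the hypothetical ones because the noise is small relative to the signal.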
1. Getting the data
2. Data analysis
2.1 Descriptive statistics
# 2.1 Descriptive statistics
# describe() returns count, mean, std, min, max, and the percentiles of each column; .T transposes the result
data.describe().T
Conclusion: the data has 506 rows and 14 variables, and each of the 14 variables holds 506 non-null float64 values, i.e. no variable has missing values.
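`describe()` reports the count per column but not the dtype, so the conclusion above can be double-checked with `isnull()` and `dtypes`. The sketch below uses a tiny made-up frame as a stand-in (the values are not real housing data); on the actual `data` frame the same two checks would apply.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame standing in for the housing data (values are not real).
df = pd.DataFrame({
    "CRIM": [0.1, 0.2, 0.3],
    "RM": [6.0, 6.5, 7.0],
    "target": [20.0, 25.0, 30.0],
})

null_counts = df.isnull().sum()                    # missing values per column
all_float = bool((df.dtypes == np.float64).all())  # True if every column is float64

print(df.shape)          # (rows, columns)
print(null_counts)
print("all float64:", all_float)
```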
2.2 Scatter plot analysis
1. Draw a single plot first
2. Then draw the rest
plt.figure(figsize=(15, 10.5))  # figure size
plot_count = 1
for feature in list(data.columns)[1:13]:  # loop over the remaining 12 features and draw each one
    plt.subplot(3, 4, plot_count)  # 3 rows by 4 columns; plot_count is the position of this scatter plot
    plt.scatter(data[feature], data['target'])
    plt.xlabel(feature.replace('_', ' ').title())
    plt.ylabel('target')
    plot_count += 1
plt.show()
Resulting plot:
3. Data processing
x = data.iloc[:, 0:13]  # slice the DataFrame: keep the first 13 columns (i.e. drop the last column, target)
y = data.iloc[:, 13:14]  # target, kept as a one-column DataFrame
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.2,random_state=5 )
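As a quick check of what `test_size=0.2` means for 506 rows, the sketch below runs the same call on placeholder arrays of the same shape. The zero-filled arrays are an assumption for illustration; only the shapes matter here.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays with the same shape as the data described above
# (506 rows, 13 features); the values themselves are meaningless.
x = np.zeros((506, 13))
y = np.zeros((506, 1))

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=5
)

print(x_train.shape, x_test.shape)
```

Fixing `random_state` makes the shuffle reproducible, so the same rows end up in the test set on every run.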
4. Feature engineering: standardization
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)
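Note that the scaler is fitted on the training set only, and the test set is transformed with the training statistics. The following NumPy sketch spells out that fit/transform split by hand; the numbers are made up, and it mirrors what `StandardScaler` computes (per-column mean and population standard deviation) rather than replacing it.

```python
import numpy as np

# Made-up numbers: two features on very different scales.
x_train = np.array([[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]])
x_test = np.array([[2.0, 200.0]])

# "fit": learn the mean and standard deviation from the training set only.
mean = x_train.mean(axis=0)
std = x_train.std(axis=0)      # population std (ddof=0)

# "transform": apply the training statistics to both sets.
x_train_scaled = (x_train - mean) / std
x_test_scaled = (x_test - mean) / std

print(x_train_scaled.mean(axis=0))  # approximately 0 per column
print(x_train_scaled.std(axis=0))   # approximately 1 per column
print(x_test_scaled)
```

Reusing the training mean and standard deviation for the test set keeps information about the test data from leaking into preprocessing.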
5. Machine learning: creating the model
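This step creates a `LinearRegression` estimator and fits it on the standardized training data. Here is a minimal sketch on made-up, noise-free data; the weights 3 and -2 and the intercept 1 are assumptions for illustration, not results from the housing data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up, noise-free data: y = 3*x0 - 2*x1 + 1.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(50, 2))
y_train = x_train @ np.array([[3.0], [-2.0]]) + 1.0   # shape (50, 1): a 2-D target

estimator = LinearRegression()
estimator.fit(x_train, y_train)

# With a 2-D target, coef_ has shape (n_targets, n_features),
# which is why the full code indexes coef_[0].
print(estimator.coef_.shape)
print(estimator.coef_[0], estimator.intercept_)
```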
6. Model evaluation
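The evaluation uses two metrics: the mean squared error (via `mean_squared_error`) and R-squared (via `estimator.score`). As a sketch of what those two numbers mean, they can be computed by hand with NumPy; the values below are made up so the arithmetic is easy to follow.

```python
import numpy as np

# Made-up values, chosen so the arithmetic is easy to follow.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 7.0, 9.5])

# Mean squared error: average of the squared residuals.
mse = np.mean((y_true - y_pred) ** 2)

# R-squared: 1 - (residual sum of squares / total sum of squares).
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(round(float(mse), 4))  # 0.1875
print(round(float(r2), 4))   # 0.9625
```

A lower MSE and an R-squared closer to 1 both indicate predictions that track the actual values more closely.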
7. Full code
# Note: sklearn.datasets.load_boston was removed in scikit-learn 1.2, so the data is read from a local CSV below.
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression  # closed-form least squares (normal equation)
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import SGDRegressor  # stochastic gradient descent
from sklearn.linear_model import Ridge  # ridge regression
import matplotlib.pyplot as plt
import pandas as pd
# 1. Get the data
data = pd.read_csv('./data/boshidun.csv')
data
# 2.1 Descriptive statistics
# describe() returns count, mean, std, min, max, and the percentiles of each column; .T transposes the result
data.describe().T
# 2.2 Scatter plot analysis
def drawing(x, y, xlabel):
    # Scatter-plot one feature (x) against the house price (y)
    plt.scatter(x, y)
    plt.title('%s - House Prices' % xlabel)
    plt.xlabel(xlabel)
    plt.ylabel('House Prices')
    plt.yticks(range(0, 60, 5))
    plt.grid()
    plt.show()
# Scatter plot of the variable CRIM against the dependent variable
drawing(data['CRIM'], data['target'], 'Urban Per Crime Rate')
plt.figure(figsize=(15, 10.5))
plot_count = 1
for feature in list(data.columns)[1:13]:
    plt.subplot(3, 4, plot_count)
    plt.scatter(data[feature], data['target'])
    plt.xlabel(feature.replace('_', ' ').title())
    plt.ylabel('target')
    plot_count += 1
plt.show()
x = data.iloc[:,0:13]
y = data.iloc[:,13:14]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.2,random_state=5 )
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)
estimator = LinearRegression()
estimator.fit(x_train, y_train)
# 5.2 Print the estimated coefficient of each feature
print(estimator.coef_[0])
coefficients = pd.DataFrame([x.columns, estimator.coef_[0]]).T
coefficients
# 6.1 Get the predictions
y_predict = estimator.predict(x_test)
# 6.2 Compute the MSE
print('MSE: %.4f' % mean_squared_error(y_pred=y_predict, y_true=y_test))
print('R-Squared: %.4f'% estimator.score(x_test, y_test))
# Plot actual prices against predicted prices
plt.figure()
plt.scatter(y_test, y_predict)  # x-axis: actual, y-axis: predicted, matching the labels below
plt.xlabel('Actual Prices')
plt.ylabel('Predicted Prices')
plt.title('Actual Prices vs Predicted Prices')
plt.show()