Multiple Linear Regression (sklearn Method)
2022-07-05 08:42:00 【python-码博士】
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
# SVR / LinearSVR: regression
# SVC / LinearSVC: classification
# Workflow
# 1. Load the data
data = pd.read_csv('./data.csv')
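# 2. Explore the data
# print(data.columns)
# print(data.describe())
# 3. Clean the data
# The features fall into 3 groups
features_mean = list(data.columns[2:12])   # mean values
features_se = list(data.columns[12:22])    # standard errors
features_worst = list(data.columns[22:32]) # worst (largest) values
# Drop the ID column
data.drop('id', axis=1, inplace=True)
# Encode B (benign) as 0 and M (malignant) as 1
data['diagnosis'] = data['diagnosis'].map({'M': 1, 'B': 0})
print(data.head(5))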
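# 4. Feature selection
# Goal: reduce dimensionality
# Pass the column by keyword; recent seaborn versions no longer accept it positionally
sns.countplot(x='diagnosis', data=data)
plt.show()
# Heatmap of the correlations among the features_mean columns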
corr = data[features_mean].corr()
plt.figure(figsize=(14,14))
sns.heatmap(corr,annot=True)
plt.show()
# Feature selection: reduce the mean group from 10 features to 6
features_remain = ['radius_mean', 'texture_mean', 'smoothness_mean', 'compactness_mean', 'symmetry_mean','fractal_dimension_mean']
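# 5. Model training
# Hold out 30% of the data as the test set
train, test = train_test_split(data, test_size=0.3)
# Use the 6 selected features rather than all 10 mean features
train_x = train[features_remain]
train_y = train['diagnosis']
test_x = test[features_remain]
test_y = test['diagnosis']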
# Standardize the features (fit the scaler on the training set only)
ss = StandardScaler()
train_X = ss.fit_transform(train_x)
test_X = ss.transform(test_x)
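# Create the SVM classifier
model = svm.SVC()
# Parameters
# kernel: choice of kernel function
#   1. linear: linear kernel, for linearly separable data
#   2. poly: polynomial kernel; maps data from a low- to a higher-dimensional space, but has more parameters and costs more to compute
#   3. rbf: Gaussian (RBF) kernel; maps samples to a higher-dimensional space, has few parameters and performs well; the default
#   4. sigmoid: sigmoid kernel; the mapping used in neural networks, so the SVM acts like a multi-layer neural network
# C: penalty coefficient of the objective function
# gamma: kernel coefficient; 'auto' is 1 / n_features, while the default since scikit-learn 0.22 is 'scale', i.e. 1 / (n_features * X.var())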
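# Fit on the standardized training data
model.fit(train_X, train_y)
# 6. Model evaluation
# Predict on the standardized test data, matching what the model was trained on
pred = model.predict(test_X)
print('Accuracy:', accuracy_score(test_y, pred))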
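The kernel, C, and gamma parameters listed in the comments can be compared with a small cross-validated search instead of relying on the defaults. The sketch below is one possible way to do that, not the author's method. It assumes scikit-learn's bundled Wisconsin breast cancer data (`load_breast_cancer`) as a stand-in for `./data.csv`, and the grid values are illustrative choices. Wrapping `StandardScaler` and `SVC` in a pipeline means the scaler is refit on the training portion of every fold.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: the bundled dataset stands in for ./data.csv.
# Note that here 0 = malignant and 1 = benign, the reverse of the script's mapping.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Stratify so both splits keep the same class ratio; fix the seed for repeatability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The pipeline scales the features and then passes them to the classifier.
pipe = make_pipeline(StandardScaler(), SVC())

# Illustrative grid; make_pipeline names the SVC step "svc".
param_grid = {
    "svc__kernel": ["linear", "rbf"],
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", "auto"],
}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

# score() uses the best pipeline refit on the whole training set.
test_accuracy = search.score(X_test, y_test)
print("best params:", search.best_params_)
print("test accuracy:", test_accuracy)
```

Because the search object refits the best pipeline on the full training set, `search.predict` can be used directly on new rows with the same columns.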