Multiple Linear Regression (sklearn Method)
2022-07-05 08:42:00 【python-码博士】
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
# SVR, LinearSVR: regression
# SVC, LinearSVC: classification
# Workflow
# 1. Load the data
data = pd.read_csv('./data.csv')
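# 2. Explore the data
# print(data.columns)
# print(data.describe())
# 3. Clean the data
# The features fall into 3 groups: mean, standard error (se), and worst
features_mean = list(data.columns[2:12])    # mean values
features_se = list(data.columns[12:22])     # standard errors
features_worst = list(data.columns[22:32])  # worst values (assumes the same column layout as the two slices above)
# Drop the ID column
data.drop('id', axis=1, inplace=True)
# Map B (benign) to 0 and M (malignant) to 1
data['diagnosis'] = data['diagnosis'].map({'M': 1, 'B': 0})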
print(data.head(5))
# 4. Feature selection
# Goal: reduce dimensionality
sns.countplot(x='diagnosis', data=data)
plt.show()
# Heatmap of the correlations among the features_mean columns
corr = data[features_mean].corr()
plt.figure(figsize=(14,14))
sns.heatmap(corr,annot=True)
plt.show()
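# Feature selection: reduce the mean group from 10 features to 6 (drop features highly correlated with one already kept)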
features_remain = ['radius_mean', 'texture_mean', 'smoothness_mean', 'compactness_mean', 'symmetry_mean','fractal_dimension_mean']
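# 5. Train the model
# Hold out 30% of the data as the test set
train,test = train_test_split(data,test_size=0.3)
train_x = train[features_remain]
train_y = train['diagnosis']
test_x = test[features_remain]
test_y = test['diagnosis']
# Standardize the features (fit the scaler on the training set only)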
ss = StandardScaler()
train_X = ss.fit_transform(train_x)
test_X = ss.transform(test_x)
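# Create the SVM classifier
model = svm.SVC()
# Parameters
# kernel: choice of kernel function
# 1. linear   linear kernel, for linearly separable data
# 2. poly     polynomial kernel, maps the data from a low-dimensional to a high-dimensional space, but has more parameters and is more expensive to compute
# 3. rbf      Gaussian (RBF) kernel, maps samples to a high-dimensional space; few parameters, good performance; the default
# 4. sigmoid  sigmoid kernel, borrowed from neural networks; with it the SVM acts like a multi-layer neural network
# C: penalty coefficient of the objective function
# gamma: kernel coefficient; 'auto' is 1 / n_features, and the default 'scale' (scikit-learn 0.22+) is 1 / (n_features * X.var())
# Fit on the standardized training data
model.fit(train_X,train_y)
# 6. Evaluate the model
pred = model.predict(test_X)
print('Accuracy:',accuracy_score(test_y,pred))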
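The kernel and gamma notes in the script can be made concrete with a minimal sketch of the RBF kernel, written in plain Python so it runs without scikit-learn. The helper names `rbf_kernel`, `gamma_auto`, and `gamma_scale` are hypothetical and chosen only for illustration, and the toy matrix `X` is made up. The two gamma formulas follow what scikit-learn documents for `gamma='auto'` and `gamma='scale'`.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF (Gaussian) kernel: exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

def gamma_auto(X):
    """Value used for gamma='auto': 1 / n_features."""
    return 1.0 / len(X[0])

def gamma_scale(X):
    """Value used for gamma='scale': 1 / (n_features * variance of all entries of X)."""
    values = [v for row in X for v in row]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return 1.0 / (len(X[0]) * variance)

# Toy data: 3 samples with 2 features each (made up for illustration).
X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]

g_auto = gamma_auto(X)
g_scale = gamma_scale(X)

# Similarity shrinks as the two points move apart.
k_same = rbf_kernel(X[0], X[0], g_auto)
k_near = rbf_kernel(X[0], X[1], g_auto)
k_far = rbf_kernel(X[0], X[2], g_auto)

print('gamma auto :', g_auto)
print('gamma scale:', g_scale)
print('kernel values:', k_same, k_near, k_far)
```

A larger gamma makes the kernel value fall off faster with distance, so each training sample influences a smaller neighbourhood. After `StandardScaler`, every non-constant feature of the training set has unit variance, so `'scale'` and `'auto'` should give essentially the same value on the scaled training data.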