Multiple Linear Regression (sklearn Method)
2022-07-05 08:42:00 【python-码博士】
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
# SVR, LinearSVR: regression
# SVC, LinearSVC: classification
# Workflow
# 1. Load the data
data = pd.read_csv('./data.csv')
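# 2. Explore the data
# print(data.columns)
# print(data.describe())
# 3. Clean the data
# The features fall into 3 groups; two of them are sliced out below.
# The slices are taken before 'id' is dropped, so columns 0 and 1 are 'id' and 'diagnosis'.
features_mean = list(data.columns[2:12])  # mean-value features
features_se = list(data.columns[12:22])  # standard-error (se) features
# Drop the ID column
data.drop('id', axis=1, inplace=True)
# Map B (benign) to 0 and M (malignant) to 1
data['diagnosis'] = data['diagnosis'].map({'M': 1, 'B': 0})
print(data.head(5))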
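# 4. Feature selection
# Goal: reduce dimensionality
# Pass the column by keyword so the call works across seaborn versions
sns.countplot(x='diagnosis', data=data)
plt.show()
# Heatmap of the correlations among the features_mean columns
corr = data[features_mean].corr()
plt.figure(figsize=(14,14))
sns.heatmap(corr,annot=True)
plt.show()
# Feature selection: reduce the mean-value group from 10 features to 6
features_remain = ['radius_mean', 'texture_mean', 'smoothness_mean', 'compactness_mean', 'symmetry_mean','fractal_dimension_mean']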
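# 5. Train the model
# Hold out 30% of the data as the test set
train,test = train_test_split(data,test_size=0.3)
# Use the 6 selected features rather than all 10 mean-value features
train_x = train[features_remain]
train_y = train['diagnosis']
test_x = test[features_remain]
test_y = test['diagnosis']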
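# Standardize the features (fit on the training set only, then apply to the test set)
ss = StandardScaler()
train_X = ss.fit_transform(train_x)
test_X = ss.transform(test_x)
# Create the SVM classifier
model = svm.SVC()
# Parameters
# kernel: choice of kernel function
# 1. linear: linear kernel, for data that is linearly separable
# 2. poly: polynomial kernel, maps data from a low-dimensional to a higher-dimensional space, but has more parameters and is costlier to compute
# 3. rbf: Gaussian (RBF) kernel, maps samples to a high-dimensional space; few parameters, good performance; the default
# 4. sigmoid: sigmoid kernel, borrowed from neural-network mappings; the SVM then acts as a multilayer neural network
# C: penalty coefficient of the objective function
# gamma: kernel coefficient; 'auto' is 1 / n_features, while the default 'scale' (scikit-learn >= 0.22) is 1 / (n_features * X.var())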
# Fit on the standardized training data
model.fit(train_X,train_y)
# 6. Evaluate the model (predict on the standardized test data)
pred = model.predict(test_X)
print('Accuracy:',accuracy_score(test_y,pred))
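The script standardizes the features and relies on the `gamma` kernel coefficient of the RBF kernel, so it may help to see how the two interact. The sketch below uses only the Python standard library and is not scikit-learn's implementation: the helper names (`standardize`, `gamma_auto`, `gamma_scale`, `rbf_kernel`) and the toy rows are made up for illustration. It assumes the formulas documented for `SVC`, where `gamma='auto'` is `1 / n_features` and `gamma='scale'` is `1 / (n_features * X.var())`, and that columns are scaled with the population standard deviation, as `StandardScaler` does.

```python
import math
from statistics import fmean, pstdev


def standardize(train_rows, test_rows):
    """Z-score every column using statistics from the training rows only."""
    columns = list(zip(*train_rows))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population std (ddof = 0)

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

    return scale(train_rows), scale(test_rows)


def gamma_auto(rows):
    """gamma='auto': the reciprocal of the number of features."""
    return 1.0 / len(rows[0])


def gamma_scale(rows):
    """gamma='scale': 1 / (n_features * variance over all matrix entries)."""
    flat = [v for row in rows for v in row]
    mean = fmean(flat)
    variance = fmean([(v - mean) ** 2 for v in flat])
    return 1.0 / (len(rows[0]) * variance)


def rbf_kernel(a, b, gamma):
    """Gaussian kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


# Toy rows (hypothetical values): two features on very different scales.
train = [[14.0, 0.09], [20.0, 0.11], [12.0, 0.08], [18.0, 0.12]]
test = [[16.0, 0.10]]

train_scaled, test_scaled = standardize(train, test)

print('gamma scale, raw data:   ', gamma_scale(train))
print('gamma scale, scaled data:', gamma_scale(train_scaled))
print('gamma auto:              ', gamma_auto(train_scaled))
print('kernel(test, train[0]):  ',
      rbf_kernel(test_scaled[0], train_scaled[0], gamma_scale(train_scaled)))
```

After standardization every column has zero mean and unit variance, so the variance over all entries is 1 and the two `gamma` settings coincide. On unscaled data they can differ widely, which is one reason to fit and predict on the standardized arrays rather than the raw ones.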