Multiple Linear Regression (sklearn Method)
2022-07-05 08:42:00 【python-码博士】
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
# SVR / LinearSVR: regression
# SVC / LinearSVC: classification
# Workflow
# 1. Load the data
data = pd.read_csv('./data.csv')
# 2. Explore the data
# print(data.columns)
# print(data.describe())
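# 3. Clean the data
# The features fall into 3 groups; only the mean and standard-error groups are extracted here
features_mean = list(data.columns[2:12])  # mean values
features_se = list(data.columns[12:22])  # standard errors
# Drop the ID column
data.drop('id',axis=1,inplace=True)
# Map B (benign) to 0 and M (malignant) to 1
data['diagnosis'] = data['diagnosis'].map({'M':1,'B':0})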
print(data.head(5))
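# 4. Feature selection
# Goal: reduce dimensionality
# Pass the column by keyword; recent seaborn versions do not accept it as a positional argument
sns.countplot(x='diagnosis',data=data)
plt.show()
# Heatmap of the correlations among the features_mean columns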
corr = data[features_mean].corr()
plt.figure(figsize=(14,14))
sns.heatmap(corr,annot=True)
plt.show()
# Feature selection: reduce the mean group from 10 features to 6
features_remain = ['radius_mean', 'texture_mean', 'smoothness_mean', 'compactness_mean', 'symmetry_mean','fractal_dimension_mean']
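# 5. Model training
# Hold out 30% of the data as the test set
train,test = train_test_split(data,test_size=0.3)
# Use the 6 selected features rather than all 10 mean features
train_x = train[features_remain]
train_y = train['diagnosis']
test_x = test[features_remain]
test_y = test['diagnosis']
# Standardize the features (zero mean, unit variance)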
ss = StandardScaler()
train_X = ss.fit_transform(train_x)
test_X = ss.transform(test_x)
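# Create the SVM classifier
model = svm.SVC()
# Parameters
# kernel: choice of kernel function
# 1. linear: linear kernel, for data that is linearly separable
# 2. poly: polynomial kernel, maps the data from a low-dimensional to a higher-dimensional space, but has more parameters and costs more to compute
# 3. rbf: Gaussian kernel, also maps samples to a higher-dimensional space; few parameters, good performance; the default
# 4. sigmoid: sigmoid kernel, as used in neural network mappings; the SVM then behaves like a multilayer neural network
# C: penalty coefficient of the objective function
# gamma: kernel coefficient; 'auto' is 1 / n_features, while the default since scikit-learn 0.22 is 'scale', i.e. 1 / (n_features * X.var())
# Train on the standardized data (train_X), not the raw features
model.fit(train_X,train_y)
# 6. Model evaluation
pred = model.predict(test_X)
print('Accuracy:',accuracy_score(test_y,pred))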
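The kernel notes in the script can be checked empirically. The following is a minimal sketch, not the author's code: it assumes scikit-learn's bundled `load_breast_cancer` dataset can stand in for `./data.csv`. In that copy the column names use spaces (`mean radius` rather than `radius_mean`) and the labels are 0 = malignant, 1 = benign, the reverse of the mapping used above. The `random_state` and `stratify` arguments are additions for reproducibility.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: the bundled dataset replaces ./data.csv for this sketch.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# The same 6 mean features as features_remain, in this dataset's naming.
features_remain = ['mean radius', 'mean texture', 'mean smoothness',
                   'mean compactness', 'mean symmetry',
                   'mean fractal dimension']

# Hold out 30% as the test set; fixed seed and stratification are assumptions.
train_x, test_x, train_y, test_y = train_test_split(
    X[features_remain], y, test_size=0.3, random_state=0, stratify=y)

# Fit one classifier per kernel; the pipeline standardizes inside fit/predict,
# so scaled and unscaled arrays cannot be mixed up by accident.
scores = {}
for kernel in ['linear', 'poly', 'rbf', 'sigmoid']:
    clf = make_pipeline(StandardScaler(),
                        SVC(kernel=kernel, C=1.0, gamma='scale'))
    clf.fit(train_x, train_y)
    scores[kernel] = accuracy_score(test_y, clf.predict(test_x))

for kernel, acc in scores.items():
    print(f'{kernel:8s} accuracy: {acc:.3f}')
```

Wrapping `StandardScaler` and `SVC` in one pipeline is a design choice, not a requirement. It keeps the scaler fitted on the training split only and applies the same transform at prediction time.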